* jk/maint-tag-show-fixes:
tag: do not show non-tag contents with "-n"
tag: die when listing missing or corrupt objects
tag: fix output of "tag -n" when errors occur
I'm no longer running the Git smoke testing service at
smoke.git.nix.is due to Smolder being a fragile piece of software not
having time to follow through on making it easy for third parties to
run and submit their own smoke tests.
So remove the support in Git for sending smoke tests to
smoke.git.nix.is, it's still easy to modify the test suite to submit
smokes somewhere else.
This reverts the following commits:
Revert "t/README: Add SMOKE_{COMMENT,TAGS}= to smoke_report target" -- e38efac87d
Revert "t/README: Document the Smoke testing" -- d15e9ebc5c
Revert "t/Makefile: Create test-results dir for smoke target" -- 617344d77b
Revert "tests: Infrastructure for Git smoke testing" -- b6b84d1b74
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Makefile: Change the default compiler from "gcc" to "cc"
Ever since the very first commit to git.git we've been setting CC to
"gcc". Presumably this is behavior that Linus copied from the Linux
Makefile.
However unlike Linux Git is written in ANSI C and supports a multitude
of compilers, including Clang, Sun Studio, xlc etc. On my Linux box
"cc" is a symlink to clang, and on a Solaris box I have access to "cc"
is Sun Studio's CC.
Both of these are perfectly capable of compiling Git, and it's
annoying to have to specify CC=cc on the command-line when compiling
Git when that's the default behavior of most other portable programs.
So change the default to "cc". Users who want to compile with GCC can
still add "CC=gcc" to the make(1) command-line, but those users who
don't have GCC as their "cc" will see expected behavior, and as a
bonus we'll be more likely to smoke out new compilation warnings from
our distributors since they'll me using a more varied set of compilers
by default.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/find-pack-entry-recent-cache-invalidation:
find_pack_entry(): do not keep packed_git pointer locally
sha1_file.c: move the core logic of find_pack_entry() into fill_pack_entry()
* fc/zsh-completion:
completion: simplify __gitcomp and __gitcomp_nl implementations
completion: use ls -1 instead of rolling a loop to do that ourselves
completion: work around zsh option propagation bug
* jk/maint-tag-show-fixes:
tag: do not show non-tag contents with "-n"
tag: die when listing missing or corrupt objects
tag: fix output of "tag -n" when errors occur
mergetools/meld: Use --help output to detect --output support
In v1.7.7-rc0~3^2 (2011-08-19), git mergetool's "meld" support learned
to use the --output option when calling versions of meld that are
detected to support it (1.5.0 and newer, hopefully).
Alas, it misdetects old versions (before 1.1.5, 2006-06-11) of meld as
supporting the option, so on systems with such meld, instead of
getting a nice merge helper, the operator gets a dialog box with the
text "Wrong number of arguments (Got 5)". (Version 1.1.5 is when meld
switched to using optparse. One consequence of that change was that
errors in usage are detected and signalled through the exit status
even when --help was passed.)
Luckily there is a simpler check that is more reliable: the usage
string printed by "meld --help" reliably reflects whether --output is
supported in a given version. Use it.
Reported-by: Jeff Epler <jepler@unpythonic.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Explicitly set X to avoid potential build breakage
$X is appended to binary names for Windows builds (ie. git.exe).
Pollution from the environment can inadvertently trigger this behaviour,
resulting in 'git' turning into 'gitwhatever' without warning.
Signed-off-by: Michael Palimaka <kensington@astralcloak.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge: do not launch an editor on "--no-edit $tag"
When the user explicitly asked us not to, don't launch an editor.
But do everything else the same way as the "edit" case, i.e. leave the
comment with verification result in the log template and record the
mergesig in the resulting merge commit for later inspection.
It is necessary to write the else branch as a nested conditional. Also,
write the conditions with parentheses because we use them throughout the
Makefile.
Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git tag -n" did not check the type of the object it is reading the top n
lines from. At least, avoid showing the beginning of trees and blobs when
dealing with lightweight tags that point at them.
As the payload of a tag and a commit look similar in that they both start
with a header block, which is skipped for the purpose of "-n" output,
followed by human readable text, allow the message of commit objects to be
shown just like the contents of tag objects. This avoids regression for
people who have been using "tag -n" to show the log messages of commits
that are pointed at by lightweight tags.
* bl/gitweb-project-filter:
gitweb: Make project search respect project_filter
gitweb: improve usability of projects search form
gitweb: place links to parent directories in page header
gitweb: show active project_filter in project_list page header
gitweb: limit links to alternate forms of project_list to active project_filter
gitweb: add project_filter to limit project list to a subdirectory
gitweb: prepare git_get_projects_list for use outside 'forks'.
gitweb: move hard coded .git suffix out of git_get_projects_list
* jn/svn-fe: (36 commits)
vcs-svn: suppress a -Wtype-limits warning
vcs-svn: allow import of > 4GiB files
vcs-svn: rename check_overflow arguments for clarity
vcs-svn/svndiff.c: squelch false "unused" warning from gcc
vcs-svn: reset first_commit_done in fast_export_init
vcs-svn: do not initialize report_buffer twice
vcs-svn: avoid hangs from corrupt deltas
vcs-svn: guard against overflow when computing preimage length
vcs-svn: cap number of bytes read from sliding view
test-svn-fe: split off "test-svn-fe -d" into a separate function
vcs-svn: implement text-delta handling
vcs-svn: let deltas use data from preimage
vcs-svn: let deltas use data from postimage
vcs-svn: verify that deltas consume all inline data
vcs-svn: implement copyfrom_data delta instruction
vcs-svn: read instructions from deltas
vcs-svn: read inline data from deltas
vcs-svn: read the preimage when applying deltas
vcs-svn: parse svndiff0 window header
vcs-svn: skeleton of an svn delta parser
...
commit: ignore intent-to-add entries instead of refusing
Originally, "git add -N" was introduced to help users from forgetting to
add new files to the index before they ran "git commit -a". As an attempt
to help them further so that they do not forget to say "-a", "git commit"
to commit the index as-is was taught to error out, reminding the user that
they may have forgotten to add the final contents of the paths before
running the command.
This turned out to be a false "safety" that is useless. If the user made
changes to already tracked paths and paths added with "git add -N", and
then ran "git add" to register the final contents of the paths added with
"git add -N", "git commit" will happily create a commit out of the index,
without including the local changes made to the already tracked paths. It
was not a useful "safety" measure to prevent "forgetful" mistakes from
happening.
It turns out that this behaviour is not just a useless false "safety", but
actively hurts use cases of "git add -N" that were discovered later and
have become popular, namely, to tell Git to be aware of these paths added
by "git add -N", so that commands like "git status" and "git diff" would
include them in their output, even though the user is not interested in
including them in the next commit they are going to make.
Fix this ancient UI mistake, and instead make a commit from the index
ignoring the paths added by "git add -N" without adding real contents.
Based on the work by Nguyễn Thái Ngọc Duy, and helped by injection of
sanity from Jonathan Nieder and others on the Git mailing list.
drop odd return value semantics from userdiff_config
When the userdiff_config function was introduced in be58e70
(diff: unify external diff and funcname parsing code,
2008-10-05), it used a return value convention unlike any
other config callback. Like other callbacks, it used "-1" to
signal error. But it returned "1" to indicate that it found
something, and "0" otherwise; other callbacks simply
returned "0" to indicate that no error occurred.
This distinction was necessary at the time, because the
userdiff namespace overlapped slightly with the color
configuration namespace. So "diff.color.foo" could mean "the
'foo' slot of diff coloring" or "the 'foo' component of the
"color" userdiff driver". Because the color-parsing code
would die on an unknown color slot, we needed the userdiff
code to indicate that it had matched the variable, letting
us bypass the color-parsing code entirely.
Later, in 8b8e862 (ignore unknown color configuration,
2009-12-12), the color-parsing code learned to silently
ignore unknown slots. This means we no longer need to
protect userdiff-matched variables from reaching the
color-parsing code.
We can therefore change the userdiff_config calling
convention to a more normal one. This drops some code from
each caller, which is nice. But more importantly, it reduces
the cognitive load for readers who may wonder why
userdiff_config is unlike every other config callback.
There's no need to add a new test confirming that this
works; t4020 already contains a test that sets
diff.color.external.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
add -e: do not show difference in a submodule that is merely dirty
When the HEAD of the submodule matches what is recorded in the index of
the superproject, and it has local changes or untracked files, the patch
offered by "git add -e" for editing shows a diff like this:
Because applying such a patch has no effect to the index, this is a
useless noise. Generate the patch with IGNORE_DIRTY_SUBMODULES flag to
prevent such a change from getting reported.
This patch also loses the "-dirty" suffix from the output when the HEAD of
the submodule is different from what is in the index of the superproject.
As such dirtiness expressed by the suffix does not affect the result of
the patch application at all, there is no information lost if we remove
it. The user could still run "git status" before "git add -e" if s/he
cares about the dirtiness.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git checkout -b: allow switching out of an unborn branch
Running "git checkout -b another" immediately after "git init" when you do
not even have a commit on 'master' fails with:
$ git checkout -b another
fatal: You are on a branch yet to be born
This is unnecessary, if we redefine "git checkout -b $name" that does not
take any $start_point (which has to be a commit) as "I want to check out a
new branch $name from the state I am in".
completion: simplify __gitcomp and __gitcomp_nl implementations
These shell functions are written in an unnecessarily verbose way;
simplify their "conditionally use $<number> after checking $# against
<number>" logic by using shell's built-in conditional substitution
facilities.
Also remove the first of the two assignments to IFS in __gitcomp_nl
that does not have any effect.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: use ls -1 instead of rolling a loop to do that ourselves
This simplifies the code a great deal. In particular, it allows us to
get rid of __git_shopt, which is used only in this fuction to enable
'nullglob' in zsh.
[jn: squashed with a patch that actually gets rid of __git_shopt]
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
completion: work around zsh option propagation bug
When listing commands in zsh (git <TAB><TAB>), all of them will show up,
instead of only porcelain ones.
The root cause of this is because zsh versions from 4.3.0 to present
(4.3.15) do not correctly propagate the SH_WORD_SPLIT option into the
subshell in ${foo:=$(bar)} expressions. Because of this bug, the list of
all commands was treated as a single word in __git_list_porcelain_commands
and did not match any of the patterns that would usually cause plumbing to
be excluded.
With problematic versions of zsh, after running
emulate sh
fn () {
var='one two'
for v in $var; do echo $v; done
}
x=$(fn)
: ${y=$(fn)}
printing "$x" results in two lines as expected, but printing "$y" results
in a single line because $var is expanded as a single word when evaluating
fn to compute y.
So avoid the construct, and use an explicit 'test -n "$foo" || foo=$(bar)'
instead.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
mailmap: always return a plain mail address from map_user()
The callers of map_user() give email and name to it, and expect to get the
up-to-date email and/or name to be used in their output. The function
rewrites the given buffers in place. To optimize the majority of cases,
the function returns 0 when it did not do anything, and it returns 1 when
the caller should use the updated contents.
The 'email' input to the function is terminated by '>' or a NUL (whichever
comes first) for historical reasons, but when a rewrite happens, the value
is replaced with the mailbox inside the <> pair. However, it failed to
meet this expectation when it only rewrote the name part without rewriting
the email part, and the email in the input was terminated by '>'.
This causes an extra '>' to appear in the output of "blame -e", because the
caller does send in '>'-terminated email, and when the function returned 1
to tell it that rewriting happened, it appends '>' that is necessary when
the email part was rewritten.
The patch looks bigger than it actually is, because this change makes a
variable that points at the end of the email part in the input 'p' live
much longer than it used to, deserving a more descriptive name.
Noticed and diagnosed by Felipe Contreras and Jeff King.
fsck: give accurate error message on empty loose object files
Since 3ba7a065527a (A loose object is not corrupt if it
cannot be read due to EMFILE), "git fsck" on a repository with an empty
loose object file complains with the error message
fatal: failed to read object <sha1>: Invalid argument
This comes from a failure of mmap on this empty file, which sets errno to
EINVAL. Instead of calling xmmap on empty file, we display a clean error
message ourselves, and return a NULL pointer. The new message is
error: object file .git/objects/09/<rest-of-sha1> is empty
fatal: loose object <sha1> (stored in .git/objects/09/<rest-of-sha1>) is corrupt
The second line was already there before the regression in 3ba7a065527a,
and the first is an additional message, that should help diagnosing the
problem for the user.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't usually bother looking at tagged objects at all
when listing. However, if "-n" is specified, we open the
objects to read the annotations of the tags. If we fail to
read an object, or if the object has zero length, we simply
silently return.
The first case is an indication of a broken or corrupt repo,
and we should notify the user of the error.
The second case is OK to silently ignore; however, the
existing code leaked the buffer returned by read_sha1_file.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git tag" is instructed to print lines from annotated
tags via "-n", it first prints the tag name, then attempts
to parse and print the lines of the tag object, and then
finally adds a trailing newline.
If an error occurs, we return early from the function and
never print the newline, screwing up the output for the next
tag. Let's factor the line-printing into its own function so
we can manage the early returns better, and make sure that
we always terminate the line.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix build problems related to profile-directed optimization
There was a number of problems I ran into when trying the
profile-directed optimizations added by Andi Kleen in git commit 7ddc2710b9. (This was using gcc 4.4 found on many enterprise
distros.)
1) The -fprofile-generate and -fprofile-use commands are incompatible
with ccache; the code ends up looking in the wrong place for the gcda
files based on the ccache object names.
2) If the makefile notices that CFLAGS are different, it will rebuild
all of the binaries. Hence the recipe originally specified by the
INSTALL file ("make profile-all" followed by "make install") doesn't
work. It will appear to work, but the binaries will end up getting
built with no optimization.
This patch fixes this by using an explicit set of options passed via
the PROFILE variable then using this to directly manipulate CFLAGS and
EXTLIBS.
The developer can run "make PROFILE=BUILD all ; sudo make
PROFILE=BUILD install" automatically run a two-pass build with the
test suite run in between as the sample workload for the purpose of
recording profiling information to do the profile-directed
optimization.
Alternatively, the profiling version of binaries can be built using:
make PROFILE=GEN PROFILE_DIR=/var/cache/profile all
make PROFILE=GEN install
and then after git has been used for a while, the optimized version of
the binary can be built as follows:
make PROFILE=USE PROFILE_DIR=/var/cache/profile all
make PROFILE=USE install
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* cb/push-quiet:
t5541: avoid TAP test miscounting
fix push --quiet: add 'quiet' capability to receive-pack
server_supports(): parse feature list more carefully
The imap-send code was adapted from another project, and
still contains many unused bits of code. One of these bits
contains a type "struct string_list" which bears no
resemblence to the "struct string_list" we use elsewhere in
git. This causes the compiler to complain if git's
string_list ever becomes part of cache.h.
Let's just drop the dead code.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
branch --edit-description: protect against mistyped branch name
It is very easy to mistype the branch name when editing its description,
e.g.
$ git checkout -b my-topic master
: work work work
: now we are at a good point to switch working something else
$ git checkout master
: ah, let's write it down before we forget what we were doing
$ git branch --edit-description my-tpoic
The command does not notice that branch 'my-tpoic' does not exist. It is
not lost (it becomes description of an unborn my-tpoic branch), but is not
very useful. So detect such a case and error out to reduce the grief
factor from this common mistake.
This incidentally also errors out --edit-description when the HEAD points
at an unborn branch (immediately after "init", or "checkout --orphan"),
because at that point, you do not even have any commit that is part of
your history and there is no point in describing how this particular
branch is different from the branch it forked off of, which is the useful
bit of information the branch description is designed to capture.
We may want to special case the unborn case later, but that is outside the
scope of this patch to prevent more common mistakes before 1.7.9 series
gains too much widespread use.
Drop system includes from inet_pton/inet_ntop compatibility wrappers
As both of these compatibility wrappers include git-compat-utils.h,
all of the system includes were redundant.
Dropping these system includes also makes git-compat-utils.h the first
include which avoids a compiler warning on Solaris due to the
redefinition of _FILE_OFFSET_BITS.
Signed-off-by: Ben Walton <bwalton@artsci.utoronto.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge: do not create a signed tag merge under --ff-only option
Starting at release v1.7.9, if you ask to merge a signed tag, "git merge"
always creates a merge commit, even when the tag points at a commit that
happens to be a descendant of your current commit.
Unfortunately, this interacts rather badly for people who use --ff-only to
make sure that their branch is free of local developments. It used to be
possible to say:
and expect that the resulting tip of frotz branch matches v1.7.9^0 (aka
the commit tagged as v1.7.9), but this fails with the updated Git with:
fatal: Not possible to fast-forward, aborting.
because a merge that merges v1.7.9 tag to v1.7.9~30 cannot be created by
fast forwarding.
We could teach users that now they have to do
$ git merge --ff-only v1.7.9^0
but it is far more pleasant for users if we DWIMmed this ourselves.
When an integrator pulls in a topic from a lieutenant via a signed tag,
even when the work done by the lieutenant happens to fast-forward, the
integrator wants to have a merge record, so the integrator will not be
asking for --ff-only when running "git pull" in such a case. Therefore,
this change should not regress the support for the use case v1.7.9 wanted
to add.
"git diff --stat" and "git apply --stat" now learn to print the line
"%d files changed, %d insertions(+), %d deletions(-)" in singular form
whenever applicable. "0 insertions" and "0 deletions" are also omitted
unless they are both zero.
This matches how versions of "diffstat" that are not prehistoric produced
their output, and also makes this line translatable.
[jc: with help from Thomas Dickey in archaeology of "diffstat"]
[jc: squashed Jonathan's updates to illustrations in tutorials and a test]
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only place that the issue this series addresses was observed
where we read "cat-file commit" output and put it in GIT_AUTHOR_DATE
in order to replay a commit with an ancient timestamp.
With the previous patch alone, "git commit --date='20100917 +0900'"
can be misinterpreted to mean an ancient timestamp, not September in
year 2010. Guard this codepath by requring an extra '@' in front of
the raw git timestamp on the parsing side. This of course needs to
be compensated by updating get_author_ident_from_commit and the code
for "git commit --amend" to prepend '@' to the string read from the
existing commit in the GIT_AUTHOR_DATE environment variable.
The date-time parser parses out a human-readble datestring piece by
piece, so that it could even parse a string in a rather strange
notation like 'noon november 11, 2005', but restricts itself from
parsing strings in "<seconds since epoch> <timezone>" format only
for reasonably new timestamps (like 1974 or newer) with 10 or more
digits. This is to prevent a string like "20100917" from getting
interpreted as seconds since epoch (we want to treat it as September
17, 2010 instead) while doing so.
The same codepath is used to read back the timestamp that we have
already recorded in the headers of commit and tag objects; because
of this, such a commit with timestamp "0 +0000" cannot be rebased or
amended very easily.
Teach parse_date() codepath to special case a string of the form
"<digits> +<4-digits>" to work this issue around, but require that
there is no other cruft around the string when parsing a timestamp
of this format for safety.
Note that this has a slight backward incompatibility implications.
If somebody writes "git commit --date='20100917 +0900'" and wants it
to mean a timestamp in September 2010 in Japan, this change will
break such a use case.
This is caused by the fact that localized messages do not have their
place in some RPM package. Let's postpone decision where they should
be put (be it git-i18n-Icelandic, or git-i18n, or git package itself)
for later by removing locale files at the end of install phase.
Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
t0300 creates some helper shell scripts, and marks them with
"!/bin/sh". Even though the scripts are fairly simple, they
can fail on broken shells (specifically, Solaris /bin/sh
will persist a temporary assignment to IFS in a "read"
command).
Rather than work around the problem for Solaris /bin/sh,
using write_script will make sure we point to a known-good
shell that the user has given us.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many of the scripts in the test suite write small helper
shell scripts to disk. It's best if these shell scripts
start with "#!$SHELL_PATH" rather than "#!/bin/sh", because
/bin/sh on some platforms is too buggy to be used.
However, it can be cumbersome to expand $SHELL_PATH, because
the usual recipe for writing a script is:
cat >foo.sh <<-\EOF
#!/bin/sh
echo my arguments are "$@"
EOF
To expand $SHELL_PATH, you have to either interpolate the
here-doc (which would require quoting "\$@"), or split the
creation into two commands (interpolating the $SHELL_PATH
line, but not the rest of the script). Let's provide a
helper function that makes that less syntactically painful.
While we're at it, this helper can also take care of the
"chmod +x" that typically comes after the creation of such a
script, saving the caller a line.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current askpass code simply dies if calling an askpass
helper fails. Worse, in some failure modes it doesn't even
print an error (if start_command fails, then it prints its
own error; if reading fails, we print an error; but if the
command exits non-zero, finish_command fails and we print
nothing!).
Let's be more kind to the user by printing an error message
when askpass doesn't work out, and then falling back to the
terminal (which also may fail, of course, but we die already
there with a nice message).
While we're at it, let's clean up the existing error
messages a bit. Now that our prompts are very long and
contain quotes and colons themselves, our error messages are
hard to read.
So the new failure modes look like:
[before, with a terminal]
$ GIT_ASKPASS=false git push
$ echo $?
128
[before, with no terminal, and we must give up]
$ setsid git push
fatal: could not read 'Password for 'https://peff@github.com': ': No such device or address
[after, with a terminal]
$ GIT_ASKPASS=false git push
error: unable to read askpass response from 'false'
Password for 'https://peff@github.com':
[after, with no terminal, and we must give up]
$ GIT_ASKPASS=false setsid git push
error: unable to read askpass response from 'false'
fatal: could not read Password for 'https://peff@github.com': No such device or address
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The do_askpass function inherited a few bad habits from the
original git_getpass. One, there's no need to strbuf_reset a
buffer which was just initialized. And two, it's a good
habit to use strbuf_detach to claim ownership of a buffer's
string (even though in this case the owning buffer goes out
of scope, so it's effectively the same thing).
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitweb: Allow UTF-8 encoded CGI query parameters and path_info
Gitweb forgot to turn query parameters into UTF-8. This results in a bug
that one cannot search for a string with characters outside US-ASCII. For
example searching for "Michał Kiedrowicz" (containing letter 'ł' - LATIN
SMALL LETTER L WITH STROKE, with Unicode codepoint U+0142, represented
with 0xc5 0x82 bytes in UTF-8 and percent-encoded as %C5%82) result in the
following incorrect data in search field
MichaÅ\202 Kiedrowicz
This is caused by CGI by default treating '0xc5 0x82' bytes as two
characters in Perl legacy encoding latin-1 (iso-8859-1), because 's'
query parameter is not processed explicitly as UTF-8 encoded string.
The solution used here follows "Using Unicode in a Perl CGI script"
article on http://www.lemoda.net/cgi/perl-unicode/index.html:
use CGI;
use Encode 'decode_utf8;
my $value = params('input');
$value = decode_utf8($value);
Decoding UTF-8 is done when filling %input_params hash and $path_info
variable; the former requires to move from explicit $cgi->param(<label>)
to $input_params{<name>} in a few places, which is a good idea anyway.
Also add -override=>1 parameter to $cgi->textfield() invocation in search
form. Otherwise CGI would use values from query string if it is present,
filling value from $cgi->param... without decode_utf8(). As we are using
value of appropriate parameter anyway, -override=>1 doesn't change the
situation but makes gitweb fill search field correctly.
We could simply use the '-utf8' pragma (via "use CGI '-utf8';") to solve
this, but according to CGI.pm documentation, it may cause problems with
POST requests containing binary files, and it requires CGI 3.31 (I think),
released with perl v5.8.9.
Reported-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com> Signed-off-by: Jakub Narębski <jnareb@gmail.com> Tested-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
On 32-bit architectures with 64-bit file offsets, gcc 4.3 and earlier
produce the following warning:
CC vcs-svn/sliding_window.o
vcs-svn/sliding_window.c: In function `check_overflow':
vcs-svn/sliding_window.c:36: warning: comparison is always false \
due to limited range of data type
The warning appears even when gcc is run without any warning flags
(this is gcc bug 12963). In later versions the same warning can be
reproduced with -Wtype-limits, which is implied by -Wextra.
On 64-bit architectures it really is possible for a size_t not to be
representable as an off_t so the check this is warning about is not
actually redundant. But even false positives are distracting. Avoid
the warning by making the "len" argument to check_overflow a
uintmax_t; no functional change intended.
Reported-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no reason in principle that an svn-format dump would not be
able to represent a file whose length does not fit in a 32-bit
integer. Use off_t consistently to represent file lengths (in place
of using uint32_t in some contexts) so we can handle that.
Most svn-fe code is already ready to do that without this patch and
passes values of type off_t around. The type mismatch from stragglers
was noticed with gcc -Wtype-limits.
While at it, tighten the parsing of the Text-content-length field to
make sure it is a number and does not overflow, and tighten other
overflow checks as that value is passed around and manipulated.
Inspired-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>