log --use-mailmap: optimize for cases without --author/--committer search
When we taught the commit_match() mechanism to pay attention to the
new --use-mailmap option, we started to unconditionally copy the
commit object to a temporary buffer, just in case we need the author
and committer lines updated via the mailmap mechanism, and rewrite
author and committer using the mailmap.
It turns out that this has rather unpleasant performance
implications. In the linux kernel repository, running
$ git log --author='Junio C Hamano' --pretty=short >/dev/null
under /usr/bin/time, with and without --use-mailmap (the .mailmap
file is 118 entries long, the particular author does not appear in
it), cost (with warm cache):
The latter case is an unnecessary performance regression. We may
want to _show_ the result with mailmap applied, but we do not have
to copy and rewrite the author/committer of all commits we try to
match if we do not query for these fields.
Trivially optimize this performance regression by limiting the
rewrites to the case where we are matching with author/committer fields.
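
A minimal sketch of the idea, written in Python rather than git's C. The names commit_match(), rewrite_idents(), the (field, needle) pattern tuples, and the dictionary-shaped mailmap are all hypothetical; the only point illustrated is that the copy-and-rewrite step is skipped unless some pattern actually looks at the author or committer header.

```python
# Hypothetical sketch, not git's actual implementation.
HEADER_FIELDS = {"author", "committer"}


def rewrite_idents(commit_text, mailmap):
    """Return a copy of commit_text with author/committer idents mapped."""
    header, sep, body = commit_text.partition("\n\n")
    lines = []
    for line in header.split("\n"):
        field, _, ident = line.partition(" ")
        if field in HEADER_FIELDS:
            for old, new in mailmap.items():
                if ident.startswith(old):
                    ident = new + ident[len(old):]
                    break
            line = field + " " + ident
        lines.append(line)
    return "\n".join(lines) + sep + body


def commit_match(commit_text, patterns, use_mailmap, mailmap):
    """patterns: list of (field, needle); field is 'author', 'committer' or 'body'."""
    # Only pay for the copy and rewrite when a header field is being queried.
    needs_rewrite = use_mailmap and any(f in HEADER_FIELDS for f, _ in patterns)
    text = rewrite_idents(commit_text, mailmap) if needs_rewrite else commit_text
    header, _, body = text.partition("\n\n")
    for field, needle in patterns:
        if field == "body":
            if needle not in body:
                return False
        elif not any(l.startswith(field + " ") and needle in l
                     for l in header.split("\n")):
            return False
    return True
```

With this shape, a search that only greps the message body never touches the mailmap, even when --use-mailmap is in effect.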
Simplify map_user(), mostly to avoid copies of string buffers. It
also simplifies caller functions.
map_user() directly receives pointers and length from the commit buffer
as mail and name. If mapping of the user and mail can be done, the
pointer is updated to a new location. Lengths are also updated if
necessary.
The caller of map_user() can then copy the new email and name if
necessary.
Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In map_user(), we have an email pointer that points at the beginning of
an e-mail address, but the buffer is not terminated with a NUL after
the e-mail address. It typically has ">" after the address, and it
could have even more if it comes from author/committer line in a
commit object. Or it may not have ">" after it.
We used to copy the e-mail address proper into a temporary buffer
before asking the string-list API to find the e-mail address in the
mailmap, because string_list_lookup() function only takes a NUL
terminated full string.
Introduce a helper function lookup_prefix that takes the email
pointer and the length, and finds a matching entry in the string
list used for the mailmap, by doing the following:
- First ask string_list_find_insert_index() where in its sorted
list the e-mail address we have (including the possible trailing
junk ">...") would be inserted.
- It could find an exact match (e.g. we had a clean e-mail address
without any trailing junk). We can return the item in that case.
- Or it could return the index of an item that sorts after the
e-mail address we have.
- If we did not find an exact match against a clean e-mail address,
then the record we are looking for in the mailmap has to exist
before the index returned by the function (i.e. "email>junk"
always sorts later than "email"). Iterate, starting from that
index, down the map->items[] array until we find the exact record
we are looking for, or we see a record with a key that definitely
sorts earlier than the e-mail we are looking for (i.e. when we
are looking for "email" in "email>junk", a record in the mailmap
that begins with "emaik" strictly sorts before "email", if such a
key existed in the mailmap).
This, together with the earlier enhancement to support
case-insensitive sorting, allows us to remove an extra copy of the email
buffer to downcase it.
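
The lookup described above can be sketched as follows. This is an illustration in Python, not the C helper itself: the sorted list of plain strings stands in for map->items[], bisect.bisect_left() stands in for string_list_find_insert_index(), and the comparison is case-sensitive for brevity (the real list is sorted case-insensitively).

```python
import bisect


def lookup_prefix(keys, buf, length):
    """Find buf[:length] in the sorted list keys; return its index or None.

    buf starts at the e-mail address and may carry trailing junk such as
    ">..." after the first `length` characters.
    """
    email = buf[:length]
    i = bisect.bisect_left(keys, buf)
    exact = i < len(keys) and keys[i] == buf
    if length == len(buf):
        # Clean address without trailing junk: exact match or nothing.
        return i if exact else None
    # "email>junk" sorts after "email", so the wanted record, if any,
    # sits before the insertion point. Walk down from there.
    i -= 1
    while i >= 0:
        key = keys[i]
        if key[:length] < email:
            # This key definitely sorts before the address; give up.
            break
        if key == email:
            return i
        i -= 1
    return None
```

For example, with keys ["alice@example.com", "bob@example.com", "bob@example.com.au"] and buf "bob@example.com> 123", the insertion point is past the end of the list, and the downward walk skips "bob@example.com.au" before landing on "bob@example.com".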
A part of this is based on Antoine Pelisse's previous work.
Some string lists need to be searched case insensitively, and for
that to work correctly, the strings need to be sorted case
insensitively from the beginning.
Allow a custom comparison function to be defined on a string list
instance and use it throughout in place of strcmp().
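
A minimal Python sketch of a sorted string list with a per-instance comparison function. The class and method names here (StringList, insert, has_string) are illustrative only and do not mirror the C API; the point is that insertion and lookup must go through the same comparison.

```python
def default_cmp(a, b):
    """strcmp()-like three-way comparison."""
    return (a > b) - (a < b)


def casefold_cmp(a, b):
    """Case-insensitive three-way comparison."""
    return default_cmp(a.lower(), b.lower())


class StringList:
    def __init__(self, cmp=default_cmp):
        self.items = []
        self.cmp = cmp  # used for both insertion and lookup

    def _find(self, s):
        """Binary search; return (index, found)."""
        lo, hi = 0, len(self.items)
        while lo < hi:
            mid = (lo + hi) // 2
            c = self.cmp(s, self.items[mid])
            if c == 0:
                return mid, True
            if c < 0:
                hi = mid
            else:
                lo = mid + 1
        return lo, False

    def insert(self, s):
        index, found = self._find(s)
        if not found:
            self.items.insert(index, s)

    def has_string(self, s):
        return self._find(s)[1]
```

If the list were sorted with the default comparison but searched case-insensitively, the binary search could look in the wrong half, which is why the comparison has to be fixed on the instance from the beginning.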
Makefile: whitespace style fixes in macro definitions
Consistently use a single space before and after the "=" (or ":=", "+=",
etc.) in assignments to make macros. Granted, this was not a big deal,
but I did find the needless inconsistency quite distracting.
Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git(1): remove a defunct link to "list of authors"
The linked page has not been showing the promised "more complete
list" for more than 6 months by now, and nobody has resurrected
the list there nor elsewhere since then.
Documentation/diff-config: work around AsciiDoc misfortune
The line that happens to begin with indent followed by "3. " was
interpreted as if it was an enumerated list; just wrap the lines
differently to work it around for now.
Various codepaths checked if two encoding names are the same using
ad-hoc code and some of them ended up asking iconv() to convert
between "utf8" and "UTF-8". The former is not a valid way to spell
the encoding name, but often people use it by mistake, and we
equated them in some but not all codepaths. Introduce a new helper
function to make these codepaths consistent.
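
As a rough illustration of such a helper, here is a Python sketch. It assumes that a missing encoding name means UTF-8 and that names are otherwise compared case-insensitively; both are assumptions of this sketch, and is_encoding_utf8() is named here only for illustration.

```python
def is_encoding_utf8(name):
    # Assumption: no name given means the default, UTF-8.
    return name is None or name.lower() in ("utf-8", "utf8")


def same_encoding(src, dst):
    """True if src and dst name the same encoding, so no conversion is needed."""
    if is_encoding_utf8(src) and is_encoding_utf8(dst):
        return True
    if src is None or dst is None:
        return False
    return src.lower() == dst.lower()
```

A caller would check same_encoding() first and only hand the text to the converter when it returns False.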
* jc/same-encoding:
reencode_string(): introduce and use same_encoding()
Merge branch 'lt/diff-stat-show-0-lines' into maint
"git diff --stat" miscounted the total number of changed lines when
binary files were involved and hidden beyond --stat-count. It also
miscounted the total number of changed files when there were
unmerged paths.
* lt/diff-stat-show-0-lines:
t4049: refocus tests
diff --shortstat: do not count "unmerged" entries
diff --stat: do not count "unmerged" entries
diff --stat: move the "total count" logic to the last loop
diff --stat: use "file" temporary variable to refer to data->files[i]
diff --stat: status of unmodified pair in diff-q is not zero
test: add failing tests for "diff --stat" to t4049
Fix "git diff --stat" for interesting - but empty - file changes
Translate 825 new messages that came from the git.pot update in cc76011 ("l10n: Update git.pot (825 new, 24 removed messages)").
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com> Reviewed-by: Thomas Rast <trast@student.ethz.ch> Helped-by: Michael J Gruber <git@drmicha.warpmail.net>
status: respect advice.statusHints for ahead/behind advice
If the user has unset advice.statusHints, we already
suppress the "use git reset to..." hints in each stanza. The
new "use git push to publish..." hint is the same type of
hint. Let's respect statusHints for it, rather than making
the user set yet another advice flag.
Signed-off-by: Jeff King <peff@peff.net> Acked-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Enclose tests in single quotes as opposed to double quotes. This is
the prevalent style in other tests.
- Remove the unused variable $head4_full.
- Indent the expected output so that it lines up with the rest of the
test text.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of "cd there and then come back", use the "cd there in a
subshell" pattern. Also fix '&&' chaining in one place.
Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git rev-list --max-count=1 HEAD` is a roundabout way of saying `git
rev-parse --verify HEAD`; replace a bunch of instances of the former
with the latter. Also, don't unnecessarily `cut -c1-7` the rev-parse
output when the `--short` option is available.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fc/remote-hg: (22 commits)
remote-hg: fix for older versions of python
remote-hg: fix for files with spaces
remote-hg: avoid bad refs
remote-hg: try the 'tip' if no checkout present
remote-hg: fix compatibility with older versions of hg
remote-hg: add missing config for basic tests
remote-hg: the author email can be null
remote-hg: add option to not track branches
remote-hg: add extra author test
remote-hg: add tests to compare with hg-git
remote-hg: add bidirectional tests
test-lib: avoid full path to store test results
remote-hg: add basic tests
remote-hg: fake bookmark when there's none
remote-hg: add compat for hg-git author fixes
remote-hg: add support for hg-git compat mode
remote-hg: match hg merge behavior
remote-hg: make sure the encoding is correct
remote-hg: add support to push URLs
remote-hg: add support for remote pushing
...
* km/send-email-remove-cruft-in-address:
git-send-email: allow edit invalid email address
git-send-email: ask what to do with an invalid email address
git-send-email: remove invalid addresses earlier
git-send-email: fix fallback code in extract_valid_address()
git-send-email: remove garbage after email address
General clean-ups in various areas, originally written to support a
patch that later turned out to be unneeded.
* jk/send-email-sender-prompt:
t9001: check send-email behavior with implicit sender
t: add tests for "git var"
ident: keep separate "explicit" flags for author and committer
ident: make user_ident_explicitly_given static
t7502: factor out autoident prerequisite
test-lib: allow negation of prerequisites
Clean up completion tests. Use of the consolidated helper may make
instrumenting one particular test during debugging of the test
itself harder, but I think that issue should be addressed in some other
way (e.g. making sure individual tests in 9902 can be skipped).
* fc/completion-test-simplification:
completion: simplify __gitcomp() test helper
completion: refactor __gitcomp related tests
completion: consolidate test_completion*() tests
completion: simplify tests using test_completion_long()
completion: standardize final space marker in tests
completion: add comment for test_completion()
git-fast-import.txt: improve documentation for quoted paths
The documentation mentioned only newlines and double quotes as
characters needing escaping, but the backslash also needs it. Also, the
documentation was not clearly saying that double quotes around the file
name were required (double quotes in the examples could be interpreted as
part of the sentence, not part of the actual string).
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-remote-mediawiki: escape ", \, and LF in file names
A MediaWiki page can contain, and even start with, a " character, so we
have to escape it when generating the fast-export stream, as well as the \
character. While we're there, also escape newlines, although I don't think
we can get them from MediaWiki pages.
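
The escaping described above can be sketched in Python as follows (the actual script is written in Perl; quote_path() is a hypothetical name used only for this illustration). The backslash has to be escaped first so that the backslashes introduced for the other two characters are not doubled.

```python
def quote_path(path):
    """Quote a file name for a fast-import stream."""
    escaped = (path.replace("\\", "\\\\")   # backslash first
                   .replace('"', '\\"')     # then double quote
                   .replace("\n", "\\n"))   # then LF
    return '"' + escaped + '"'
```

For instance, a page named "Foo" (with the quotes in its title) would be emitted with each quote preceded by a backslash and the whole name wrapped in another pair of double quotes.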
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The primary thing Linus's patch wanted to change was to make sure
that 0-line change appears for a mode-only change. Update the
first test to chmod a file that we can see in the output (limited
by --stat-count) to demonstrate it. Also make sure to use test_chmod
and compare the index and the tree, so that we can run this test
even on a filesystem without permission bits.
The latter two tests are about fixes to separate issues that were
introduced and/or uncovered by Linus's patch as a side effect, but
the issues are not related to mode-only changes. Remove chmod from
the tests.
t9001: check send-email behavior with implicit sender
We allow send-email to use an implicitly-defined identity
for the sender (because there is still a confirmation step),
but we abort when we cannot generate such an identity. Let's
make sure that we test this.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not currently have any explicit tests for "git var" at
all (though we do exercise it to some degree as a part of
other tests). Let's add a few basic sanity checks.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/git-push.txt: clarify the "push from satellite" workflow
The context of the example to push into refs/remotes/satellite/
hierarchy of the other repository needs to be spelled out explicitly
for the value of this example to be fully appreciated. Make it so.
As Amit Bakshi reported, older versions of python (< 2.7) don't have
subprocess.check_output, so let's use subprocess.Popen directly as
suggested.
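
A minimal sketch of the replacement, assuming the caller only needs the command's standard output. The function name run_and_capture() is made up for this example, and the return-code check is an addition of the sketch to mimic what check_output does.

```python
import subprocess
import sys


def run_and_capture(args):
    """Return the stdout of args without relying on subprocess.check_output."""
    process = subprocess.Popen(args, stdout=subprocess.PIPE)
    output, _ = process.communicate()
    if process.returncode != 0:
        raise subprocess.CalledProcessError(process.returncode, args)
    return output


if __name__ == "__main__":
    # Usage example: run the current interpreter as a stand-in command.
    print(run_and_capture([sys.executable, "-c", "print('ok')"]))
```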
Suggested-by: Amit Bakshi <ambakshi@gmail.com> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge branch 'jh/update-ref-d-through-symref' into maint
* jh/update-ref-d-through-symref:
Fix failure to delete a packed ref through a symref
t1400-update-ref: Add test verifying bug with symrefs in delete_ref()
Even though we show a separate *UNMERGED* entry in the patch and
diffstat output (or in the --raw format, for that matter) in
addition to and separately from the diff against the specified stage
(defaulting to #2) for unmerged paths, they should not be counted in
the total number of files affected---that would lead to counting the
same path twice.
The separation done by the previous step makes this fix simple and
straightforward. Among the filepairs in diff_queue, paths that
weren't modified and the extra "unmerged" entries do not count toward
the total number of files.
diff --stat: move the "total count" logic to the last loop
The diffstat generation logic, with --stat-count limit, is
implemented as three loops.
- The first counts the width necessary to show stats up to
specified number of entries, and notes up to how many entries in
the data we need to iterate to show the graph;
- The second iterates that many times to draw the graph, adjusts
the number of "total modified files", and counts the total
added/deleted lines for the part that was shown in the graph;
- The third iterates over the remainder and only does the part to
count "total added/deleted lines" and to adjust "total modified
files" without drawing anything.
Move the logic to count added/deleted lines and modified files from
the second loop to the third loop.
This incidentally fixes a bug. The third loop was not filtering
binary changes (counted in bytes) from the total added/deleted as it
should. The second loop implemented this correctly, so if a binary
change appeared earlier than the --stat-count cutoff, the code
counted number of added/deleted lines correctly, but if it appeared
beyond the cutoff, the number of lines would have mixed with the
byte count in the buggy third loop.
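
To make the bug concrete, here is a small Python model of the totals computation before and after the move. The dictionary layout of each file entry and both function names are invented for this sketch; it models only the added/deleted totals, not the drawing or the file count.

```python
def totals_before(files, stat_count):
    """Old shape: totals split across the second and third loops."""
    adds = dels = 0
    for f in files[:stat_count]:        # second loop: entries that are drawn
        if not f["is_binary"]:
            adds += f["added"]
            dels += f["deleted"]
    for f in files[stat_count:]:        # third loop: forgot the binary check
        adds += f["added"]
        dels += f["deleted"]
    return adds, dels


def totals_after(files):
    """New shape: one loop counts everything and filters binary changes."""
    adds = dels = 0
    for f in files:
        if f["is_binary"]:
            continue                    # byte counts are not line counts
        adds += f["added"]
        dels += f["deleted"]
    return adds, dels
```

With a text change of 3 added and 1 deleted lines followed by a 2048-byte binary change, the old shape gives the right answer when both entries fit under the cutoff but mixes the byte count into the line total when the binary entry falls beyond it.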
test: add failing tests for "diff --stat" to t4049
There are a few problems in diff.c around the --stat area, partially
caused by the recent 74faaa1 (Fix "git diff --stat" for interesting
- but empty - file changes, 2012-10-17), and largely caused by the
earlier change that introduced --stat-count.
Add a few test pieces to t4049 to expose the issues.
tcsh users sometimes alias the 'git' command to another name. In
this case, the user expects to only have to issue a new 'complete'
command using the alias name.
However, the tcsh script currently uses the command typed by the
user to call the appropriate function in git-completion.bash, either
_git() or _gitk(). When using an alias, this technique no longer
works.
This change specifies the real name of the command (either 'git' or
'gitk') as a parameter to the script handling tcsh completion. This
allows the user to use any alias for the 'git' or 'gitk' commands,
while still getting completion to work.
A check for the presence of ${HOME}/.git-completion.bash is also
added to help the user make use of the script properly.
Signed-off-by: Marc Khouzam <marc.khouzam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Further suppose that the other person already pushed changes leading to
A back to the original repository you two obtained the original commit
X.
which doesn't parse for me; I've changed it to
Further suppose that the other person already pushed changes leading to
A back to the original repository from which you two obtained the
original commit X.
Signed-off-by: Mark Szepieniec <mszepien@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases the user may want to send email with a "Cc:" line that has
an email address we cannot extract. Now we allow the user to extract
such an email address for us.
Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-send-email: ask what to do with an invalid email address
We used to warn about invalid emails and just drop them. Such warnings
can go unnoticed by the user, or be noticed only after sending email when
we are not giving the "final sanity check [Y/n]?"
Now we quit by default.
Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some addresses are passed twice to unique_email_list() and invalid addresses
may be reported twice per send_message. Now we warn about them earlier
and we also remove invalid addresses.
This also removes the use of undefined values for string comparison
for invalid addresses in cc list processing.
Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
send-email: avoid questions when user has an ident
Currently we keep getting questions even when the user has properly
configured his full name and email address:
Who should the emails appear to be from?
[Felipe Contreras <felipe.contreras@gmail.com>]
And once a question pops up, other questions are turned on. This is
annoying.
The reason it's safe to avoid this question is because currently the
script fails completely when the author (or committer) is not correct,
so we won't even be reaching this point in the code.
The scenarios, and the current situation:
1) No information at all, no fully qualified domain name
fatal: empty ident name (for <felipec@nysa.(none)>) not allowed
2) Only full name
fatal: unable to auto-detect email address (got 'felipec@nysa.(none)')
3) Full name + fqdn
Who should the emails appear to be from?
[Felipe Contreras <felipec@nysa.felipec.org>]
4) Full name + EMAIL
Who should the emails appear to be from?
[Felipe Contreras <felipe.contreras@gmail.com>]
5) User configured
6) GIT_COMMITTER
7) GIT_AUTHOR
All these are the same as 4)
After this patch:
1) 2) won't change: git send-email would still die
4) 5) 6) 7) will change: git send-email won't ask the user
This is good, that's what we would expect, because the identity is
explicit.
3) will change: git send-email won't ask the user
This is bad, because we will try with an address such as
'felipec@nysa.felipec.org', which is most likely not what the user
wants, but the user will get warned by default (confirm=auto), and if
not, most likely the sending won't work, which the user would readily
note and fix.
The worst possible scenario is that such mail address does work, and the
user sends an email from that address unintentionally, when in fact the
user expected to correct that address in the prompt. This is a very,
very, very unlikely scenario, with many dependencies:
1) No configured user.name/user.email
2) No specified $EMAIL
3) No configured sendemail.from
4) No specified --from argument
5) A fully qualified domain name
6) A full name in the gecos field
7) A sendmail configuration that allows sending from this domain name
8) confirm=never, or
8.1) confirm configuration not hitting, or
8.2) Getting the error, not being aware of it
9) The user expecting to correct this address in the prompt
In a more likely scenario where 7) is not the case (can't send from
nysa.felipec.org), the user will simply see the mail was not sent
properly, and fix the problem.
The much more likely scenario though, is where 5) is not the case
(nysa.(none)), and git send-email will fail right away like it does now.
So the likelihood of this affecting anybody seriously is very very slim,
and the chances of this affecting somebody slightly are still very
small. The vast majority, if not all, of git users won't be affected
negatively, and a lot will benefit from this.
Tests-by: Jeff King <peff@peff.net> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
git p4: catch p4 errors when streaming file contents
Error messages that arise during the "p4 print" phase of
generating commits were silently ignored. Catch them,
abort the fast-import, and exit.
Without this fix, the sync/clone appears to work, but files that
are inaccessible by the p4d server will still be imported to git,
although without the proper contents. Instead the errant files
will contain a p4 error message, such as "Librarian checkout
//depot/path failed".
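
A simplified Python sketch of the check. It assumes each record is a dictionary decoded from p4's marshaled output, with a "code" key that is "error" for failures and a "data" key carrying either file content or the error text; that record shape, the exception class, and the function name are assumptions of this sketch rather than a description of the actual script.

```python
class P4StreamError(Exception):
    """Raised when p4 reports an error while file contents are streamed."""


def collect_file_contents(records):
    """Gather contents per depot file, aborting on the first p4 error."""
    contents = {}
    current = None
    for record in records:
        if record.get("code") == "error":
            # Do not store the error text as if it were file content.
            raise P4StreamError(record.get("data", "unknown p4 error"))
        if "depotFile" in record:
            current = record["depotFile"]
            contents[current] = b""
        elif current is not None:
            contents[current] += record.get("data", b"")
    return contents
```

The caller would catch the exception, stop feeding fast-import, and exit with a failure status instead of committing a file whose body is an error message.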
Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>