git-range-diff(1)
=================

NAME
----
git-range-diff - Compare two commit ranges (e.g. two versions of a branch)

SYNOPSIS
--------
[verse]
'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
        [--no-dual-color] [--creation-factor=<percent>]
        ( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )

DESCRIPTION
-----------

This command shows the differences between two versions of a patch
series, or more generally, two commit ranges (ignoring merge commits).

To that end, it first finds pairs of commits from both commit ranges
that correspond with each other. Two commits are said to correspond when
the diff between their patches (i.e. the author information, the commit
message and the commit diff) is reasonably small compared to the
patches' size. See ``Algorithm`` below for details.

Finally, the list of matching commits is shown in the order of the
second commit range, with unmatched commits being inserted just after
all of their ancestors have been shown.


OPTIONS
-------
--no-dual-color::
        When the commit diffs differ, `git range-diff` recreates the
        original diffs' coloring, and adds outer -/+ diff markers with
        the *background* being red/green to make it easier to see e.g.
        when there was a change in what exact lines were added.
+
Additionally, the commit diff lines that are only present in the first commit
range are shown "dimmed" (this can be overridden using the `color.diff.<slot>`
config setting where `<slot>` is one of `contextDimmed`, `oldDimmed` and
`newDimmed`), and the commit diff lines that are only present in the second
commit range are shown in bold (which can be overridden using the config
settings `color.diff.<slot>` with `<slot>` being one of `contextBold`,
`oldBold` or `newBold`).
+
This is known to `range-diff` as "dual coloring". Use `--no-dual-color`
to revert to coloring all lines according to the outer diff markers
(and completely ignore the inner diff when it comes to color).

--creation-factor=<percent>::
        Set the creation/deletion cost fudge factor to `<percent>`.
        Defaults to 60. Try a larger value if `git range-diff` erroneously
        considers a large change a total rewrite (deletion of one commit
        and addition of another), and a smaller one in the reverse case.
        See the ``Algorithm`` section below for an explanation of why this
        is needed.

<range1> <range2>::
        Compare the commits specified by the two ranges, where
        `<range1>` is considered an older version of `<range2>`.

<rev1>...<rev2>::
        Equivalent to passing `<rev2>..<rev1>` and `<rev1>..<rev2>`.

<base> <rev1> <rev2>::
        Equivalent to passing `<base>..<rev1>` and `<base>..<rev2>`.
        Note that `<base>` does not need to be the exact branch point
        of the branches. Example: after rebasing a branch `my-topic`,
        `git range-diff my-topic@{u} my-topic@{1} my-topic` would
        show the differences introduced by the rebase.
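
The equivalences between the three argument forms can be illustrated
with a small Python sketch. It is not Git's implementation; the helper
name `normalize_ranges` and the branch names `main` and `topic` are
made up for this example, and the sketch only rewrites strings without
checking that the revisions exist.

```python
def normalize_ranges(args):
    """Map the three accepted argument forms onto two `X..Y` range strings.

    Hypothetical helper for illustration only; it mirrors the equivalences
    listed above and does not validate the revisions.
    """
    if len(args) == 3:
        # <base> <rev1> <rev2>  ->  <base>..<rev1> and <base>..<rev2>
        base, rev1, rev2 = args
        return f"{base}..{rev1}", f"{base}..{rev2}"
    if len(args) == 2:
        # <range1> <range2> are used as given
        return args[0], args[1]
    if len(args) == 1 and "..." in args[0]:
        # <rev1>...<rev2>  ->  <rev2>..<rev1> and <rev1>..<rev2>
        rev1, rev2 = args[0].split("...", 1)
        return f"{rev2}..{rev1}", f"{rev1}..{rev2}"
    raise ValueError("expected <range1> <range2>, <rev1>...<rev2>, "
                     "or <base> <rev1> <rev2>")


print(normalize_ranges(["main...topic"]))
print(normalize_ranges(["my-topic@{u}", "my-topic@{1}", "my-topic"]))
```

The second call corresponds to the rebase example above: both ranges
start at the upstream branch, the first ending at the pre-rebase tip and
the second at the current tip.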

`git range-diff` also accepts the regular diff options (see
linkgit:git-diff[1]), most notably the `--color=[<when>]` and
`--no-color` options. These options are used when generating the "diff
between patches", i.e. to compare the author, commit message and diff of
corresponding old/new commits. There is currently no means to tweak the
diff options passed to `git log` when generating those patches.

OUTPUT STABILITY
----------------

The output of the `range-diff` command is subject to change. It is
intended to be human-readable porcelain output, not something that can
be used across versions of Git to get a textually stable `range-diff`
(as opposed to something like the `--stable` option to
linkgit:git-patch-id[1]). There's also no equivalent of
linkgit:git-apply[1] for `range-diff`; the output is not intended to
be machine-readable.

This is particularly true when passing in diff options. Currently some
options like `--stat` can, as an emergent effect, produce output
that's quite useless in the context of `range-diff`. Future versions
of `range-diff` may learn to interpret such options in a manner
specific to `range-diff` (e.g. for `--stat` producing human-readable
output which summarizes how the diffstat changed).

CONFIGURATION
-------------
This command uses the `color.diff.*` and `pager.range-diff` settings
(the latter is on by default).
See linkgit:git-config[1].


EXAMPLES
--------

When a rebase required merge conflicts to be resolved, compare the changes
introduced by the rebase directly afterwards using:

------------
$ git range-diff @{u} @{1} @
------------


A typical output of `git range-diff` would look like this:

------------
-:  ------- > 1:  0ddba11 Prepare for the inevitable!
1:  c0debee = 2:  cab005e Add a helpful message at the start
2:  f00dbal ! 3:  decafe1 Describe a bug
    @@ -1,3 +1,3 @@
     Author: A U Thor <author@example.com>

    -TODO: Describe a bug
    +Describe a bug
    @@ -324,5 +324,6
      This is expected.

    -+What is unexpected is that it will also crash.
    ++Unexpectedly, it also crashes. This is a bug, and the jury is
    ++still out there how to fix it best. See ticket #314 for details.

      Contact
3:  bedead < -:  ------- TO-UNDO
------------

In this example, there are 3 old and 3 new commits, where the developer
removed the 3rd, added a new one before the first two, and modified the
commit message of the 2nd commit as well as its diff.

When the output goes to a terminal, it is color-coded by default, just
like regular `git diff`'s output. In addition, the first line (adding a
commit) is green, the last line (deleting a commit) is red, the second
line (with a perfect match) is yellow like the commit header of `git
show`'s output, and the third line colors the old commit red, the new
one green and the rest like `git show`'s commit header.

A naive color-coded diff of diffs is actually a bit hard to read,
though, as it colors the entire lines red or green. The line that added
"What is unexpected" in the old commit, for example, is completely red,
even if the intent of the old commit was to add something.

To help with that, `range-diff` uses the `--dual-color` mode by default.
In this mode, the diff of diffs will retain the original diff colors, and
prefix the lines with -/+ markers that have their *background* red or
green, to make it more obvious that they describe how the diff itself
changed.


Algorithm
---------

The general idea is this: we generate a cost matrix between the commits
in both commit ranges, then solve the least-cost assignment.

The cost matrix is populated as follows: for each pair of commits, both
diffs are generated, then the "diff of diffs" is generated with 3 context
lines, and the number of lines in that diff is used as the cost.
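
As a rough illustration of this cost, the following Python sketch counts
the lines of a unified diff, with 3 context lines, between two patch
texts. It is only an approximation under stated assumptions: it uses
Python's `difflib` rather than Git's diff machinery, it counts the
`---`/`+++` and `@@` header lines as well, and the function name
`diff_of_diffs_cost` and the sample patches are invented for the example.

```python
import difflib


def diff_of_diffs_cost(patch_a, patch_b):
    """Count the lines of a unified diff (3 context lines) between two patches.

    Assumption: header lines emitted by difflib are counted too, which may
    differ from what Git actually counts.
    """
    lines_a = patch_a.splitlines()
    lines_b = patch_b.splitlines()
    delta = difflib.unified_diff(lines_a, lines_b, n=3, lineterm="")
    return sum(1 for _ in delta)


# Hypothetical patch texts (commit message plus one added line each).
old_patch = "Describe a bug\n\n+What is unexpected is that it will also crash.\n"
new_patch = "Describe a bug\n\n+Unexpectedly, it also crashes.\n"

print(diff_of_diffs_cost(old_patch, old_patch))  # identical patches: 0
print(diff_of_diffs_cost(old_patch, new_patch))  # differing patches: > 0
```

Identical patches produce an empty diff of diffs and therefore a cost of
zero, which is why an unmodified cherry-pick is matched for "free".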

To avoid false positives (e.g. when a patch has been removed, and an
unrelated patch has been added between two iterations of the same patch
series), the cost matrix is extended to allow for that, by adding
fixed-cost entries for wholesale deletes/adds.

Example: Let commits `1--2` be the first iteration of a patch series and
`A--C` the second iteration. Let's assume that `A` is a cherry-pick of
`2`, and `C` is a cherry-pick of `1` but with a small modification (say,
a fixed typo). Visualize the commits as a bipartite graph:

------------
    1            A

    2            B

                 C
------------

We are looking for a "best" explanation of the new series in terms of
the old one. We can represent an "explanation" as an edge in the graph:


------------
    1            A
               /
    2 --------'  B

                 C
------------

This explanation comes for "free" because there was no change. Similarly
`C` could be explained using `1`, but that comes at some cost c>0
because of the modification:

------------
    1 ----.      A
          |    /
    2 ----+---'  B
          |
          `----- C
          c>0
------------

In mathematical terms, what we are looking for is some sort of a minimum
cost bipartite matching; `1` is matched to `C` at some cost, etc. The
underlying graph is in fact a complete bipartite graph; the cost we
associate with every edge is the size of the diff between the two
commits' patches. To also explain new commits, we introduce dummy nodes
on both sides:

------------
    1 ----.      A
          |    /
    2 ----+---'  B
          |
    o     `----- C
          c>0
    o            o

    o            o
------------

The cost of an edge `o--C` is the size of `C`'s diff, modified by a
fudge factor that should be smaller than 100%. An edge `o--o` is free.
The fudge factor is necessary because even if `1` and `C` have nothing
in common, they may still share a few empty lines and such, possibly
making the assignment `1--C`, `o--o` slightly cheaper than `1--o`,
`o--C`. With the fudge factor we require a much larger common part to
consider patches as corresponding.
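
The effect of the fudge factor can be sketched for a single pair of
commits. This is a simplified illustration, not Git's code: it assumes
patch sizes are measured in lines, that the factor is applied as an
integer percentage, and that only the two alternatives `1--C`, `o--o`
versus `1--o`, `o--C` compete. The function names and numbers are made
up for the example.

```python
def creation_cost(patch_size, factor_percent=60):
    """Cost of an `o--X` edge: the patch size scaled by the fudge factor.

    Assumption: sizes are line counts and the scaling uses integer division.
    """
    return patch_size * factor_percent // 100


def prefers_match(match_cost, size_old, size_new, factor_percent=60):
    """True when pairing two commits is cheaper than a deletion plus a creation."""
    unmatched = (creation_cost(size_old, factor_percent)
                 + creation_cost(size_new, factor_percent))
    return match_cost < unmatched


# Two hypothetical 20-line patches whose diff of diffs has 30 lines.
print(prefers_match(30, 20, 20, factor_percent=60))   # 30 < 12 + 12 is False
print(prefers_match(30, 20, 20, factor_percent=100))  # 30 < 20 + 20 is True
```

With the default of 60 the heavily modified pair is reported as a
deletion plus an addition; raising the factor makes the same pair count
as corresponding, which matches the advice given for
`--creation-factor`.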

The overall time needed to run this algorithm is the time needed to
compute n+m commit diffs and then n*m diffs of patches, plus the time
needed to compute the least-cost assignment between n and m diffs. Git
uses an implementation of the Jonker-Volgenant algorithm to solve the
assignment problem, which has cubic runtime complexity. For the example
above, the matching found will look like this:

------------
    1 ----.      A
          |    /
    2 ----+---'  B
       .--+-----'
    o -'  `----- C
          c>0
    o ---------- o

    o ---------- o
------------
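
The whole example can be reproduced with a short Python sketch. It
builds the extended cost matrix with dummy nodes and then finds the
cheapest assignment by brute force over all permutations, which is only
workable for tiny inputs and is not the Jonker-Volgenant algorithm that
Git uses. All pairwise costs and patch sizes below are hypothetical
numbers chosen so that `2--A` is free and `1--C` is cheap.

```python
from itertools import permutations


def extended_cost_matrix(pair_cost, size_old, size_new, factor_percent=60):
    """Square matrix: old commits plus dummies (rows) x new commits plus dummies."""
    n_old, n_new = len(size_old), len(size_new)
    n = n_old + n_new
    cost = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i < n_old and j < n_new:
                cost[i][j] = pair_cost[i][j]          # real pair: diff of diffs
            elif i < n_old:
                # old commit matched to a dummy, i.e. deleted
                cost[i][j] = size_old[i] * factor_percent // 100
            elif j < n_new:
                # dummy matched to a new commit, i.e. created
                cost[i][j] = size_new[j] * factor_percent // 100
            # dummy-to-dummy edges (`o--o`) stay at 0
    return cost


def cheapest_assignment(cost):
    """Brute-force least-cost assignment; returns the column chosen per row."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda cols: sum(cost[i][cols[i]] for i in range(n)))


old_names, new_names = ["1", "2"], ["A", "B", "C"]
pair_cost = [[50, 45, 4],    # 1 vs A, B, C (C is 1 with a typo fixed)
             [0, 55, 50]]    # 2 vs A, B, C (A is a cherry-pick of 2)
cost = extended_cost_matrix(pair_cost, size_old=[20, 30], size_new=[30, 25, 20])
cols = cheapest_assignment(cost)
total = sum(cost[i][cols[i]] for i in range(len(cost)))

# Old commits assigned to a dummy column would map to None (deleted).
matches = {old_names[i]: (new_names[cols[i]] if cols[i] < len(new_names) else None)
           for i in range(len(old_names))}
print(matches)  # {'1': 'C', '2': 'A'}
print(total)    # 19: 4 for 1--C, 0 for 2--A, 15 for o--B
```

`B` is not matched to any old commit, so it is explained by a dummy node,
just as in the final diagram.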


SEE ALSO
--------
linkgit:git-log[1]

GIT
---
Part of the linkgit:git[1] suite