git-range-diff(1)
=================

NAME
----
git-range-diff - Compare two commit ranges (e.g. two versions of a branch)

SYNOPSIS
--------
[verse]
'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
        [--dual-color] [--creation-factor=<percent>]
        ( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )

DESCRIPTION
-----------

This command shows the differences between two versions of a patch
series, or more generally, two commit ranges (ignoring merge commits).

To that end, it first finds pairs of commits from both commit ranges
that correspond to each other. Two commits are said to correspond when
the diff between their patches (i.e. the author information, the commit
message and the commit diff) is reasonably small compared to the
patches' size. See ``Algorithm`` below for details.

Finally, the list of matching commits is shown in the order of the
second commit range, with unmatched commits being inserted just after
all of their ancestors have been shown.


OPTIONS
-------
--dual-color::
        When the commit diffs differ, recreate the original diffs'
        coloring, and add outer -/+ diff markers with the *background*
        being red/green to make it easier to see e.g. when there was a
        change in what exact lines were added.

--creation-factor=<percent>::
        Set the creation/deletion cost fudge factor to `<percent>`.
        Defaults to 60. Try a larger value if `git range-diff` erroneously
        considers a large change a total rewrite (deletion of one commit
        and addition of another), and a smaller one in the reverse case.
        See the ``Algorithm`` section below for an explanation of why
        this is needed.

<range1> <range2>::
        Compare the commits specified by the two ranges, where
        `<range1>` is considered an older version of `<range2>`.

<rev1>...<rev2>::
        Equivalent to passing `<rev2>..<rev1>` and `<rev1>..<rev2>`.

<base> <rev1> <rev2>::
        Equivalent to passing `<base>..<rev1>` and `<base>..<rev2>`.
        Note that `<base>` does not need to be the exact branch point
        of the branches. Example: after rebasing a branch `my-topic`,
        `git range-diff my-topic@{u} my-topic@{1} my-topic` would
        show the differences introduced by the rebase.
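
The equivalences between the three argument forms can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not Git's actual argument parser; the function name `to_ranges` is hypothetical, and the sketch does no validation of the two-range form.

```python
def to_ranges(args):
    """Map the three documented argument forms onto a pair of commit ranges.

    Hypothetical helper for illustration; it only mirrors the equivalences
    listed in the OPTIONS section.
    """
    if len(args) == 3:
        # <base> <rev1> <rev2>
        base, rev1, rev2 = args
        return f"{base}..{rev1}", f"{base}..{rev2}"
    if len(args) == 2:
        # <range1> <range2>: already two ranges, older version first
        return args[0], args[1]
    if len(args) == 1 and "..." in args[0]:
        # <rev1>...<rev2>
        rev1, rev2 = args[0].split("...", 1)
        return f"{rev2}..{rev1}", f"{rev1}..{rev2}"
    raise ValueError(
        "expected <range1> <range2>, <rev1>...<rev2>, or <base> <rev1> <rev2>"
    )
```

For the rebase example above, `to_ranges(["my-topic@{u}", "my-topic@{1}", "my-topic"])` yields the ranges `my-topic@{u}..my-topic@{1}` and `my-topic@{u}..my-topic`.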

`git range-diff` also accepts the regular diff options (see
linkgit:git-diff[1]), most notably the `--color=[<when>]` and
`--no-color` options. These options are used when generating the "diff
between patches", i.e. to compare the author, commit message and diff of
corresponding old/new commits. There is currently no means to tweak the
diff options passed to `git log` when generating those patches.


CONFIGURATION
-------------
This command uses the `diff.color.*` and `pager.range-diff` settings
(the latter is on by default).
See linkgit:git-config[1].


EXAMPLES
--------

When a rebase required merge conflicts to be resolved, compare the changes
introduced by the rebase directly afterwards using:

------------
$ git range-diff @{u} @{1} @
------------


A typical output of `git range-diff` would look like this:

------------
-:  ------- > 1:  0ddba11 Prepare for the inevitable!
1:  c0debee = 2:  cab005e Add a helpful message at the start
2:  f00dbal ! 3:  decafe1 Describe a bug
    @@ -1,3 +1,3 @@
     Author: A U Thor <author@example.com>

    -TODO: Describe a bug
    +Describe a bug
    @@ -324,5 +324,6
      This is expected.

    -+What is unexpected is that it will also crash.
    ++Unexpectedly, it also crashes. This is a bug, and the jury is
    ++still out there how to fix it best. See ticket #314 for details.

      Contact
3:  bedead < -:  ------- TO-UNDO
------------

In this example, there are 3 old and 3 new commits, where the developer
removed the 3rd, added a new one before the first two, and modified the
commit message of the 2nd commit as well as its diff.
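
To make the structure of the summary lines concrete, here is a minimal Python sketch that classifies them. The line layout and the meaning of the `>`, `=`, `!` and `<` markers are inferred solely from the sample output above, so treat the regular expression as an assumption rather than a specification; `parse_summary_line` and `MEANING` are hypothetical names.

```python
import re

# Layout assumed from the sample output:
#   <old index>: <old id> <marker> <new index>: <new id> <subject>
SUMMARY_LINE = re.compile(
    r"^\s*(-|\d+):\s+(\S+)\s+([=!<>])\s+(-|\d+):\s+(\S+)\s+(.*)$"
)

# Marker meanings as described in the paragraph above the sketch.
MEANING = {"=": "unchanged", "!": "modified", "<": "removed", ">": "added"}

def parse_summary_line(line):
    """Return a dict for a commit-pair line, or None for any other line."""
    m = SUMMARY_LINE.match(line)
    if m is None:
        return None
    old_idx, _old_id, marker, new_idx, _new_id, subject = m.groups()
    return {
        "old": None if old_idx == "-" else int(old_idx),
        "new": None if new_idx == "-" else int(new_idx),
        "status": MEANING[marker],
        "subject": subject,
    }
```

Lines belonging to the nested diff, such as the `@@` hunk headers, do not match the pattern and yield `None`.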

When the output goes to a terminal, it is color-coded by default, just
like regular `git diff`'s output. In addition, the first line (adding a
commit) is green, the last line (deleting a commit) is red, the second
line (with a perfect match) is yellow like the commit header of `git
show`'s output, and the third line colors the old commit red, the new
one green and the rest like `git show`'s commit header.

The color-coded diff is actually a bit hard to read, though, as it
colors entire lines red or green. The line that added "What is
unexpected" in the old commit, for example, is completely red, even if
the intent of the old commit was to add something.

To help with that, use the `--dual-color` mode. In this mode, the diff
of diffs will retain the original diff colors, and prefix the lines with
-/+ markers that have their *background* red or green, to make it more
obvious that they describe how the diff itself changed.



Algorithm
---------

The general idea is this: we generate a cost matrix between the commits
in both commit ranges, then solve the least-cost assignment.

The cost matrix is populated as follows: for each pair of commits, both
diffs are generated, then the "diff of diffs" is generated with 3 context
lines, and the number of lines in that diff is used as the cost.
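
A minimal Python sketch of this per-pair cost follows. It uses the standard library's `difflib` as a stand-in for Git's own diff machinery, so the exact numbers will differ from what Git computes; `diff_lines`, `pair_cost` and the sample patches are hypothetical.

```python
import difflib

def diff_lines(old_text, new_text, context=3):
    """Return the unified diff of two texts as a list of lines."""
    diff = difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(), n=context, lineterm=""
    )
    # The first two lines are the "---"/"+++" file header; an empty diff
    # produces no lines at all, so slicing is safe in both cases.
    return list(diff)[2:]

def pair_cost(old_patch, new_patch):
    """Cost of pairing two patches: the number of lines in their diff of diffs."""
    return len(diff_lines(old_patch, new_patch))

# Made-up patches: the second version only fixes a typo in an added line.
patch_v1 = "Fix the frobnicator\n\n@@ -1,2 +1,3 @@\n context\n+teh new line\n context\n"
patch_v2 = patch_v1.replace("teh", "the")

print(pair_cost(patch_v1, patch_v1))  # identical patches cost nothing
print(pair_cost(patch_v1, patch_v2))  # a small change gives a small positive cost
```

Evaluating `pair_cost` for every old/new combination fills in the part of the matrix that relates real commits to each other.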

To avoid false positives (e.g. when a patch has been removed, and an
unrelated patch has been added between two iterations of the same patch
series), the cost matrix is extended to allow for that, by adding
fixed-cost entries for wholesale deletes/adds.

Example: Let commits `1--2` be the first iteration of a patch series and
`A--C` the second iteration. Let's assume that `A` is a cherry-pick of
`2`, and `C` is a cherry-pick of `1` but with a small modification (say,
a fixed typo). Visualize the commits as a bipartite graph:

------------
    1            A

    2            B

                 C
------------

We are looking for a "best" explanation of the new series in terms of
the old one. We can represent an "explanation" as an edge in the graph:


------------
    1            A
               /
    2 --------'  B

                 C
------------

This explanation comes for "free" because there was no change. Similarly,
`C` could be explained using `1`, but that comes at some cost c>0
because of the modification:

------------
    1 ----.      A
          |    /
    2 ----+---'  B
          |
          `----- C
          c>0
------------

In mathematical terms, what we are looking for is some sort of a minimum
cost bipartite matching; `1` is matched to `C` at some cost, etc. The
underlying graph is in fact a complete bipartite graph; the cost we
associate with every edge is the size of the diff between the two
commits' patches. To also explain new commits, we introduce dummy nodes
on both sides:

------------
    1 ----.      A
          |    /
    2 ----+---'  B
          |
    o     `----- C
          c>0
    o            o

    o            o
------------

The cost of an edge `o--C` is the size of `C`'s diff, modified by a
fudge factor that should be smaller than 100%. An edge `o--o` is free.
The fudge factor is necessary because even if `1` and `C` have nothing
in common, they may still share a few empty lines and such, possibly
making the assignment `1--C`, `o--o` slightly cheaper than `1--o`,
`o--C`. With the fudge factor we require a much larger common part to
consider patches as corresponding.
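
The extension with dummy nodes can be sketched as follows. This is an illustration of the description above under stated assumptions, not Git's implementation: the function name, the integer rounding, and all sample numbers are made up. Rows stand for old commits followed by dummies, columns for new commits followed by dummies.

```python
def extended_cost_matrix(pair_cost, old_sizes, new_sizes, creation_factor=60):
    """Build the square cost matrix including dummy nodes.

    pair_cost[i][j] is the diff-of-diffs size for old commit i and new
    commit j; old_sizes and new_sizes hold the size of each commit's diff.
    """
    n_old, n_new = len(old_sizes), len(new_sizes)
    n = n_old + n_new
    # Start from all zeros, so dummy-to-dummy edges are free.
    cost = [[0] * n for _ in range(n)]
    for i in range(n_old):
        for j in range(n_new):
            cost[i][j] = pair_cost[i][j]
        # Old commit matched to a dummy: the commit was dropped.
        for j in range(n_new, n):
            cost[i][j] = old_sizes[i] * creation_factor // 100
    for j in range(n_new):
        # Dummy matched to a new commit: the commit was newly created.
        for i in range(n_old, n):
            cost[i][j] = new_sizes[j] * creation_factor // 100
    return cost

# Made-up numbers for the example: old commits 1, 2 and new commits A, B, C.
pairs = [[40, 40, 4],   # 1 vs A, B, C (C is 1 with a typo fixed)
         [0, 40, 40]]   # 2 vs A, B, C (A is a cherry-pick of 2)
matrix = extended_cost_matrix(pairs, old_sizes=[20, 20], new_sizes=[20, 20, 20])
```

With a factor of 60, creating or dropping a 20-line patch costs 12, so a pairing is preferred only when the diff of diffs is smaller than the combined cost of one deletion and one creation.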

The overall time needed to run this algorithm is the time needed to
compute n+m commit diffs and then n*m diffs of patches, plus the time
needed to compute the least-cost assignment between n and m diffs. Git
uses an implementation of the Jonker-Volgenant algorithm to solve the
assignment problem, which has cubic runtime complexity. The matching
found in this case will look like this:

------------
    1 ----.      A
          |    /
    2 ----+---'  B
       .--+-----'
    o -'  `----- C
          c>0
    o ---------- o

    o ---------- o
------------
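
The matching above can be reproduced with a small Python sketch. To stay self-contained it tries every permutation instead of implementing the Jonker-Volgenant algorithm, which is only practical for tiny inputs; the cost values are made up for illustration (pairing costs of 0, 4 and 40, and a creation/deletion cost of 12).

```python
from itertools import permutations

def solve_assignment(cost):
    """Return, for each row, the column of a least-cost assignment.

    Brute force over all permutations: a stand-in for a real assignment
    solver, usable only for very small matrices.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda cols: sum(cost[i][cols[i]] for i in range(n)),
    )
    return list(best)

old = ["1", "2"]
new = ["A", "B", "C"]
# Rows: 1, 2, then three dummies. Columns: A, B, C, then two dummies.
cost = [
    [40, 40,  4, 12, 12],
    [ 0, 40, 40, 12, 12],
    [12, 12, 12,  0,  0],
    [12, 12, 12,  0,  0],
    [12, 12, 12,  0,  0],
]

assignment = solve_assignment(cost)
matched = {old[i]: new[j] for i, j in enumerate(assignment)
           if i < len(old) and j < len(new)}
added = [new[j] for i, j in enumerate(assignment)
         if i >= len(old) and j < len(new)]
deleted = [old[i] for i, j in enumerate(assignment)
           if i < len(old) and j >= len(new)]
print(matched, added, deleted)
```

With these numbers, `1` is paired with `C`, `2` with `A`, and `B` is explained by a dummy node, i.e. reported as a newly added commit.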


SEE ALSO
--------
linkgit:git-log[1]

GIT
---
Part of the linkgit:git[1] suite