From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 07 May 2014 13:15:39 -0700
Subject: Beginner question on "Pull is mostly evil"
Abstract: This how-to explains a method for keeping a
 project's history correct when using git pull.
Content-type: text/asciidoc

Keep authoritative canonical history correct with git pull
==========================================================

Sometimes a new project integrator will end up with project history
that appears to be "backwards" from what other project developers
expect. This howto presents a suggested integration workflow for
maintaining a central repository.

Suppose that the central repository has this history:

------------
    ---o---o---A
------------

which ends at commit `A` (time flows from left to right and each node
in the graph is a commit, with the lines between them indicating
parent-child relationships).

Then you clone it and work on your own commits, which leads you to
have this history in *your* repository:

------------
    ---o---o---A---B---C
------------

Imagine your coworker did the same and built on top of `A` in *his*
repository in the meantime, and then pushed it to the
central repository:

------------
    ---o---o---A---X---Y---Z
------------

Now, if you `git push` at this point, because your history that leads
to `C` lacks `X`, `Y` and `Z`, it will fail.  You need to somehow make
the tip of your history a descendant of `Z`.

One suggested way to solve the problem is "fetch and then merge", aka
`git pull`. When you fetch, your repository will have a history like
this:

------------
    ---o---o---A---B---C
                \
                 X---Y---Z
------------

Once you run merge after that, while still on *your* branch, i.e. `C`,
you will create a merge `M` and make the history look like this:

------------
    ---o---o---A---B---C---M
                \         /
                 X---Y---Z
------------

`M` is a descendant of `Z`, so you can push to update the central
repository.  Such a merge `M` does not lose any commit from either
history, so in that sense it may not be wrong, but when people want
to talk about "the authoritative canonical history that is shared
among the project participants", i.e. "the trunk", they often view
it as "commits you see by following the first-parent chain", and use
this command to view it:

------------
    $ git log --first-parent
------------

For all other people who observed the central repository after your
coworker pushed `Z` but before you pushed `M`, the commits on the trunk
used to be `o-o-A-X-Y-Z`.  But because you made `M` while you were on
`C`, `M`'s first parent is `C`, so by pushing `M` to advance the
central repository, you made `X-Y-Z` a side branch, not on the trunk.
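
The effect can be reproduced in a throwaway repository.  What follows
is only a sketch and not part of the original discussion: the `commit`
helper, the `upstream` branch that stands in for the tip of the central
repository, and the one-file-per-commit layout are all assumptions made
for illustration.

```shell
set -e
# Placeholder identity for the scratch commits (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

scratch=$(mktemp -d) && cd "$scratch"
git init -q .
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: one commit per call, each adding its own file.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A
git branch upstream              # stands in for the central repository's tip
commit B; commit C               # your work
git checkout -q upstream
commit X; commit Y; commit Z     # your coworker's work
git checkout -q master           # back on C
git merge -q -m M upstream       # the merge that `git pull` would create

# Subjects along the first-parent chain, newest first.
trunk=$(git log --first-parent --format=%s | paste -sd' ' -)
echo "$trunk"
```

The script prints `M C B A`: the commits `X`, `Y` and `Z` are still
reachable from `M`, but they no longer appear on the first-parent chain.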

You would rather have a history of this shape:

------------
    ---o---o---A---X---Y---Z---M'
                \             /
                 B-----------C
------------

so that in the first-parent chain, it is clear that the project first
did `X` and then `Y` and then `Z` and merged a change that consists of
two commits `B` and `C` that achieves a single goal.  You may have
worked on fixing the bug #12345 with these two patches, and the merge
`M'` with swapped parents can say in its log message "Merge
fix-bug-12345". Having a way to tell `git pull` to create a merge
but record the parents in reverse order may be a way to do so.
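
Without relying on any such option, a history of the same shape can be
built by hand: stand on the upstream tip and merge your own work into
it.  The sketch below does this in a scratch repository.  The
`fix-bug-12345` branch name, the `commit` helper, and the use of the
local `master` to stand in for the already-updated central history are
assumptions made for illustration.

```shell
set -e
# Placeholder identity for the scratch commits (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

scratch=$(mktemp -d) && cd "$scratch"
git init -q .
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: one commit per call, each adding its own file.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A
git checkout -q -b fix-bug-12345     # your two commits live on their own branch
commit B; commit C
git checkout -q master
commit X; commit Y; commit Z         # stands in for the fetched upstream history

# Merge while standing on Z, so that Z becomes the first parent of the merge.
git merge -q -m "Merge fix-bug-12345" fix-bug-12345

trunk=$(git log --first-parent --format=%s | paste -sd, -)
first_parent=$(git log -1 --format=%s HEAD^1)
second_parent=$(git log -1 --format=%s HEAD^2)
echo "$trunk"
```

This prints `Merge fix-bug-12345,Z,Y,X,A`.  The first parent of the
merge is `Z` and the second is `C`, which is the parent order of `M'`
in the picture above.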

Note that I said "achieves a single goal" above, because this is
important.  "Swapping the merge order" only covers a special case
where the project does not care too much about having unrelated
things done on a single merge but cares a lot about the first-parent
chain.

There are multiple schools of thought about "trunk" management.

 1. Some projects want to keep a completely linear history without any
    merges.  Obviously, swapping the merge order would not match their
    taste.  You would need to flatten your history on top of the
    updated upstream to result in a history of this shape instead:
+
------------
    ---o---o---A---X---Y---Z---B---C
------------
+
with `git pull --rebase` or a similar command.

 2. Some projects tolerate merges in their history, but do not worry
    too much about the first-parent order, and allow fast-forward
    merges.  To them, swapping the merge order does not hurt, but
    it is unnecessary.

 3. Some projects want each commit on the "trunk" to do one single
    thing.  The output of `git log --first-parent` in such a project
    would show either a merge of a side branch that completes a single
    theme, or a single commit that completes a single theme by itself.
    If your two commits `B` and `C` (or they may even be two groups of
    commits) were solving two independent issues, then the merge `M'`
    we made in the earlier example by swapping the merge order is
    still not up to the project standard.  It merges two unrelated
    efforts `B` and `C` at the same time.
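
For the first school, the flattening can be tried out end to end with
two scratch repositories.  This is only a sketch: the `central` and
`you` directory names and the `commit` helper are invented here, and
the coworker's push is simulated by committing directly in `central`.

```shell
set -e
# Placeholder identity for the scratch commits (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

scratch=$(mktemp -d) && cd "$scratch"

# Hypothetical helper: one commit per call, each adding its own file.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

git init -q central
cd central
git symbolic-ref HEAD refs/heads/master
commit A
cd ..

git clone -q central you
cd you
commit B; commit C                                    # your work on top of A

(cd ../central && commit X && commit Y && commit Z)   # upstream moves on

git pull -q --rebase                                  # replay B and C on top of Z

history=$(git log --format=%s | paste -sd, -)
merges=$(git rev-list --merges --count HEAD)
echo "$history ($merges merges)"
```

The output is `C,B,Z,Y,X,A (0 merges)`: your two commits now sit on top
of `Z` and the history contains no merge commit.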

For projects in the last category (Git itself is one of them),
individual developers would want to prepare a history more like
this:

------------
                 C0--C1--C2     topic-c
                /
    ---o---o---A                master
                \
                 B0--B1--B2     topic-b
------------

That is, keeping separate topics on separate branches, perhaps like
so:

------------
    $ git clone $URL work && cd work
    $ git checkout -b topic-b master
    $ ... work to create B0, B1 and B2 to complete one theme
    $ git checkout -b topic-c master
    $ ... same for the theme of topic-c
------------

And then

------------
    $ git checkout master
    $ git pull --ff-only
------------

would grab `X`, `Y` and `Z` from the upstream and advance your master
branch:

------------
                 C0--C1--C2     topic-c
                /
    ---o---o---A---X---Y---Z    master
                \
                 B0--B1--B2     topic-b
------------

And then you would merge these two branches separately:

------------
    $ git merge topic-b
    $ git merge topic-c
------------

to result in

------------
                 C0--C1---------C2
                /                 \
    ---o---o---A---X---Y---Z---M---N
                \             /
                 B0--B1-----B2
------------

and push it back to the central repository.

It is quite possible that while you are merging topic-b and
topic-c, somebody again advanced the history in the central repository
to put `W` on top of `Z`, making your `git push` fail.

In such a case, you would rewind to discard `M` and `N`, update the
tip of your 'master' again and redo the two merges:

------------
    $ git reset --hard origin/master
    $ git pull --ff-only
    $ git merge topic-b
    $ git merge topic-c
------------

The procedure will result in a history that looks like this:

------------
                 C0--C1--------------C2
                /                     \
    ---o---o---A---X---Y---Z---W---M'--N'
                \                 /
                 B0--B1---------B2
------------
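
The rewind-and-redo cycle can also be wrapped in a loop.  The sketch
below replays the whole scenario against a scratch bare repository.
The `integrate` and `commit` helpers, the directory names, the
single-commit topics and the three-attempt limit are assumptions made
for illustration, not part of the workflow described above.

```shell
set -e
# Placeholder identity for the scratch commits (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

scratch=$(mktemp -d) && cd "$scratch"

# Hypothetical helper: one commit per call, each adding its own file.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

# A bare "central" repository seeded with A, plus two clones of it.
git init -q seed
(cd seed && git symbolic-ref HEAD refs/heads/master && commit A)
git clone -q --bare seed central.git
git clone -q central.git you
git clone -q central.git coworker

# Hypothetical helper: rewind master to upstream, update it, redo both merges.
integrate() {
    git checkout -q master
    git reset -q --hard origin/master
    git pull -q --ff-only
    git merge -q -m "Merge topic-b" topic-b
    git merge -q -m "Merge topic-c" topic-c
}

cd you
git checkout -q -b topic-b master
commit B0
git checkout -q -b topic-c master
commit C0

(cd ../coworker && commit Z && git push -q origin master)
integrate                                   # builds M and N on top of Z
(cd ../coworker && commit W && git push -q origin master)

attempts=0
until git push -q origin master 2>/dev/null; do
    attempts=$((attempts + 1))
    if [ "$attempts" -gt 3 ]; then echo "giving up" >&2; exit 1; fi
    integrate                               # builds M' and N' on top of W
done

trunk=$(git log --first-parent --format=%s | paste -sd, -)
echo "$trunk after $attempts retry"
```

The first push is rejected because `W` arrived in the meantime, and the
script ends by printing `Merge topic-c,Merge topic-b,W,Z,A after 1 retry`.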

See also http://git-blame.blogspot.com/2013/09/fun-with-first-parent-history.html