   1gittutorial-2(7)
   2================
   3
   4NAME
   5----
   6gittutorial-2 - A tutorial introduction to git: part two
   7
   8SYNOPSIS
   9--------
  10git *
  11
  12DESCRIPTION
  13-----------
  14
You should work through linkgit:gittutorial[7] before reading this
tutorial.
  17
  18The goal of this tutorial is to introduce two fundamental pieces of
  19git's architecture--the object database and the index file--and to
  20provide the reader with everything necessary to understand the rest
  21of the git documentation.
  22
  23The git object database
  24-----------------------
  25
  26Let's start a new project and create a small amount of history:
  27
  28------------------------------------------------
  29$ mkdir test-project
  30$ cd test-project
  31$ git init
  32Initialized empty Git repository in .git/
  33$ echo 'hello world' > file.txt
  34$ git add .
  35$ git commit -a -m "initial commit"
  36Created initial commit 54196cc2703dc165cbd373a65a4dcf22d50ae7f7
  37 create mode 100644 file.txt
  38$ echo 'hello world!' >file.txt
  39$ git commit -a -m "add emphasis"
  40Created commit c4d59f390b9cfd4318117afde11d601c1085f241
  41------------------------------------------------
  42
What are the 40 hex digits that git printed in response to each commit?
  44
  45We saw in part one of the tutorial that commits have names like this.
  46It turns out that every object in the git history is stored under
  47such a 40-digit hex name.  That name is the SHA1 hash of the object's
  48contents; among other things, this ensures that git will never store
  49the same data twice (since identical data is given an identical SHA1
  50name), and that the contents of a git object will never change (since
  51that would change the object's name as well).
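
As a rough illustration of the naming rule above, "git hash-object
--stdin" prints the name git would give to a blob without storing
anything.  This sketch assumes a git build that uses SHA1 object names;
the variable names are made up for the example.

```shell
# Hash the same content twice; nothing is written to any repository.
first=$(echo 'hello world' | git hash-object --stdin)
second=$(echo 'hello world' | git hash-object --stdin)

# A one-character change produces an unrelated name.
changed=$(echo 'hello world!' | git hash-object --stdin)

echo "same content:    $first"
echo "same content:    $second"
echo "changed content: $changed"
```

The first two names should be identical and the third different, which
is why identical data is only ever stored once.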
  52
Expect the commit objects you created while following the example
above to have different SHA1 hashes than the ones shown, because a
commit object records the time when it was created and the name of
the person performing the commit.
  57
  58We can ask git about this particular object with the cat-file
  59command. Don't copy the 40 hex digits from this example but use those
  60from your own version. Note that you can shorten it to only a few
  61characters to save yourself typing all 40 hex digits:
  62
  63------------------------------------------------
$ git cat-file -t 54196cc2
commit
$ git cat-file commit 54196cc2
  67tree 92b8b694ffb1675e5975148e1121810081dbdffe
  68author J. Bruce Fields <bfields@puzzle.fieldses.org> 1143414668 -0500
  69committer J. Bruce Fields <bfields@puzzle.fieldses.org> 1143414668 -0500
  70
  71initial commit
  72------------------------------------------------
  73
  74A tree can refer to one or more "blob" objects, each corresponding to
  75a file.  In addition, a tree can also refer to other tree objects,
  76thus creating a directory hierarchy.  You can examine the contents of
  77any tree using ls-tree (remember that a long enough initial portion
  78of the SHA1 will also work):
  79
  80------------------------------------------------
  81$ git ls-tree 92b8b694
  82100644 blob 3b18e512dba79e4c8300dd08aeb37f8e728b8dad    file.txt
  83------------------------------------------------
  84
  85Thus we see that this tree has one file in it.  The SHA1 hash is a
  86reference to that file's data:
  87
  88------------------------------------------------
  89$ git cat-file -t 3b18e512
  90blob
  91------------------------------------------------
  92
  93A "blob" is just file data, which we can also examine with cat-file:
  94
  95------------------------------------------------
  96$ git cat-file blob 3b18e512
  97hello world
  98------------------------------------------------
  99
Note that this is the old file data: the tree we have been examining
is the one named by the initial commit, so it holds a snapshot of the
directory state that was recorded by the first commit.
 103
 104All of these objects are stored under their SHA1 names inside the git
 105directory:
 106
 107------------------------------------------------
 108$ find .git/objects/
 109.git/objects/
 110.git/objects/pack
 111.git/objects/info
 112.git/objects/3b
 113.git/objects/3b/18e512dba79e4c8300dd08aeb37f8e728b8dad
 114.git/objects/92
 115.git/objects/92/b8b694ffb1675e5975148e1121810081dbdffe
 116.git/objects/54
 117.git/objects/54/196cc2703dc165cbd373a65a4dcf22d50ae7f7
 118.git/objects/a0
 119.git/objects/a0/423896973644771497bdc03eb99d5281615b51
 120.git/objects/d0
 121.git/objects/d0/492b368b66bdabf2ac1fd8c92b39d3db916e59
 122.git/objects/c4
 123.git/objects/c4/d59f390b9cfd4318117afde11d601c1085f241
 124------------------------------------------------
 125
and the contents of these files are just the compressed data plus a
header identifying their length and their type.  The type is either a
blob, a tree, a commit, or a tag.
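
To make the role of the header concrete, the following sketch
recomputes a blob's name by hand.  It assumes the "sha1sum" utility
from GNU coreutils is available (other systems may offer "shasum"
instead), and it relies on the name being the SHA1 of the uncompressed
header followed by the contents.

```shell
# The header is "<type> <length in bytes>" followed by a NUL byte.
# 'hello world' plus its trailing newline is 12 bytes long.
name=$(printf 'blob 12\0hello world\n' | sha1sum | cut -d' ' -f1)
echo "$name"
```

The printed name should match the blob 3b18e512... that ls-tree
reported for the first version of file.txt.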
 129
 130The simplest commit to find is the HEAD commit, which we can find
 131from .git/HEAD:
 132
 133------------------------------------------------
 134$ cat .git/HEAD
 135ref: refs/heads/master
 136------------------------------------------------
 137
 138As you can see, this tells us which branch we're currently on, and it
 139tells us this by naming a file under the .git directory, which itself
 140contains a SHA1 name referring to a commit object, which we can
 141examine with cat-file:
 142
 143------------------------------------------------
 144$ cat .git/refs/heads/master
 145c4d59f390b9cfd4318117afde11d601c1085f241
 146$ git cat-file -t c4d59f39
 147commit
 148$ git cat-file commit c4d59f39
 149tree d0492b368b66bdabf2ac1fd8c92b39d3db916e59
 150parent 54196cc2703dc165cbd373a65a4dcf22d50ae7f7
 151author J. Bruce Fields <bfields@puzzle.fieldses.org> 1143418702 -0500
 152committer J. Bruce Fields <bfields@puzzle.fieldses.org> 1143418702 -0500
 153
 154add emphasis
 155------------------------------------------------
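
The lookup above reads files under .git directly.  A minimal sketch of
the same steps using "git symbolic-ref" and "git rev-parse" follows;
these also work when refs are not stored as loose files.  The
throwaway repository and the identity "A U Thor" are assumptions made
only so the example is self-contained.

```shell
# Build a throwaway repository with a single commit.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'A U Thor'
git config user.email 'author@example.com'
git commit -q --allow-empty -m 'initial commit'

# HEAD names a branch; the branch names a commit.
branch_ref=$(git symbolic-ref HEAD)
branch_commit=$(git rev-parse "$branch_ref")
head_commit=$(git rev-parse HEAD)

echo "HEAD -> $branch_ref -> $branch_commit"
git cat-file -t "$head_commit"
```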
 156
 157The "tree" object here refers to the new state of the tree:
 158
 159------------------------------------------------
 160$ git ls-tree d0492b36
 161100644 blob a0423896973644771497bdc03eb99d5281615b51    file.txt
 162$ git cat-file blob a0423896
 163hello world!
 164------------------------------------------------
 165
 166and the "parent" object refers to the previous commit:
 167
 168------------------------------------------------
$ git cat-file commit 54196cc2
 170tree 92b8b694ffb1675e5975148e1121810081dbdffe
 171author J. Bruce Fields <bfields@puzzle.fieldses.org> 1143414668 -0500
 172committer J. Bruce Fields <bfields@puzzle.fieldses.org> 1143414668 -0500
 173
 174initial commit
 175------------------------------------------------
 176
 177The tree object is the tree we examined first, and this commit is
 178unusual in that it lacks any parent.
 179
 180Most commits have only one parent, but it is also common for a commit
 181to have multiple parents.   In that case the commit represents a
 182merge, with the parent references pointing to the heads of the merged
 183branches.
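
As a sketch of what a merge looks like in the object database, the
following builds two diverging branches in a throwaway repository,
merges them, and counts the "parent" lines of the resulting commit.
The branch name "topic", the file names, and the identity are
hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'A U Thor'
git config user.email 'author@example.com'

# A common starting point.
echo base >base.txt
git add base.txt
git commit -q -m 'base'
main_branch=$(git symbolic-ref --short HEAD)

# One commit on a side branch ...
git checkout -q -b topic
echo topic >topic.txt
git add topic.txt
git commit -q -m 'topic work'

# ... and one on the original branch, so the histories diverge.
git checkout -q "$main_branch"
echo main >main.txt
git add main.txt
git commit -q -m 'mainline work'

# The merge commit records both branch heads as parents.
git merge -q --no-edit topic
parent_count=$(git cat-file commit HEAD | grep -c '^parent ')
echo "parents: $parent_count"
```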
 184
 185Besides blobs, trees, and commits, the only remaining type of object
 186is a "tag", which we won't discuss here; refer to linkgit:git-tag[1]
 187for details.
 188
 189So now we know how git uses the object database to represent a
 190project's history:
 191
 192  * "commit" objects refer to "tree" objects representing the
 193    snapshot of a directory tree at a particular point in the
 194    history, and refer to "parent" commits to show how they're
 195    connected into the project history.
  * "tree" objects represent the state of a single directory,
    associating entry names with "blob" objects containing file
    data and "tree" objects containing subdirectory information.
 199  * "blob" objects contain file data without any other structure.
 200  * References to commit objects at the head of each branch are
 201    stored in files under .git/refs/heads/.
 202  * The name of the current branch is stored in .git/HEAD.
 203
 204Note, by the way, that lots of commands take a tree as an argument.
 205But as we can see above, a tree can be referred to in many different
 206ways--by the SHA1 name for that tree, by the name of a commit that
 207refers to the tree, by the name of a branch whose head refers to that
 208tree, etc.--and most such commands can accept any of these names.
 209
 210In command synopses, the word "tree-ish" is sometimes used to
 211designate such an argument.
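
A small sketch of this, assuming a throwaway repository with one
commit: the same tree is listed by its own name, by the commit that
refers to it, and by the branch whose head is that commit.  The
"^{tree}" suffix asks "git rev-parse" for the tree behind a commit.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'A U Thor'
git config user.email 'author@example.com'
echo 'hello world' >file.txt
git add file.txt
git commit -q -m 'initial commit'

tree=$(git rev-parse 'HEAD^{tree}')
branch=$(git symbolic-ref --short HEAD)

# Three different names, one tree.
by_tree=$(git ls-tree "$tree")
by_commit=$(git ls-tree HEAD)
by_branch=$(git ls-tree "$branch")
echo "$by_tree"
```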
 212
 213The index file
 214--------------
 215
 216The primary tool we've been using to create commits is "git commit
 217-a", which creates a commit including every change you've made to
 218your working tree.  But what if you want to commit changes only to
 219certain files?  Or only certain changes to certain files?
 220
If we look at the way commits are created under the covers, we'll see
that there are more flexible ways of creating commits.
 223
 224Continuing with our test-project, let's modify file.txt again:
 225
 226------------------------------------------------
 227$ echo "hello world, again" >>file.txt
 228------------------------------------------------
 229
 230but this time instead of immediately making the commit, let's take an
 231intermediate step, and ask for diffs along the way to keep track of
 232what's happening:
 233
 234------------------------------------------------
 235$ git diff
 236--- a/file.txt
 237+++ b/file.txt
 238@@ -1 +1,2 @@
 239 hello world!
 240+hello world, again
 241$ git add file.txt
 242$ git diff
 243------------------------------------------------
 244
 245The last diff is empty, but no new commits have been made, and the
 246head still doesn't contain the new line:
 247
 248------------------------------------------------
$ git diff HEAD
 250diff --git a/file.txt b/file.txt
 251index a042389..513feba 100644
 252--- a/file.txt
 253+++ b/file.txt
 254@@ -1 +1,2 @@
 255 hello world!
 256+hello world, again
 257------------------------------------------------
 258
 259So "git diff" is comparing against something other than the head.
 260The thing that it's comparing against is actually the index file,
 261which is stored in .git/index in a binary format, but whose contents
 262we can examine with ls-files:
 263
 264------------------------------------------------
 265$ git ls-files --stage
 266100644 513feba2e53ebbd2532419ded848ba19de88ba00 0       file.txt
 267$ git cat-file -t 513feba2
 268blob
 269$ git cat-file blob 513feba2
 270hello world!
 271hello world, again
 272------------------------------------------------
 273
 274So what our "git add" did was store a new blob and then put
 275a reference to it in the index file.  If we modify the file again,
we'll see that the new modifications are reflected in the "git diff"
output:
 278
 279------------------------------------------------
 280$ echo 'again?' >>file.txt
 281$ git diff
diff --git a/file.txt b/file.txt
index 513feba..ba3da7b 100644
 283--- a/file.txt
 284+++ b/file.txt
 285@@ -1,2 +1,3 @@
 286 hello world!
 287 hello world, again
 288+again?
 289------------------------------------------------
 290
 291With the right arguments, git diff can also show us the difference
 292between the working directory and the last commit, or between the
 293index and the last commit:
 294
 295------------------------------------------------
 296$ git diff HEAD
 297diff --git a/file.txt b/file.txt
 298index a042389..ba3da7b 100644
 299--- a/file.txt
 300+++ b/file.txt
 301@@ -1 +1,3 @@
 302 hello world!
 303+hello world, again
 304+again?
 305$ git diff --cached
 306diff --git a/file.txt b/file.txt
 307index a042389..513feba 100644
 308--- a/file.txt
 309+++ b/file.txt
 310@@ -1 +1,2 @@
 311 hello world!
 312+hello world, again
 313------------------------------------------------
 314
 315At any time, we can create a new commit using "git commit" (without
 316the -a option), and verify that the state committed only includes the
 317changes stored in the index file, not the additional change that is
 318still only in our working tree:
 319
 320------------------------------------------------
 321$ git commit -m "repeat"
 322$ git diff HEAD
 323diff --git a/file.txt b/file.txt
 324index 513feba..ba3da7b 100644
 325--- a/file.txt
 326+++ b/file.txt
 327@@ -1,2 +1,3 @@
 328 hello world!
 329 hello world, again
 330+again?
 331------------------------------------------------
 332
 333So by default "git commit" uses the index to create the commit, not
 334the working tree; the -a option to commit tells it to first update
 335the index with all changes in the working tree.
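
The difference between the two forms of commit can be sketched in a
throwaway repository.  The file contents mirror the example above; the
identity and the commit message "everything" are made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'A U Thor'
git config user.email 'author@example.com'
echo 'hello world!' >file.txt
git add file.txt
git commit -q -m 'add emphasis'

# Stage one change, then make another that stays unstaged.
echo 'hello world, again' >>file.txt
git add file.txt
echo 'again?' >>file.txt

# Plain "git commit" records only what is in the index.
git commit -q -m 'repeat'
committed=$(git cat-file blob HEAD:file.txt)
leftover=$(git diff --name-only HEAD)

# "git commit -a" first updates the index from the working tree.
git commit -q -a -m 'everything'
after_all=$(git diff --name-only HEAD)

echo "$committed"
```

After the first commit file.txt still differs from HEAD; after the
second nothing does.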
 336
 337Finally, it's worth looking at the effect of "git add" on the index
 338file:
 339
 340------------------------------------------------
 341$ echo "goodbye, world" >closing.txt
 342$ git add closing.txt
 343------------------------------------------------
 344
 345The effect of the "git add" was to add one entry to the index file:
 346
 347------------------------------------------------
 348$ git ls-files --stage
 349100644 8b9743b20d4b15be3955fc8d5cd2b09cd2336138 0       closing.txt
 350100644 513feba2e53ebbd2532419ded848ba19de88ba00 0       file.txt
 351------------------------------------------------
 352
 353And, as you can see with cat-file, this new entry refers to the
 354current contents of the file:
 355
 356------------------------------------------------
 357$ git cat-file blob 8b9743b2
 358goodbye, world
 359------------------------------------------------
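
One way to check this without reading the blob back is to compare the
name recorded in the index with the name "git hash-object" computes
for the file on disk.  This is a sketch in a throwaway repository; no
commit is needed because only the index is involved.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
echo 'goodbye, world' >closing.txt
git add closing.txt

# Second field of "ls-files --stage" is the blob name in the index.
index_name=$(git ls-files --stage closing.txt | cut -d' ' -f2)
file_name=$(git hash-object closing.txt)

echo "index: $index_name"
echo "file:  $file_name"
```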
 360
 361The "status" command is a useful way to get a quick summary of the
 362situation:
 363
 364------------------------------------------------
 365$ git status
 366# On branch master
 367# Changes to be committed:
 368#   (use "git reset HEAD <file>..." to unstage)
 369#
 370#       new file: closing.txt
 371#
 372# Changed but not updated:
 373#   (use "git add <file>..." to update what will be committed)
 374#
 375#       modified: file.txt
 376#
 377------------------------------------------------
 378
 379Since the current state of closing.txt is cached in the index file,
 380it is listed as "Changes to be committed".  Since file.txt has
 381changes in the working directory that aren't reflected in the index,
 382it is marked "changed but not updated".  At this point, running "git
 383commit" would create a commit that added closing.txt (with its new
 384contents), but that didn't modify file.txt.
 385
 386Also, note that a bare "git diff" shows the changes to file.txt, but
 387not the addition of closing.txt, because the version of closing.txt
 388in the index file is identical to the one in the working directory.
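
The same split can be seen by asking each diff for file names only.
The following sketch recreates the situation in a throwaway
repository; the identity is hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'A U Thor'
git config user.email 'author@example.com'
echo 'hello world!' >file.txt
git add file.txt
git commit -q -m 'add emphasis'

# One unstaged modification and one staged new file.
echo 'again?' >>file.txt
echo 'goodbye, world' >closing.txt
git add closing.txt

# Working tree versus index, then index versus last commit.
unstaged=$(git diff --name-only)
staged=$(git diff --cached --name-only)
echo "unstaged: $unstaged"
echo "staged:   $staged"
```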
 389
 390In addition to being the staging area for new commits, the index file
 391is also populated from the object database when checking out a
 392branch, and is used to hold the trees involved in a merge operation.
See linkgit:gitcore-tutorial[7] and the relevant man pages for
details.
 395
 396What next?
 397----------
 398
 399At this point you should know everything necessary to read the man
 400pages for any of the git commands; one good place to start would be
 401with the commands mentioned in link:everyday.html[Everyday git].  You
should be able to find any unknown jargon in
linkgit:gitglossary[7].
 404
 405The link:user-manual.html[Git User's Manual] provides a more
 406comprehensive introduction to git.
 407
linkgit:gitcvs-migration[7] explains how to import a CVS repository
into git, and shows how to use git in a CVS-like way.
 411
 412For some interesting examples of git use, see the
 413link:howto-index.html[howtos].
 414
For git developers, linkgit:gitcore-tutorial[7] goes into detail on
the lower-level git mechanisms involved in, for example, creating a
new commit.
 418
 419SEE ALSO
 420--------
 421linkgit:gittutorial[7],
 422linkgit:gitcvs-migration[7],
 423linkgit:gitcore-tutorial[7],
 424linkgit:gitglossary[7],
 425link:everyday.html[Everyday git],
 426link:user-manual.html[The Git User's Manual]
 427
 428GIT
 429---
 430Part of the linkgit:git[7] suite.