A short git tutorial
====================
v0.99.5, Aug 2005

Introduction
------------

This is trying to be a short tutorial on setting up and using a git
repository, mainly because being hands-on and using explicit examples is
often the best way of explaining what is going on.

In normal life, most people wouldn't use the "core" git programs
directly, but rather script around them to make them more palatable.
Understanding the core git stuff may help some people get those scripts
done, though, and it may also be instructive in helping people
understand what it is that the higher-level helper scripts are actually
doing.

The core git is often called "plumbing", with the prettier user
interfaces on top of it called "porcelain". You may not want to use the
plumbing directly very often, but it can be good to know what the
plumbing does for when the porcelain isn't flushing...


Creating a git repository
-------------------------

Creating a new git repository couldn't be easier: all git repositories start
out empty, and the only thing you need to do is find yourself a
subdirectory that you want to use as a working tree - either an empty
one for a totally new project, or an existing working tree that you want
to import into git.

For our first example, we're going to start a totally new repository from
scratch, with no pre-existing files, and we'll call it `git-tutorial`.
To start up, create a subdirectory for it, change into that
subdirectory, and initialize the git infrastructure with `git-init-db`:

------------------------------------------------
mkdir git-tutorial
cd git-tutorial
git-init-db
------------------------------------------------

to which git will reply

    defaulting to local storage area

which is just git's way of saying that you haven't been doing anything
strange, and that it will have created a local `.git` directory setup for
your new project.
You will now have a `.git` directory, and you can
inspect that with `ls`. For your new empty project, it should show you
three entries, among other things:

 - a symlink called `HEAD`, pointing to `refs/heads/master`
+
Don't worry about the fact that the file that the `HEAD` link points to
doesn't even exist yet -- you haven't created the commit that will
start your `HEAD` development branch yet.

 - a subdirectory called `objects`, which will contain all the
   objects of your project. You should never have any real reason to
   look at the objects directly, but you might want to know that these
   objects are what contain all the real 'data' in your repository.

 - a subdirectory called `refs`, which contains references to objects.

In particular, the `refs` subdirectory will contain two other
subdirectories, named `heads` and `tags` respectively. They do
exactly what their names imply: they contain references to any number
of different 'heads' of development (aka 'branches'), and to any
'tags' that you have created to name specific versions in your
repository.

One note: the special `master` head is the default branch, which is
why the `.git/HEAD` file was created as a symlink to it even if it
doesn't yet exist. Basically, the `HEAD` link is supposed to always
point to the branch you are working on right now, and you always
start out expecting to work on the `master` branch.

However, this is only a convention, and you can name your branches
anything you want, and don't have to ever even 'have' a `master`
branch. A number of the git tools will assume that `.git/HEAD` is
valid, though.

[NOTE]
An 'object' is identified by its 160-bit SHA1 hash, aka 'object name',
and a reference to an object is always the 40-byte hex
representation of that SHA1 name.
The files in the `refs`
subdirectory are expected to contain these hex references
(usually with a final `\'\n\'` at the end), and you should thus
expect to see a number of 41-byte files containing these
references in these `refs` subdirectories when you actually start
populating your tree.

[NOTE]
An advanced user may want to take a look at the
link:repository-layout.html[repository layout] document
after finishing this tutorial.

You have now created your first git repository. Of course, since it's
empty, that's not very useful, so let's start populating it with data.


Populating a git repository
---------------------------

We'll keep this simple and stupid, so we'll start off with populating a
few trivial files just to get a feel for it.

Start off with just creating any random files that you want to maintain
in your git repository. We'll start off with a few bad examples, just to
get a feel for how this works:

------------------------------------------------
echo "Hello World" >hello
echo "Silly example" >example
------------------------------------------------

you have now created two files in your working tree (aka 'working directory'), but to
actually check in your hard work, you will have to go through two steps:

 - fill in the 'index' file (aka 'cache') with the information about your
   working tree state.

 - commit that index file as an object.

The first step is trivial: when you want to tell git about any changes
to your working tree, you use the `git-update-index` program. That
program normally just takes a list of filenames you want to update, but
to avoid trivial mistakes, it refuses to add new entries to the cache
(or remove existing ones) unless you explicitly tell it that you're
adding a new entry with the `\--add` flag (or removing an entry with the
`\--remove` flag).

So to populate the index with the two files you just created, you can do

------------------------------------------------
git-update-index --add hello example
------------------------------------------------

and you have now told git to track those two files.

In fact, as you did that, if you now look into your object directory,
you'll notice that git will have added two new objects to the object
database. If you did exactly the steps above, you should now be able to do

    ls .git/objects/??/*

and see two files:

    .git/objects/55/7db03de997c86a4a028e1ebd3a1ceb225be238
    .git/objects/f2/4c74a2e500f5ee1332c86b94199f52b1d1d962

which correspond to the objects named 557db... and f24c7...
respectively.

If you want to, you can use `git-cat-file` to look at those objects, but
you'll have to use the object name, not the filename of the object:

    git-cat-file -t 557db03de997c86a4a028e1ebd3a1ceb225be238

where the `-t` tells `git-cat-file` to tell you what the "type" of the
object is. Git will tell you that you have a "blob" object (i.e. just a
regular file), and you can see the contents with

    git-cat-file "blob" 557db03

which will print out "Hello World". The object 557db03 is nothing
more than the contents of your file `hello`.

[NOTE]
Don't confuse that object with the file `hello` itself. The
object is literally just those specific *contents* of the file, and
however much you later change the contents in file `hello`, the object
we just looked at will never change. Objects are immutable.

[NOTE]
The second example demonstrates that you can
abbreviate the object name to only the first several
hexadecimal digits in most places.

Anyway, as we mentioned previously, you normally never actually take a
look at the objects themselves, and typing long 40-character hex
names is not something you'd normally want to do.
The above digression
was just to show that `git-update-index` did something magical, and
actually saved away the contents of your files into the git object
database.

Updating the cache did something else too: it created a `.git/index`
file. This is the index that describes your current working tree, and
something you should be very aware of. Again, you normally never worry
about the index file itself, but you should be aware of the fact that
you have not actually really "checked in" your files into git so far,
you've only *told* git about them.

However, since git knows about them, you can now start using some of the
most basic git commands to manipulate the files or look at their status.

In particular, let's not even check in the two files into git yet, we'll
start off by adding another line to `hello` first:

------------------------------------------------
echo "It's a new day for git" >>hello
------------------------------------------------

and you can now, since you told git about the previous state of `hello`, ask
git what has changed in the tree compared to your old index, using the
`git-diff-files` command:

------------
git-diff-files
------------

Oops. That wasn't very readable. It just spit out its own internal
version of a `diff`, but that internal version really just tells you
that it has noticed that "hello" has been modified, and that the old object
contents it had have been replaced with something else.

To make it readable, we can tell git-diff-files to output the
differences as a patch, using the `-p` flag:

------------
git-diff-files -p
------------

which will spit out

------------
diff --git a/hello b/hello
--- a/hello
+++ b/hello
@@ -1 +1,2 @@
 Hello World
+It's a new day for git
------------

i.e. the diff of the change we caused by adding another line to `hello`.
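
The index-versus-working-tree comparison can be tried in isolation with a
small script. This is only a sketch under a few assumptions that are not
part of the tutorial: it runs in a throwaway directory from `mktemp -d`,
and it uses the `git <command>` spelling (`git init`, `git update-index`,
`git diff-files`) that newer git installations provide in place of the
dashed program names used in the text.

```shell
#!/bin/sh
# Sketch only: scratch directory and "git <command>" spelling are assumptions.
set -eu

repo=$(mktemp -d)                 # hypothetical throwaway location
cd "$repo"
git init -q

echo "Hello World" >hello
git update-index --add hello      # record the current contents in the index

echo "It's a new day for git" >>hello   # change the working tree only

# Names of files whose working-tree contents differ from the index.
changed=$(git diff-files --name-only)
echo "changed since the last index update: $changed"

# The blob recorded in the index is still the original one-line file.
indexed=$(git cat-file blob :hello)
echo "index still holds: $indexed"
```

The `:hello` argument names the index entry for `hello`, so the last
command reads the object that `git update-index` saved rather than the
file on disk.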

In other words, `git-diff-files` always shows us the difference between
what is recorded in the index, and what is currently in the working
tree. That's very useful.

A common shorthand for `git-diff-files -p` is to just write `git
diff`, which will do the same thing.


Committing git state
--------------------

Now, we want to go to the next stage in git, which is to take the files
that git knows about in the index, and commit them as a real tree. We do
that in two phases: creating a 'tree' object, and committing that 'tree'
object as a 'commit' object together with an explanation of what the
tree was all about, along with information of how we came to that state.

Creating a tree object is trivial, and is done with `git-write-tree`.
There are no options or other input: git-write-tree will take the
current index state, and write an object that describes that whole
index. In other words, we're now tying together all the different
filenames with their contents (and their permissions), and we're
creating the equivalent of a git "directory" object:

------------------------------------------------
git-write-tree
------------------------------------------------

and this will just output the name of the resulting tree, in this case
(if you have done exactly as I've described) it should be

    8988da15d077d4829fc51d8544c097def6644dbb

which is another incomprehensible object name. Again, if you want to,
you can use `git-cat-file -t 8988d\...` to see that this time the object
is not a "blob" object, but a "tree" object (you can also use
`git-cat-file` to actually output the raw object contents, but you'll see
mainly a binary mess, so that's less interesting).

However -- normally you'd never use `git-write-tree` on its own, because
normally you always commit a tree into a commit object using the
`git-commit-tree` command.
In fact, it's easier to not actually use
`git-write-tree` on its own at all, but to just pass its result in as an
argument to `git-commit-tree`.

`git-commit-tree` normally takes several arguments -- it wants to know
what the 'parent' of a commit was, but since this is the first commit
ever in this new repository, and it has no parents, we only need to pass in
the object name of the tree. However, `git-commit-tree`
also wants to get a commit message
on its standard input, and it will write out the resulting object name for the
commit to its standard output.

And this is where we start using the `.git/HEAD` file. The `HEAD` file is
supposed to contain the reference to the top-of-tree, and since that's
exactly what `git-commit-tree` spits out, we can do this all with a simple
shell pipeline:

------------------------------------------------
echo "Initial commit" | git-commit-tree $(git-write-tree) > .git/HEAD
------------------------------------------------

which will say:

    Committing initial tree 8988da15d077d4829fc51d8544c097def6644dbb

just to warn you about the fact that it created a totally new commit
that is not related to anything else. Normally you do this only *once*
for a project ever, and all later commits will be parented on top of an
earlier commit, and you'll never see this "Committing initial tree"
message ever again.

Again, normally you'd never actually do this by hand. There is a
helpful script called `git commit` that will do all of this for you. So
you could have just written `git commit`
instead, and it would have done the above magic scripting for you.


Making a change
---------------

Remember how we did the `git-update-index` on file `hello` and then we
changed `hello` afterward, and could compare the new state of `hello` with the
state we saved in the index file?
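
The two phases of the initial commit (index to tree, tree to commit) can
also be walked through one step at a time. The sketch below makes some
assumptions that go beyond this tutorial: it uses a scratch directory,
placeholder author and committer values, and the `git <command>` spelling
of newer git versions. It also records the new commit with
`git update-ref HEAD` instead of redirecting into `.git/HEAD`, because in
newer git `HEAD` is a small text file naming the branch rather than a
symlink, and overwriting it would lose that information.

```shell
#!/bin/sh
# Sketch only: scratch directory, identity values and "git <command>"
# spelling are assumptions, not part of the tutorial.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Placeholder identity so the commit can be created anywhere.
export GIT_AUTHOR_NAME="Tutorial User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Tutorial User" GIT_COMMITTER_EMAIL="user@example.com"

echo "Hello World" >hello
echo "Silly example" >example
git update-index --add hello example

tree=$(git write-tree)                                     # index -> tree object
commit=$(echo "Initial commit" | git commit-tree "$tree")  # tree -> commit object
git update-ref HEAD "$commit"                              # make it the branch tip

tree_type=$(git cat-file -t "$tree")
commit_type=$(git cat-file -t "$commit")
head=$(git rev-parse HEAD)

echo "$tree_type   $tree"
echo "$commit_type $commit"
```

The commit name differs from run to run because it covers the timestamp
and identity, while the tree name depends only on the file names, modes
and contents recorded in the index.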

Further, remember how I said that `git-write-tree` writes the contents
of the *index* file to the tree, and thus what we just committed was in
fact the *original* contents of the file `hello`, not the new ones. We did
that on purpose, to show the difference between the index state, and the
state in the working tree, and how they don't have to match, even
when we commit things.

As before, if we do `git-diff-files -p` in our git-tutorial project,
we'll still see the same difference we saw last time: the index file
hasn't changed by the act of committing anything. However, now that we
have committed something, we can also learn to use a new command:
`git-diff-index`.

Unlike `git-diff-files`, which showed the difference between the index
file and the working tree, `git-diff-index` shows the differences
between a committed *tree* and either the index file or the working
tree. In other words, `git-diff-index` wants a tree to be diffed
against, and before we did the commit, we couldn't do that, because we
didn't have anything to diff against.

But now we can do

    git-diff-index -p HEAD

(where `-p` has the same meaning as it did in `git-diff-files`), and it
will show us the same difference, but for a totally different reason.
Now we're comparing the working tree not against the index file,
but against the tree we just wrote. It just so happens that those two
are obviously the same, so we get the same result.

Again, because this is a common operation, you can also just shorthand
it with

    git diff HEAD

which ends up doing the above for you.

In other words, `git-diff-index` normally compares a tree against the
working tree, but when given the `\--cached` flag, it is told to
instead compare against just the index cache contents, and ignore the
current working tree state entirely.
Since we just wrote the index
file to HEAD, doing `git-diff-index \--cached -p HEAD` should thus return
an empty set of differences, and that's exactly what it does.

[NOTE]
================
`git-diff-index` really always uses the index for its
comparisons, and saying that it compares a tree against the working
tree is thus not strictly accurate. In particular, the list of
files to compare (the "meta-data") *always* comes from the index file,
regardless of whether the `\--cached` flag is used or not. The `\--cached`
flag really only determines whether the file *contents* to be compared
come from the working tree or not.

This is not hard to understand, as soon as you realize that git simply
never knows (or cares) about files that it is not told about
explicitly. Git will never go *looking* for files to compare, it
expects you to tell it what the files are, and that's what the index
is there for.
================

However, our next step is to commit the *change* we did, and again, to
understand what's going on, keep in mind the difference between "working
tree contents", "index file" and "committed tree". We have changes
in the working tree that we want to commit, and we always have to
work through the index file, so the first thing we need to do is to
update the index cache:

------------------------------------------------
git-update-index hello
------------------------------------------------

(note how we didn't need the `\--add` flag this time, since git knew
about the file already).

Note what happens to the different `git-diff-\*` versions here. After
we've updated `hello` in the index, `git-diff-files -p` now shows no
differences, but `git-diff-index -p HEAD` still *does* show that the
current state is different from the state we committed.
In fact, now
`git-diff-index` shows the same difference whether we use the `\--cached`
flag or not, since now the index is coherent with the working tree.

Now, since we've updated `hello` in the index, we can commit the new
version. We could do it by writing the tree by hand again, and
committing the tree (this time we'd have to use the `-p HEAD` flag to
tell commit that the HEAD was the *parent* of the new commit, and that
this wasn't an initial commit any more), but you've done that once
already, so let's just use the helpful script this time:

------------------------------------------------
git commit
------------------------------------------------

which starts an editor for you to write the commit message and tells you
a bit about what you have done.

Write whatever message you want, and all the lines that start with '#'
will be pruned out, and the rest will be used as the commit message for
the change. If you decide you don't want to commit anything after all at
this point (you can continue to edit things and update the cache), you
can just leave an empty message. Otherwise `git commit` will commit
the change for you.

You've now made your first real git commit. And if you're interested in
looking at what `git commit` really does, feel free to investigate:
it's a few very simple shell scripts to generate the helpful (?) commit
message headers, and a few one-liners that actually do the
commit itself (`git-commit`).


Inspecting Changes
------------------

While creating changes is useful, it's even more useful if you can tell
later what changed. The most useful command for this is another of the
`diff` family, namely `git-diff-tree`.

`git-diff-tree` can be given two arbitrary trees, and it will tell you the
differences between them.
Perhaps even more commonly, though, you can
give it just a single commit object, and it will figure out the parent
of that commit itself, and show the difference directly. Thus, to get
the same diff that we've already seen several times, we can now do

    git-diff-tree -p HEAD

(again, `-p` means to show the difference as a human-readable patch),
and it will show what the last commit (in `HEAD`) actually changed.

More interestingly, you can also give `git-diff-tree` the `-v` flag, which
tells it to also show the commit message and author and date of the
commit, and you can tell it to show a whole series of diffs.
Alternatively, you can tell it to be "silent", and not show the diffs at
all, but just show the actual commit message.

In fact, together with the `git-rev-list` program (which generates a
list of revisions), `git-diff-tree` ends up being a veritable fount of
changes. A trivial (but very useful) script called `git-whatchanged` is
included with git which does exactly this, and shows a log of recent
activities.

To see the whole history of our pitiful little git-tutorial project, you
can do

    git log

which shows just the log messages, or if we want to see the log together
with the associated patches use the more complex (and much more
powerful)

    git-whatchanged -p --root

and you will see exactly what has changed in the repository over its
short history.

[NOTE]
The `\--root` flag is a flag to `git-diff-tree` to tell it to
show the initial aka 'root' commit too. Normally you'd probably not
want to see the initial import diff, but since the tutorial project
was started from scratch and is so small, we use it to make the result
a bit more interesting.

With that, you should now have some inkling of what git does, and
can explore on your own.
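
As a rough illustration of how `git-rev-list` and `git-diff-tree` fit
together, here is a sketch that builds a two-commit history and then
inspects it. It assumes a scratch directory, placeholder identity values,
and the `git <command>` spelling of newer git versions; the commit
message "Add a line" is made up for the example.

```shell
#!/bin/sh
# Sketch only: scratch directory, identity values, commit messages and
# the "git <command>" spelling are assumptions.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

export GIT_AUTHOR_NAME="Tutorial User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Tutorial User" GIT_COMMITTER_EMAIL="user@example.com"

echo "Hello World" >hello
git update-index --add hello
git commit -q -m 'Initial commit'

echo "It's a new day for git" >>hello
git update-index hello
git commit -q -m 'Add a line'

# One line per commit reachable from HEAD, newest first.
count=$(git rev-list HEAD | wc -l | tr -d ' ')

# Files touched by the last commit, relative to its parent.
touched=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "$count commits; the last one touched: $touched"

# The same comparison, shown as a patch.
git diff-tree -p HEAD
```

Feeding each name printed by `git rev-list` to `git diff-tree` in turn is
the basic idea behind a log-with-patches view.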

[NOTE]
Most likely, you are not directly using the core
git Plumbing commands, but using Porcelain like Cogito on top
of it. Cogito works a bit differently and you usually do not
have to run `git-update-index` yourself for changed files (you
do tell underlying git about additions and removals via
`cg-add` and `cg-rm` commands). Just before you make a commit
with `cg-commit`, Cogito figures out which files you modified,
and runs `git-update-index` on them for you.


Tagging a version
-----------------

In git, there are two kinds of tags, a "light" one, and an "annotated tag".

A "light" tag is technically nothing more than a branch, except we put
it in the `.git/refs/tags/` subdirectory instead of calling it a `head`.
So the simplest form of tag involves nothing more than

------------------------------------------------
git tag my-first-tag
------------------------------------------------

which just writes the current `HEAD` into the `.git/refs/tags/my-first-tag`
file, after which point you can then use this symbolic name for that
particular state. You can, for example, do

    git diff my-first-tag

to diff your current state against that tag (which at this point will
obviously be an empty diff, but if you continue to develop and commit
stuff, you can use your tag as an "anchor-point" to see what has changed
since you tagged it).

An "annotated tag" is actually a real git object, and contains not only a
pointer to the state you want to tag, but also a small tag name and
message, along with optionally a PGP signature that says that yes,
you really did
that tag.
You create these annotated tags with either the `-a` or
`-s` flag to `git tag`:

    git tag -s <tagname>

which will sign the current `HEAD` (but you can also give it another
argument that specifies the thing to tag, i.e. you could have tagged the
current `mybranch` point by using `git tag <tagname> mybranch`).

You normally only do signed tags for major releases or things
like that, while the light-weight tags are useful for any marking you
want to do -- any time you decide that you want to remember a certain
point, just create a private tag for it, and you have a nice symbolic
name for the state at that point.


Copying repositories
--------------------

Git repositories are normally totally self-sufficient, and it's worth noting
that unlike CVS, for example, there is no separate notion of
"repository" and "working tree". A git repository normally *is* the
working tree, with the local git information hidden in the `.git`
subdirectory. There is nothing else. What you see is what you got.

[NOTE]
You can tell git to split the git internal information from
the directory that it tracks, but we'll ignore that for now: it's not
how normal projects work, and it's really only meant for special uses.
So the mental model of "the git information is always tied directly to
the working tree that it describes" may not be technically 100%
accurate, but it's a good model for all normal use.

This has two implications:

 - if you grow bored with the tutorial repository you created (or you've
   made a mistake and want to start all over), you can just do a simple

    rm -rf git-tutorial
+
and it will be gone. There's no external repository, and there's no
history outside the project you created.

 - if you want to move or duplicate a git repository, you can do so.
   There is a `git clone` command, but if all you want to do is just to
   create a copy of your repository (with all the full history that
   went along with it), you can do so with a regular
   `cp -a git-tutorial new-git-tutorial`.
+
Note that when you've moved or copied a git repository, your git index
file (which caches various information, notably some of the "stat"
information for the files involved) will likely need to be refreshed.
So after you do a `cp -a` to create a new copy, you'll want to do

    git-update-index --refresh
+
in the new repository to make sure that the index file is up-to-date.

Note that the second point is true even across machines. You can
duplicate a remote git repository with *any* regular copy mechanism, be it
`scp`, `rsync` or `wget`.

When copying a remote repository, you'll want to at a minimum update the
index cache when you do this, and especially with other people's
repositories you often want to make sure that the index cache is in some
known state (you don't know *what* they've done and not yet checked in),
so usually you'll precede the `git-update-index` with a

    git-read-tree --reset HEAD
    git-update-index --refresh

which will force a total index re-build from the tree pointed to by `HEAD`.
It resets the index contents to `HEAD`, and then the `git-update-index`
makes sure to match up all index entries with the checked-out files.
If the original repository had uncommitted changes in its
working tree, `git-update-index --refresh` notices them and
tells you they need to be updated.

The above can also be written as simply

    git reset

and in fact a lot of the common git command combinations can be scripted
with the `git xyz` interfaces. You can learn things by just looking
at what the various git scripts do.
For example, `git reset` is the
above two lines implemented in `git-reset`, but some things like
`git status` and `git commit` are slightly more complex scripts around
the basic git commands.

Many (most?) public remote repositories will not contain any of
the checked out files or even an index file, and will *only* contain the
actual core git files. Such a repository usually doesn't even have the
`.git` subdirectory, but has all the git files directly in the
repository.

To create your own local live copy of such a "raw" git repository, you'd
first create your own subdirectory for the project, and then copy the
raw repository contents into the `.git` directory. For example, to
create your own copy of the git repository, you'd do the following

    mkdir my-git
    cd my-git
    rsync -rL rsync://rsync.kernel.org/pub/scm/git/git.git/ .git

followed by

    git-read-tree HEAD

to populate the index. However, now you have populated the index, and
you have all the git internal files, but you will notice that you don't
actually have any of the working tree files to work on. To get
those, you'd check them out with

    git-checkout-index -u -a

where the `-u` flag means that you want the checkout to keep the index
up-to-date (so that you don't have to refresh it afterward), and the
`-a` flag means "check out all files" (if you have a stale copy or an
older version of a checked out tree you may also need to add the `-f`
flag first, to tell git-checkout-index to *force* overwriting of any old
files).

Again, this can all be simplified with

    git clone rsync://rsync.kernel.org/pub/scm/git/git.git/ my-git
    cd my-git
    git checkout

which will end up doing all of the above for you.

You have now successfully copied somebody else's (mine) remote
repository, and checked it out.
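
The "copy the raw files, then rebuild the index and check out" sequence
can be rehearsed without a network. The following is a sketch under
stated assumptions: the directory names `source` and `my-copy` are
invented, a local `cp -R` stands in for `rsync`, the index is deleted by
hand to imitate a raw repository that has none, and the commands use the
`git <command>` spelling of newer git versions.

```shell
#!/bin/sh
# Sketch only: directory names, the local cp in place of rsync, identity
# values and the "git <command>" spelling are assumptions.
set -eu

work=$(mktemp -d)
cd "$work"

export GIT_AUTHOR_NAME="Tutorial User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Tutorial User" GIT_COMMITTER_EMAIL="user@example.com"

# Build a small repository to copy from.
mkdir source
cd source
git init -q
echo "Hello World" >hello
git update-index --add hello
git commit -q -m 'Initial commit'
cd ..

# Copy only the git internals: no index, no checked-out files.
mkdir my-copy
cp -R source/.git my-copy/.git
rm -f my-copy/.git/index
cd my-copy

git read-tree HEAD           # rebuild the index from the commit in HEAD
git checkout-index -u -a     # write out all indexed files, refreshing stat info

contents=$(cat hello)

# Exit status 0 from "diff-files --quiet" means working tree and index agree.
if git diff-files --quiet; then in_sync=yes; else in_sync=no; fi
echo "hello contains: $contents (in sync: $in_sync)"
```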


Creating a new branch
---------------------

Branches in git are really nothing more than pointers into the git
object database from within the `.git/refs/` subdirectory, and as we
already discussed, the `HEAD` branch is nothing but a symlink to one of
these object pointers.

You can at any time create a new branch by just picking an arbitrary
point in the project history, and just writing the SHA1 name of that
object into a file under `.git/refs/heads/`. You can use any filename you
want (and indeed, subdirectories), but the convention is that the
"normal" branch is called `master`. That's just a convention, though,
and nothing enforces it.

To show that as an example, let's go back to the git-tutorial repository we
used earlier, and create a branch in it. You do that by simply just
saying that you want to check out a new branch:

------------
git checkout -b mybranch
------------

will create a new branch based at the current `HEAD` position, and switch
to it.

[NOTE]
================================================
If you make the decision to start your new branch at some
other point in the history than the current `HEAD`, you can do so by
just telling `git checkout` what the base of the checkout would be.
In other words, if you have an earlier tag or branch, you'd just do

    git checkout -b mybranch earlier-commit

and it would create the new branch `mybranch` at the earlier commit,
and check out the state at that time.
================================================

You can always just jump back to your original `master` branch by doing

    git checkout master

(or any other branch-name, for that matter) and if you forget which
branch you happen to be on, a simple

    ls -l .git/HEAD

will tell you where it's pointing.
To get the list of branches
you have, you can say

    git branch

which is nothing more than a simple script around `ls .git/refs/heads`.
There will be an asterisk in front of the branch you are currently on.

Sometimes you may wish to create a new branch _without_ actually
checking it out and switching to it. If so, just use the command

    git branch <branchname> [startingpoint]

which will simply _create_ the branch, but will not do anything further.
You can then later -- once you decide that you want to actually develop
on that branch -- switch to that branch with a regular `git checkout`
with the branchname as the argument.


Merging two branches
--------------------

One of the ideas of having a branch is that you do some (possibly
experimental) work in it, and eventually merge it back to the main
branch. So assuming you created the above `mybranch` that started out
being the same as the original `master` branch, let's make sure we're in
that branch, and do some work there.

------------------------------------------------
git checkout mybranch
echo "Work, work, work" >>hello
git commit -m 'Some work.' hello
------------------------------------------------

Here, we just added another line to `hello`, and we used a shorthand for
doing both `git-update-index hello` and `git commit` by just giving the
filename directly to `git commit`. The `-m` flag is to give the
commit log message from the command line.
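
To make the "a branch is just a pointer" idea concrete, the sketch below
checks that a freshly created branch names the same commit as `master`,
and that committing on it moves only the new branch. It is a sketch under
assumptions, not a description of how the scripts are implemented: it
uses a scratch directory, placeholder identity values, an explicit
`git symbolic-ref` call so that the first branch is called `master`
whatever the local default is, and `git rev-parse` to read the pointers
instead of looking at files under `.git/refs/heads/`.

```shell
#!/bin/sh
# Sketch only: scratch directory, identity values and the use of
# rev-parse/symbolic-ref to read pointers are assumptions.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"

export GIT_AUTHOR_NAME="Tutorial User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Tutorial User" GIT_COMMITTER_EMAIL="user@example.com"

echo "Hello World" >hello
git update-index --add hello
git commit -q -m 'Initial commit'

git checkout -q -b mybranch

# Right after creation both names point at the same commit.
if [ "$(git rev-parse master)" = "$(git rev-parse mybranch)" ]; then
    same_start=yes
else
    same_start=no
fi

echo "Work, work, work" >>hello
git commit -q -m 'Some work.' hello

# Only mybranch moved; master is now the parent of its tip.
parent_of_mybranch=$(git rev-parse mybranch^)
master_tip=$(git rev-parse master)
current=$(git symbolic-ref HEAD)
echo "on $current, master is at $master_tip"
```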

Now, to make it a bit more interesting, let's assume that somebody else
does some work in the original branch, and simulate that by going back
to the master branch, and editing the same file differently there:

------------
git checkout master
------------

Here, take a moment to look at the contents of `hello`, and notice how they
don't contain the work we just did in `mybranch` -- because that work
hasn't happened in the `master` branch at all.  Then do

------------
echo "Play, play, play" >>hello
echo "Lots of fun" >>example
git commit -m 'Some fun.' hello example
------------

since the master branch is obviously in a much better mood.

Now, you've got two branches, and you decide that you want to merge the
work done.  Before we do that, let's introduce a cool graphical tool that
helps you view what's going on:

    gitk --all

will show you graphically both of your branches (that's what the `\--all`
means: normally it will just show you your current `HEAD`) and their
histories.  You can also see exactly how they came to be from a common
source.

Anyway, let's exit `gitk` (`^Q` or the File menu), and decide that we want
to merge the work we did on the `mybranch` branch into the `master`
branch (which is currently our `HEAD` too).  To do that, there's a nice
script called `git resolve`, which wants to know which branches you want
to resolve and what the merge is all about:

------------
git resolve HEAD mybranch "Merge work in mybranch"
------------

where the third argument is going to be used as the commit message if
the merge can be resolved automatically.
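
Before looking at what git reports, it may help to see why these two
edits collide.  The sketch below is not git's own merge code: it is a
stand-in that assumes the `diff3` tool from GNU diffutils is installed,
and it recreates the three versions of `hello` as plain files (the file
names `base`, `master` and `mybranch` are chosen here for illustration).

```shell
#!/bin/sh
# Stand-in for the merge step: uses diff3, not git (an assumption of this sketch).
demo=$(mktemp -d)
cd "$demo" || exit 1

# Common ancestor: hello as it was when mybranch was created.
printf 'Hello World\nIt'\''s a new day for git\n' >base

# Each side appended a different line at the same place.
{ cat base; echo "Play, play, play"; } >master
{ cat base; echo "Work, work, work"; } >mybranch

# Three-way merge: our version, the common ancestor, their version.
# diff3 exits with 1 when it had to leave conflict markers behind.
diff3 -m master base mybranch >merged
status=$?

cat merged
echo "diff3 exit status: $status"
```

Because both sides changed the same spot relative to the common
ancestor, no tool can pick an order for the two new lines on its own,
so the result carries conflict markers for a human to resolve.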

Now, in this case we've intentionally created a situation where the
merge will need to be fixed up by hand, though, so git will do as much
of it as it can automatically (which in this case is just merge the `example`
file, which had no differences in the `mybranch` branch), and say:

    Simple merge failed, trying Automatic merge
    Auto-merging hello.
    merge: warning: conflicts during merge
    ERROR: Merge conflict in hello.
    fatal: merge program failed
    Automatic merge failed, fix up by hand

which is way too verbose, but it basically tells you that it failed the
really trivial merge ("Simple merge") and did an "Automatic merge"
instead, but that too failed due to conflicts in `hello`.

Not to worry.  It left the (trivial) conflict in `hello` in the same form you
should already be well used to if you've ever used CVS, so let's just
open `hello` in our editor (whatever that may be), and fix it up somehow.
I'd suggest just making it so that `hello` contains all four lines:

------------
Hello World
It's a new day for git
Play, play, play
Work, work, work
------------

and once you're happy with your manual merge, just do a

------------
git commit hello
------------

which will very loudly warn you that you're now committing a merge
(which is correct, so never mind), and you can write a small merge
message about your adventures in git-merge-land.

After you're done, start up `gitk \--all` to see graphically what the
history looks like.  Notice that `mybranch` still exists, and you can
switch to it, and continue to work with it if you want to.  The
`mybranch` branch will not contain the merge, but next time you merge it
from the `master` branch, git will know how you merged it, so you'll not
have to do _that_ merge again.

Another useful tool, especially if you do not always work in an X-Window
environment, is `git show-branch`.

------------------------------------------------
$ git show-branch master mybranch
* [master] Merged "mybranch" changes.
 ! [mybranch] Some work.
--
+  [master] Merged "mybranch" changes.
+  [master~1] Some fun.
++ [mybranch] Some work.
------------------------------------------------

The first two lines indicate that it is showing the two branches
and the first line of the commit log message from their
top-of-the-tree commits.  You are currently on the `master` branch
(notice the asterisk `*` character).  In the later output lines,
the first column shows commits contained in the
`master` branch, and the second column those in the `mybranch`
branch.  Three commits are shown along with their log messages.
All of them have plus `+` characters in the first column, which
means they are now part of the `master` branch.  Only the "Some
work" commit has the plus `+` character in the second column,
because `mybranch` has not been merged to incorporate these
commits from the master branch.  The string inside brackets
before the commit log message is a short name you can use to
name the commit.  In the above example, 'master' and 'mybranch'
are branch heads.  'master~1' is the first parent of the 'master'
branch head.  Please see the 'git-rev-parse' documentation if you
see more complex cases.

Now, let's pretend you are the one who did all the work in
`mybranch`, and the fruit of your hard work has finally been merged
to the `master` branch.  Let's go back to `mybranch`, and run
resolve to get the "upstream changes" back to your branch.

    git checkout mybranch
    git resolve HEAD master "Merge upstream changes."

This outputs something like this (the actual commit object names
would be different)

    Updating from ae3a2da... to a80b4aa....
     example |    1 +
     hello   |    1 +
     2 files changed, 2 insertions(+), 0 deletions(-)

Because your branch did not contain anything more than what is
already merged into the `master` branch, the resolve operation did
not actually do a merge.  Instead, it just updated the top of
the tree of your branch to that of the `master` branch.  This is
often called a 'fast forward' merge.

You can run `gitk \--all` again to see what the commit ancestry
looks like, or run `show-branch`, which tells you this.

------------------------------------------------
$ git show-branch master mybranch
! [master] Merged "mybranch" changes.
 * [mybranch] Merged "mybranch" changes.
--
++ [master] Merged "mybranch" changes.
------------------------------------------------


Merging external work
---------------------

It's usually much more common that you merge with somebody else than
merging with your own branches, so it's worth pointing out that git
makes that very easy too, and in fact, it's not that different from
doing a `git resolve`.  A remote merge ends up being nothing
more than "fetch the work from a remote repository into a temporary tag"
followed by a `git resolve`.

Fetching from a remote repository is done by, unsurprisingly,
`git fetch`:

    git fetch <remote-repository>

One of the following transports can be used to name the
repository to download from:

Rsync::
    `rsync://remote.machine/path/to/repo.git/`
+
Rsync transport is usable for both uploading and downloading,
but is completely unaware of what git does, and can produce
unexpected results when you download from the public repository
while the repository owner is uploading into it via `rsync`
transport.
Most notably, it could update the files under
`refs/`, which hold the object names of the topmost commits,
before uploading the files in `objects/` -- the downloader would
obtain the head commit object name while that object itself is still
not available in the repository.  For this reason, it is
considered deprecated.

SSH::
    `remote.machine:/path/to/repo.git/` or
+
`ssh://remote.machine/path/to/repo.git/`
+
This transport can be used for both uploading and downloading,
and requires you to have a log-in privilege over `ssh` to the
remote machine.  It finds out the set of objects the other side
lacks by exchanging the head commits both ends have and
transfers (close to) the minimum set of objects.  It is by far the
most efficient way to exchange git objects between repositories.

Local directory::
    `/path/to/repo.git/`
+
This transport is the same as SSH transport but uses `sh` to run
both ends on the local machine instead of running the other end on
the remote machine via `ssh`.

GIT Native::
    `git://remote.machine/path/to/repo.git/`
+
This transport was designed for anonymous downloading.  Like SSH
transport, it finds out the set of objects the downstream side
lacks and transfers (close to) the minimum set of objects.

HTTP(s)::
    `http://remote.machine/path/to/repo.git/`
+
HTTP and HTTPS transports are used only for downloading.  They
first obtain the topmost commit object name from the remote site
by looking at the `repo.git/info/refs` file, then try to obtain the
commit object by downloading from `repo.git/objects/xx/xxx\...`
using the object name of that commit object.  Then they read the
commit object to find out its parent commits and the associated
tree object; they repeat this process until they get all the
necessary objects.  Because of this behaviour, they are
sometimes also called 'commit walkers'.
+
The 'commit walkers' are sometimes also called 'dumb
transports', because they do not require any GIT aware smart
server like GIT Native transport does.  Any stock HTTP server
would suffice.
+
There are (confusingly enough) `git-ssh-fetch` and `git-ssh-upload`
programs, which are 'commit walkers'; they outlived their
usefulness when GIT Native and SSH transports were introduced,
and are not used by the `git pull` or `git push` scripts.

Once you fetch from the remote repository, you `resolve` that
with your current branch.

However -- it's such a common thing to `fetch` and then
immediately `resolve`, that it's called `git pull`, and you can
simply do

    git pull <remote-repository>

and optionally give a branch-name for the remote end as a second
argument.

[NOTE]
You could do without using any branches at all, by
keeping as many local repositories as you would like to have
branches, and merging between them with `git pull`, just like
you merge between branches.  The advantage of this approach is
that it lets you keep a set of files for each `branch` checked
out and you may find it easier to switch back and forth if you
juggle multiple lines of development simultaneously.  Of
course, you will pay the price of more disk usage to hold
multiple working trees, but disk space is cheap these days.

[NOTE]
You could even pull from your own repository by
giving '.' as the <remote-repository> parameter to `git pull`.

It is likely that you will be pulling from the same remote
repository from time to time.
As a shorthand, you can store
the remote repository URL in a file under the `.git/remotes/`
directory, like this:

------------------------------------------------
mkdir -p .git/remotes/
cat >.git/remotes/linus <<\EOF
URL: http://www.kernel.org/pub/scm/git/git.git/
EOF
------------------------------------------------

and use the filename to `git pull` instead of the full URL.
The URL specified in such a file can even be a prefix
of a full URL, like this:

------------------------------------------------
cat >.git/remotes/jgarzik <<\EOF
URL: http://www.kernel.org/pub/scm/linux/git/jgarzik/
EOF
------------------------------------------------


Examples.

. `git pull linus`
. `git pull linus tag v0.99.1`
. `git pull jgarzik/netdev-2.6.git/ e100`

The above are equivalent to:

. `git pull http://www.kernel.org/pub/scm/git/git.git/ HEAD`
. `git pull http://www.kernel.org/pub/scm/git/git.git/ tag v0.99.1`
. `git pull http://www.kernel.org/pub/.../jgarzik/netdev-2.6.git e100`


Publishing your work
--------------------

So we can use somebody else's work from a remote repository; but
how can *you* prepare a repository to let other people pull from
it?

You do your real work in your working tree that has your
primary repository hanging under it as its `.git` subdirectory.
You *could* make that repository accessible remotely and ask
people to pull from it, but in practice that is not the way
things are usually done.  A recommended way is to have a public
repository, make it reachable by other people, and when the
changes you made in your primary working tree are in good shape,
update the public repository from it.
This is often called
'pushing'.

[NOTE]
This public repository could further be mirrored, and that is
how git repositories at `kernel.org` are managed.

Publishing the changes from your local (private) repository to
your remote (public) repository requires a write privilege on
the remote machine.  You need to have an SSH account there to
run a single command, `git-receive-pack`.

First, you need to create an empty repository on the remote
machine that will house your public repository.  This empty
repository will be populated and kept up-to-date by pushing
into it later.  Obviously, this repository creation needs to be
done only once.

[NOTE]
`git push` uses a pair of programs,
`git-send-pack` on your local machine, and `git-receive-pack`
on the remote machine.  The communication between the two over
the network internally uses an SSH connection.

Your private repository's GIT directory is usually `.git`, but
your public repository is often named after the project name,
i.e. `<project>.git`.  Let's create such a public repository for
project `my-git`.  After logging into the remote machine, create
an empty directory:

    mkdir my-git.git

Then, make that directory into a GIT repository by running
`git init-db`, but this time, since its name is not the usual
`.git`, we do things slightly differently:

    GIT_DIR=my-git.git git-init-db

Make sure this directory is available, via the transport of your
choice, to the people you want to pull your changes.  Also
you need to make sure that you have the `git-receive-pack`
program on the `$PATH`.

[NOTE]
Many installations of sshd do not invoke your shell as the login
shell when you directly run programs; what this means is that if
your login shell is `bash`, only `.bashrc` is read and not
`.bash_profile`.
As a workaround, make sure `.bashrc` sets up
`$PATH` so that you can run the `git-receive-pack` program.

[NOTE]
If you plan to publish this repository to be accessed over http,
you should do `chmod +x my-git.git/hooks/post-update` at this
point.  This makes sure that every time you push into this
repository, `git-update-server-info` is run.

Your "public repository" is now ready to accept your changes.
Come back to the machine that has your private repository.  From
there, run this command:

    git push <public-host>:/path/to/my-git.git master

This synchronizes your public repository to match the named
branch heads (i.e. `master` in this case) and objects reachable
from them in your current repository.

As a real example, this is how I update my public git
repository.  The Kernel.org mirror network takes care of the
propagation to other publicly visible machines:

    git push master.kernel.org:/pub/scm/git/git.git/


Packing your repository
-----------------------

Earlier, we saw that one file under the `.git/objects/??/` directory
is stored for each git object you create.  This representation
is efficient to create atomically and safely, but
not so convenient to transport over the network.  Since git objects are
immutable once they are created, there is a way to optimize the
storage by "packing them together".  The command

    git repack

will do it for you.  If you followed the tutorial examples, you
would have accumulated about 17 objects in `.git/objects/??/`
directories by now.  `git repack` tells you how many objects it
packed, and stores the packed file in the `.git/objects/pack`
directory.

[NOTE]
You will see two files, `pack-\*.pack` and `pack-\*.idx`,
in the `.git/objects/pack` directory.
They are closely related to
each other, and if you ever copy them by hand to a different
repository for whatever reason, you should make sure you copy
them together.  The former holds all the data from the objects
in the pack, and the latter holds the index for random
access.

If you are paranoid, running the `git-verify-pack` command would
detect if you have a corrupt pack, but do not worry too much.
Our programs are always perfect ;-).

Once you have packed objects, you do not need to leave the
unpacked objects that are contained in the pack file anymore.

    git prune-packed

would remove them for you.

You can try running `find .git/objects -type f` before and after
you run `git prune-packed` if you are curious.  Also `git
count-objects` would tell you how many unpacked objects are in
your repository and how much space they are consuming.

[NOTE]
`git pull` is slightly cumbersome for HTTP transport, as a
packed repository may contain relatively few objects in a
relatively large pack.  If you expect many HTTP pulls from your
public repository you might want to repack & prune often, or
never.

If you run `git repack` again at this point, it will say
"Nothing to pack".  Once you continue your development and
accumulate the changes, running `git repack` again will create a
new pack that contains objects created since you packed your
repository the last time.
We recommend that you pack your project
soon after the initial import (unless you are starting your
project from scratch), and then run `git repack` every once in a
while, depending on how active your project is.

When a repository is synchronized via `git push` and `git pull`,
objects packed in the source repository are usually stored
unpacked in the destination, unless rsync transport is used.
While this allows you to use different packing strategies on
both ends, it also means you may need to repack both
repositories every once in a while.


Working with Others
-------------------

Although git is a truly distributed system, it is often
convenient to organize your project with an informal hierarchy
of developers.  Linux kernel development is run this way.  There
is a nice illustration (page 17, "Merges to Mainline") in Randy
Dunlap's presentation (`http://tinyurl.com/a2jdg`).

It should be stressed that this hierarchy is purely *informal*.
There is nothing fundamental in git that enforces the "chain of
patch flow" this hierarchy implies.  You do not have to pull
from only one remote repository.

A recommended workflow for a "project lead" goes like this:

1. Prepare your primary repository on your local machine. Your
   work is done there.

2. Prepare a public repository accessible to others.
+
If other people are pulling from your repository over dumb
transport protocols, you need to keep this repository 'dumb
transport friendly'.  After `git init-db`,
`$GIT_DIR/hooks/post-update` copied from the standard templates
would contain a call to `git-update-server-info` but the
`post-update` hook itself is disabled by default -- enable it
with `chmod +x post-update`.

3. Push into the public repository from your primary
   repository.

4. `git repack` the public repository.
   This establishes a big
   pack that contains the initial set of objects as the
   baseline.  Possibly run `git prune` as well, if the transport
   used for pulling from your repository supports packed
   repositories.

5. Keep working in your primary repository. Your changes
   include modifications of your own, patches you receive via
   e-mails, and merges resulting from pulling the "public"
   repositories of your "subsystem maintainers".
+
You can repack this private repository whenever you feel like.

6. Push your changes to the public repository, and announce it
   to the public.

7. Every once in a while, `git repack` the public repository.
   Go back to step 5. and continue working.


A recommended work cycle for a "subsystem maintainer" who works
on that project and has their own "public repository" goes like this:

1. Prepare your work repository, by running `git clone` on the public
   repository of the "project lead". The URL used for the
   initial cloning is stored in `.git/remotes/origin`.

2. Prepare a public repository accessible to others, just like
   the "project lead" person does.

3. Copy over the packed files from the "project lead" public
   repository to your public repository.

4. Push into the public repository from your primary
   repository. Run `git repack`, and possibly `git prune` if the
   transport used for pulling from your repository supports
   packed repositories.

5. Keep working in your primary repository. Your changes
   include modifications of your own, patches you receive via
   e-mails, and merges resulting from pulling the "public"
   repositories of your "project lead" and possibly your
   "sub-subsystem maintainers".
+
You can repack this private repository whenever you feel
like.

6. Push your changes to your public repository, and ask your
   "project lead" and possibly your "sub-subsystem
   maintainers" to pull from it.

7. Every once in a while, `git repack` the public repository.
   Go back to step 5. and continue working.


A recommended work cycle for an "individual developer" who does
not have a "public" repository is somewhat different. It goes
like this:

1. Prepare your work repository, by running `git clone` on the public
   repository of the "project lead" (or a "subsystem
   maintainer", if you work on a subsystem). The URL used for
   the initial cloning is stored in `.git/remotes/origin`.

2. Do your work in your repository on the 'master' branch.

3. Run `git fetch origin` to fetch from the public repository of your
   upstream every once in a while. This does only the first
   half of `git pull`; it does not merge. The head of the
   public repository is stored in `.git/refs/heads/origin`.

4. Use `git cherry origin` to see which ones of your patches
   were accepted, and/or use `git rebase origin` to port your
   unmerged changes forward to the updated upstream.

5. Use `git format-patch origin` to prepare patches for e-mail
   submission to your upstream and send them out. Go back to
   step 2. and continue.


Working with Others, Shared Repository Style
--------------------------------------------

If you are coming from a CVS background, the style of cooperation
suggested in the previous section may be new to you. You do not
have to worry. git also supports the "shared public repository" style of
cooperation you are probably more familiar with.

For this, set up a public repository on a machine that is
reachable via SSH by people with "commit privileges".
Put the
committers in the same user group and make the repository
writable by that group.

You, as an individual committer, then:

- First clone the shared repository to a local repository:
------------------------------------------------
$ git clone repo.shared.xz:/pub/scm/project.git/ my-project
$ cd my-project
$ hack away
------------------------------------------------

- Merge the work others might have done while you were hacking
  away:
------------------------------------------------
$ git pull origin
$ test the merge result
------------------------------------------------
[NOTE]
================================
The first `git clone` would have placed the following in the
`my-project/.git/remotes/origin` file, and that's why this and
the next step work.
------------
URL: repo.shared.xz:/pub/scm/project.git/
Pull: master:origin
------------
================================

- Push your work as the new head of the shared
  repository.
------------------------------------------------
$ git push origin master
------------------------------------------------
If somebody else pushed into the same shared repository while
you were working locally, `git push` in the last step would
complain, telling you that the remote `master` head does not
fast forward.  When that happens, you need to pull and merge
those other changes before you push your work.


Bundling your work together
---------------------------

It is likely that you will be working on more than one thing at
a time.  It is easy to manage those more-or-less independent tasks
using branches with git.

We have already seen how branches work in a previous example,
the "fun and work" example using two branches.  The idea is the
same if there are more than two branches.
Let's say you started
out from "master" head, and have some new code in the "master"
branch, and two independent fixes in the "commit-fix" and
"diff-fix" branches:

------------
$ git show-branch
! [commit-fix] Fix commit message normalization.
 ! [diff-fix] Fix rename detection.
  * [master] Release candidate #1
---
 +  [diff-fix] Fix rename detection.
 +  [diff-fix~1] Better common substring algorithm.
+   [commit-fix] Fix commit message normalization.
  + [master] Release candidate #1
+++ [diff-fix~2] Pretty-print messages.
------------

Both fixes are tested well, and at this point, you want to merge
in both of them.  You could merge in 'diff-fix' first and then
'commit-fix' next, like this:

------------
$ git resolve master diff-fix 'Merge fix in diff-fix'
$ git resolve master commit-fix 'Merge fix in commit-fix'
------------

Which would result in:

------------
$ git show-branch
! [commit-fix] Fix commit message normalization.
 ! [diff-fix] Fix rename detection.
  * [master] Merge fix in commit-fix
---
  + [master] Merge fix in commit-fix
+ + [commit-fix] Fix commit message normalization.
  + [master~1] Merge fix in diff-fix
 ++ [diff-fix] Fix rename detection.
 ++ [diff-fix~1] Better common substring algorithm.
  + [master~2] Release candidate #1
+++ [master~3] Pretty-print messages.
------------

However, there is no particular reason to merge in one branch
first and the other next, when what you have are a set of truly
independent changes (if the order mattered, then they are not
independent by definition).  You could instead merge those two
branches into the current branch at once.  First let's undo what
we just did and start over.
We would want to get the master
branch back to where it was before these two merges by resetting
it to 'master~2':

------------
$ git reset --hard master~2
------------

You can make sure 'git show-branch' matches the state before
those two 'git resolve' you just did.  Then, instead of running
two 'git resolve' commands in a row, you would pull these two
branch heads (this is known as 'making an Octopus'):

------------
$ git pull . commit-fix diff-fix
$ git show-branch
! [commit-fix] Fix commit message normalization.
 ! [diff-fix] Fix rename detection.
  * [master] Octopus merge of branches 'diff-fix' and 'commit-fix'
---
  + [master] Octopus merge of branches 'diff-fix' and 'commit-fix'
+ + [commit-fix] Fix commit message normalization.
 ++ [diff-fix] Fix rename detection.
 ++ [diff-fix~1] Better common substring algorithm.
  + [master~1] Release candidate #1
+++ [master~2] Pretty-print messages.
------------

Note that you should not do an Octopus just because you can.  An octopus
is a valid thing to do and often makes it easier to view the
commit history if you are pulling more than two independent
changes at the same time.  However, if you have merge conflicts
with any of the branches you are merging in and need to resolve
them by hand, that is an indication that the development that
happened in those branches was not independent after all, and you should
merge two at a time, documenting how you resolved the conflicts,
and the reason why you preferred changes made in one side over
the other.  Otherwise it would make the project history harder
to follow, not easier.

[ to be continued.. cvsimports ]