A short git tutorial
====================
v0.99.5, Aug 2005

Introduction
------------

This is trying to be a short tutorial on setting up and using a git
repository, mainly because being hands-on and using explicit examples is
often the best way of explaining what is going on.

In normal life, most people wouldn't use the "core" git programs
directly, but rather script around them to make them more palatable.
Understanding the core git stuff may help some people get those scripts
done, though, and it may also be instructive in helping people
understand what it is that the higher-level helper scripts are actually
doing.

The core git is often called "plumbing", with the prettier user
interfaces on top of it called "porcelain". You may not want to use the
plumbing directly very often, but it can be good to know what the
plumbing does for when the porcelain isn't flushing...


Creating a git repository
-------------------------

Creating a new git repository couldn't be easier: all git repositories start
out empty, and the only thing you need to do is find yourself a
subdirectory that you want to use as a working tree - either an empty
one for a totally new project, or an existing working tree that you want
to import into git.

For our first example, we're going to start a totally new repository from
scratch, with no pre-existing files, and we'll call it `git-tutorial`.
To start up, create a subdirectory for it, change into that
subdirectory, and initialize the git infrastructure with `git-init-db`:

------------------------------------------------
mkdir git-tutorial
cd git-tutorial
git-init-db
------------------------------------------------

to which git will reply

        defaulting to local storage area

which is just git's way of saying that you haven't been doing anything
strange, and that it will have created a local `.git` directory setup for
your new project.
You will now have a `.git` directory, and you can
inspect that with `ls`. For your new empty project, it should show you
three entries, among other things:

 - a symlink called `HEAD`, pointing to `refs/heads/master` (if your
   platform does not have native symlinks, it is a file containing the
   line "ref: refs/heads/master")
+
Don't worry about the fact that the file that the `HEAD` link points to
doesn't even exist yet -- you haven't created the commit that will
start your `HEAD` development branch yet.

 - a subdirectory called `objects`, which will contain all the
   objects of your project. You should never have any real reason to
   look at the objects directly, but you might want to know that these
   objects are what contain all the real 'data' in your repository.

 - a subdirectory called `refs`, which contains references to objects.

In particular, the `refs` subdirectory will contain two other
subdirectories, named `heads` and `tags` respectively. They do
exactly what their names imply: they contain references to any number
of different 'heads' of development (aka 'branches'), and to any
'tags' that you have created to name specific versions in your
repository.

One note: the special `master` head is the default branch, which is
why the `.git/HEAD` file was created as a symlink to it even if it
doesn't yet exist. Basically, the `HEAD` link is supposed to always
point to the branch you are working on right now, and you always
start out expecting to work on the `master` branch.

However, this is only a convention, and you can name your branches
anything you want, and don't have to ever even 'have' a `master`
branch. A number of the git tools will assume that `.git/HEAD` is
valid, though.

[NOTE]
An 'object' is identified by its 160-bit SHA1 hash, aka 'object name',
and a reference to an object is always the 40-byte hex
representation of that SHA1 name.
The files in the `refs`
subdirectory are expected to contain these hex references
(usually with a final `\'\n\'` at the end), and you should thus
expect to see a number of 41-byte files containing these
references in these `refs` subdirectories when you actually start
populating your tree.

[NOTE]
An advanced user may want to take a look at the
link:repository-layout.html[repository layout] document
after finishing this tutorial.

You have now created your first git repository. Of course, since it's
empty, that's not very useful, so let's start populating it with data.


Populating a git repository
---------------------------

We'll keep this simple and stupid, so we'll start off with populating a
few trivial files just to get a feel for it.

Start off with just creating any random files that you want to maintain
in your git repository. We'll start off with a few bad examples, just to
get a feel for how this works:

------------------------------------------------
echo "Hello World" >hello
echo "Silly example" >example
------------------------------------------------

You have now created two files in your working tree (aka 'working directory'), but to
actually check in your hard work, you will have to go through two steps:

 - fill in the 'index' file (aka 'cache') with the information about your
   working tree state.

 - commit that index file as an object.

The first step is trivial: when you want to tell git about any changes
to your working tree, you use the `git-update-index` program. That
program normally just takes a list of filenames you want to update, but
to avoid trivial mistakes, it refuses to add new entries to the cache
(or remove existing ones) unless you explicitly tell it that you're
adding a new entry with the `\--add` flag (or removing an entry with the
`\--remove` flag).
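
That refusal is easy to try out in a throwaway directory. What follows is only a sketch and not part of the tutorial's own sequence: the scratch directory and the variable names are made up for illustration, and `git init`, `git ls-files` and the space-separated `git update-index` spelling (instead of `git-init-db` and `git-update-index`) are assumptions so that it runs with a git whose `git` wrapper dispatches to the core programs.

```shell
# Work in a scratch directory so the tutorial repository is left alone.
scratch=$(mktemp -d)
cd "$scratch"
git init -q

echo "Hello World" >hello

# Without --add, a file the index has never seen is refused.
if git update-index hello 2>/dev/null; then
    refused=no
else
    refused=yes
fi
echo "plain update-index refused the new file: $refused"

# With --add the entry is created.
git update-index --add hello
tracked_after_add=$(git ls-files)

# --remove drops the entry once the file is gone from the working tree.
rm hello
git update-index --remove hello
tracked_after_remove=$(git ls-files)
```

`git ls-files` just lists the names currently recorded in the index, which makes it a convenient way to watch entries come and go.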

So to populate the index with the two files you just created, you can do

------------------------------------------------
git-update-index --add hello example
------------------------------------------------

and you have now told git to track those two files.

In fact, as you did that, if you now look into your object directory,
you'll notice that git will have added two new objects to the object
database. If you did exactly the steps above, you should now be able to do

        ls .git/objects/??/*

and see two files:

        .git/objects/55/7db03de997c86a4a028e1ebd3a1ceb225be238
        .git/objects/f2/4c74a2e500f5ee1332c86b94199f52b1d1d962

which correspond to the objects with names of 557db... and f24c7...
respectively.

If you want to, you can use `git-cat-file` to look at those objects, but
you'll have to use the object name, not the filename of the object:

        git-cat-file -t 557db03de997c86a4a028e1ebd3a1ceb225be238

where the `-t` tells `git-cat-file` to tell you what the "type" of the
object is. Git will tell you that you have a "blob" object (i.e. just a
regular file), and you can see the contents with

        git-cat-file "blob" 557db03

which will print out "Hello World". The object 557db03 is nothing
more than the contents of your file `hello`.

[NOTE]
Don't confuse that object with the file `hello` itself. The
object is literally just those specific *contents* of the file, and
however much you later change the contents in file `hello`, the object
we just looked at will never change. Objects are immutable.

[NOTE]
The second example demonstrates that you can
abbreviate the object name to only the first several
hexadecimal digits in most places.

Anyway, as we mentioned previously, you normally never actually take a
look at the objects themselves, and typing long 40-character hex
names is not something you'd normally want to do.
The above digression
was just to show that `git-update-index` did something magical, and
actually saved away the contents of your files into the git object
database.

Updating the cache did something else too: it created a `.git/index`
file. This is the index that describes your current working tree, and
something you should be very aware of. Again, you normally never worry
about the index file itself, but you should be aware of the fact that
you have not actually really "checked in" your files into git so far,
you've only *told* git about them.

However, since git knows about them, you can now start using some of the
most basic git commands to manipulate the files or look at their status.

In particular, let's not even check in the two files into git yet, we'll
start off by adding another line to `hello` first:

------------------------------------------------
echo "It's a new day for git" >>hello
------------------------------------------------

and you can now, since you told git about the previous state of `hello`, ask
git what has changed in the tree compared to your old index, using the
`git-diff-files` command:

------------
git-diff-files
------------

Oops. That wasn't very readable. It just spit out its own internal
version of a `diff`, but that internal version really just tells you
that it has noticed that "hello" has been modified, and that the old object
contents it had have been replaced with something else.

To make it readable, we can tell `git-diff-files` to output the
differences as a patch, using the `-p` flag:

------------
git-diff-files -p
------------

which will spit out

------------
diff --git a/hello b/hello
index 557db03..263414f 100644
--- a/hello
+++ b/hello
@@ -1 +1,2 @@
 Hello World
+It's a new day for git
------------

i.e.
the diff of the change we caused by adding another line to `hello`.

In other words, `git-diff-files` always shows us the difference between
what is recorded in the index, and what is currently in the working
tree. That's very useful.

A common shorthand for `git-diff-files -p` is to just write
`git diff`, which will do the same thing.


Committing git state
--------------------

Now, we want to go to the next stage in git, which is to take the files
that git knows about in the index, and commit them as a real tree. We do
that in two phases: creating a 'tree' object, and committing that 'tree'
object as a 'commit' object together with an explanation of what the
tree was all about, along with information of how we came to that state.

Creating a tree object is trivial, and is done with `git-write-tree`.
There are no options or other input: `git-write-tree` will take the
current index state, and write an object that describes that whole
index. In other words, we're now tying together all the different
filenames with their contents (and their permissions), and we're
creating the equivalent of a git "directory" object:

------------------------------------------------
git-write-tree
------------------------------------------------

and this will just output the name of the resulting tree, in this case
(if you have done exactly as I've described) it should be

        8988da15d077d4829fc51d8544c097def6644dbb

which is another incomprehensible object name. Again, if you want to,
you can use `git-cat-file -t 8988d\...` to see that this time the object
is not a "blob" object, but a "tree" object (you can also use
`git-cat-file` to actually output the raw object contents, but you'll see
mainly a binary mess, so that's less interesting).
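
If you would rather see the tree in readable form than as a binary mess, one option is `git ls-tree`, which this tutorial does not otherwise use. The sketch below rebuilds the two-file index in a scratch directory; the directory and variable names are illustrative, and it assumes a git whose `git` wrapper accepts the space-separated spelling of the core commands.

```shell
# Rebuild the two-file index in a scratch directory.
scratch=$(mktemp -d)
cd "$scratch"
git init -q

echo "Hello World" >hello
echo "Silly example" >example
git update-index --add hello example

# Write the index out as a tree object and remember its name.
tree=$(git write-tree)

# The object type should be "tree", not "blob".
tree_type=$(git cat-file -t "$tree")
echo "$tree is a $tree_type"

# One line per entry: mode, type, object name, filename.
listing=$(git ls-tree "$tree")
echo "$listing"
```

Each line of the listing ties a filename and its permissions to the blob that holds the contents, which is exactly the "directory" role described above.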

However -- normally you'd never use `git-write-tree` on its own, because
normally you always commit a tree into a commit object using the
`git-commit-tree` command. In fact, it's easier to not actually use
`git-write-tree` on its own at all, but to just pass its result in as an
argument to `git-commit-tree`.

`git-commit-tree` normally takes several arguments -- it wants to know
what the 'parent' of a commit was, but since this is the first commit
ever in this new repository, and it has no parents, we only need to pass in
the object name of the tree. However, `git-commit-tree` also wants to
get a commit message on its standard input, and it will write out the
resulting object name for the commit to its standard output.

And this is where we create the `.git/refs/heads/master` file
which is pointed at by `HEAD`. This file is supposed to contain
the reference to the top-of-tree of the master branch, and since
that's exactly what `git-commit-tree` spits out, we can do this
all with a sequence of simple shell commands:

------------------------------------------------
tree=$(git-write-tree)
commit=$(echo 'Initial commit' | git-commit-tree $tree)
git-update-ref HEAD $commit
------------------------------------------------

which will say:

        Committing initial tree 8988da15d077d4829fc51d8544c097def6644dbb

just to warn you about the fact that it created a totally new commit
that is not related to anything else. Normally you do this only *once*
for a project ever, and all later commits will be parented on top of an
earlier commit, and you'll never see this "Committing initial tree"
message ever again.

Again, normally you'd never actually do this by hand. There is a
helpful script called `git commit` that will do all of this for you. So
you could have just written `git commit`
instead, and it would have done the above magic scripting for you.
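
To see what those three commands leave behind, you can print the commit object with `git-cat-file`. This sketch repeats the steps in a scratch directory so it stands on its own; the directory, the placeholder identity in the `GIT_*` environment variables, the use of `git rev-parse`, and the space-separated command spelling are all assumptions made for the example.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q

# Placeholder identity so the commit can be created on any machine.
export GIT_AUTHOR_NAME="A U Thor" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="A U Thor" GIT_COMMITTER_EMAIL="author@example.com"

echo "Hello World" >hello
git update-index --add hello

# The same three steps as above: tree, commit, then point HEAD at it.
tree=$(git write-tree)
commit=$(echo 'Initial commit' | git commit-tree "$tree")
git update-ref HEAD "$commit"

# A commit object is plain text: a "tree" line, author and committer
# lines, a blank line, and then the message.
git cat-file commit "$commit"

commit_type=$(git cat-file -t "$commit")
first_line=$(git cat-file commit "$commit" | head -n 1)
head_now=$(git rev-parse HEAD)
```

Note that the commit records only the name of the tree, not the files themselves; the tree in turn names the blobs.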


Making a change
---------------

Remember how we did the `git-update-index` on file `hello` and then we
changed `hello` afterward, and could compare the new state of `hello` with the
state we saved in the index file?

Further, remember how I said that `git-write-tree` writes the contents
of the *index* file to the tree, and thus what we just committed was in
fact the *original* contents of the file `hello`, not the new ones? We did
that on purpose, to show the difference between the index state, and the
state in the working tree, and how they don't have to match, even
when we commit things.

As before, if we do `git-diff-files -p` in our git-tutorial project,
we'll still see the same difference we saw last time: the index file
hasn't changed by the act of committing anything. However, now that we
have committed something, we can also learn to use a new command:
`git-diff-index`.

Unlike `git-diff-files`, which showed the difference between the index
file and the working tree, `git-diff-index` shows the differences
between a committed *tree* and either the index file or the working
tree. In other words, `git-diff-index` wants a tree to be diffed
against, and before we did the commit, we couldn't do that, because we
didn't have anything to diff against.

But now we can do

        git-diff-index -p HEAD

(where `-p` has the same meaning as it did in `git-diff-files`), and it
will show us the same difference, but for a totally different reason.
Now we're comparing the working tree not against the index file,
but against the tree we just wrote. It just so happens that those two
are obviously the same, so we get the same result.

Again, because this is a common operation, you can also just shorthand
it with

        git diff HEAD

which ends up doing the above for you.
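
If all you want to know is *which* files each command considers changed, both accept `--name-only`, a flag this tutorial does not otherwise use. The sketch below recreates the situation in a scratch directory (committed `hello`, then an extra line in the working tree only); the directory, variable names, placeholder identity and space-separated command spelling are assumptions for illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q

# Placeholder identity so the commit can be created on any machine.
export GIT_AUTHOR_NAME="A U Thor" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="A U Thor" GIT_COMMITTER_EMAIL="author@example.com"

echo "Hello World" >hello
git update-index --add hello
commit=$(echo 'Initial commit' | git commit-tree "$(git write-tree)")
git update-ref HEAD "$commit"

# Change the working tree only; index and HEAD keep the old contents.
echo "It's a new day for git" >>hello

# Index versus working tree.
index_vs_worktree=$(git diff-files --name-only)
# Committed tree versus working tree.
head_vs_worktree=$(git diff-index --name-only HEAD)

echo "diff-files:  $index_vs_worktree"
echo "diff-index:  $head_vs_worktree"
```

Both report `hello` here, for the two different reasons described above.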

`git-diff-index` normally compares a tree against the
working tree, but when given the `\--cached` flag, it is told to
instead compare against just the index cache contents, and ignore the
current working tree state entirely. Since we just wrote the index
file to HEAD, doing `git-diff-index \--cached -p HEAD` should thus return
an empty set of differences, and that's exactly what it does.

[NOTE]
================
`git-diff-index` really always uses the index for its
comparisons, and saying that it compares a tree against the working
tree is thus not strictly accurate. In particular, the list of
files to compare (the "meta-data") *always* comes from the index file,
regardless of whether the `\--cached` flag is used or not. The `\--cached`
flag really only determines whether the file *contents* to be compared
come from the working tree or not.

This is not hard to understand, as soon as you realize that git simply
never knows (or cares) about files that it is not told about
explicitly. Git will never go *looking* for files to compare, it
expects you to tell it what the files are, and that's what the index
is there for.
================

However, our next step is to commit the *change* we did, and again, to
understand what's going on, keep in mind the difference between "working
tree contents", "index file" and "committed tree". We have changes
in the working tree that we want to commit, and we always have to
work through the index file, so the first thing we need to do is to
update the index cache:

------------------------------------------------
git-update-index hello
------------------------------------------------

(note how we didn't need the `\--add` flag this time, since git knew
about the file already).

Note what happens to the different `git-diff-\*` versions here.
After
we've updated `hello` in the index, `git-diff-files -p` now shows no
differences, but `git-diff-index -p HEAD` still *does* show that the
current state is different from the state we committed. In fact, now
`git-diff-index` shows the same difference whether we use the `\--cached`
flag or not, since now the index is coherent with the working tree.

Now, since we've updated `hello` in the index, we can commit the new
version. We could do it by writing the tree by hand again, and
committing the tree (this time we'd have to use the `-p HEAD` flag to
tell commit that the HEAD was the *parent* of the new commit, and that
this wasn't an initial commit any more), but you've done that once
already, so let's just use the helpful script this time:

------------------------------------------------
git commit
------------------------------------------------

which starts an editor for you to write the commit message and tells you
a bit about what you have done.

Write whatever message you want, and all the lines that start with '#'
will be pruned out, and the rest will be used as the commit message for
the change. If you decide you don't want to commit anything after all at
this point (you can continue to edit things and update the cache), you
can just leave an empty message. Otherwise `git commit` will commit
the change for you.

You've now made your first real git commit. And if you're interested in
looking at what `git commit` really does, feel free to investigate:
it's a few very simple shell scripts to generate the helpful (?) commit
message headers, and a few one-liners that actually do the
commit itself (`git-commit`).


Inspecting Changes
------------------

While creating changes is useful, it's even more useful if you can tell
later what changed. The most useful command for this is another of the
`diff` family, namely `git-diff-tree`.

`git-diff-tree` can be given two arbitrary trees, and it will tell you the
differences between them. Perhaps even more commonly, though, you can
give it just a single commit object, and it will figure out the parent
of that commit itself, and show the difference directly. Thus, to get
the same diff that we've already seen several times, we can now do

        git-diff-tree -p HEAD

(again, `-p` means to show the difference as a human-readable patch),
and it will show what the last commit (in `HEAD`) actually changed.

More interestingly, you can also give `git-diff-tree` the `-v` flag, which
tells it to also show the commit message and author and date of the
commit, and you can tell it to show a whole series of diffs.
Alternatively, you can tell it to be "silent", and not show the diffs at
all, but just show the actual commit message.

In fact, together with the `git-rev-list` program (which generates a
list of revisions), `git-diff-tree` ends up being a veritable fount of
changes. A trivial (but very useful) script called `git-whatchanged` is
included with git which does exactly this, and shows a log of recent
activities.

To see the whole history of our pitiful little git-tutorial project, you
can do

        git log

which shows just the log messages, or if we want to see the log together
with the associated patches use the more complex (and much more
powerful)

        git-whatchanged -p --root

and you will see exactly what has changed in the repository over its
short history.

[NOTE]
The `\--root` flag is a flag to `git-diff-tree` to tell it to
show the initial aka 'root' commit too. Normally you'd probably not
want to see the initial import diff, but since the tutorial project
was started from scratch and is so small, we use it to make the result
a bit more interesting.
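
The pairing of `git-rev-list` and `git-diff-tree` described above can be tried by hand. This is only a rough sketch of the idea, not the actual `git-whatchanged` script: it builds a two-commit history in a scratch directory and pipes the revision list into `git diff-tree --stdin`. The directory, variable names, commit messages, placeholder identity, the `--stdin`, `-r` and `--name-only` flags, and the space-separated command spelling are assumptions chosen for the example.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q

# Placeholder identity so the commits can be created on any machine.
export GIT_AUTHOR_NAME="A U Thor" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="A U Thor" GIT_COMMITTER_EMAIL="author@example.com"

# First commit: no parent.
echo "Hello World" >hello
git update-index --add hello
first=$(echo 'Initial commit' | git commit-tree "$(git write-tree)")
git update-ref HEAD "$first"

# Second commit: parented on the first with -p.
echo "It's a new day for git" >>hello
git update-index hello
second=$(echo 'Add a line' | git commit-tree "$(git write-tree)" -p "$first")
git update-ref HEAD "$second"

# rev-list names the commits, newest first; diff-tree then reports, for
# each commit read from stdin, the files that commit changed.
revisions=$(git rev-list HEAD)
history=$(git rev-list HEAD | git diff-tree --stdin --root -r --name-only)
echo "$history"
```

Swapping `--name-only` for `-p` gives full patches instead of just filenames.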

With that, you should now have some inkling of what git does, and
can explore on your own.

[NOTE]
Most likely, you are not directly using the core
git Plumbing commands, but using Porcelain like Cogito on top
of it. Cogito works a bit differently and you usually do not
have to run `git-update-index` yourself for changed files (you
do tell underlying git about additions and removals via
`cg-add` and `cg-rm` commands). Just before you make a commit
with `cg-commit`, Cogito figures out which files you modified,
and runs `git-update-index` on them for you.


Tagging a version
-----------------

In git, there are two kinds of tags, a "light" one, and an "annotated tag".

A "light" tag is technically nothing more than a branch, except we put
it in the `.git/refs/tags/` subdirectory instead of calling it a `head`.
So the simplest form of tag involves nothing more than

------------------------------------------------
git tag my-first-tag
------------------------------------------------

which just writes the current `HEAD` into the `.git/refs/tags/my-first-tag`
file, after which point you can then use this symbolic name for that
particular state. You can, for example, do

        git diff my-first-tag

to diff your current state against that tag (which at this point will
obviously be an empty diff, but if you continue to develop and commit
stuff, you can use your tag as an "anchor-point" to see what has changed
since you tagged it).

An "annotated tag" is actually a real git object, and contains not only a
pointer to the state you want to tag, but also a small tag name and
message, along with optionally a PGP signature that says that yes,
you really did
that tag.
You create these annotated tags with either the `-a` or
`-s` flag to `git tag`:

        git tag -s <tagname>

which will sign the current `HEAD` (but you can also give it another
argument that specifies the thing to tag, i.e. you could have tagged the
current `mybranch` point by using `git tag <tagname> mybranch`).

You normally only do signed tags for major releases or things
like that, while the light-weight tags are useful for any marking you
want to do -- any time you decide that you want to remember a certain
point, just create a private tag for it, and you have a nice symbolic
name for the state at that point.


Copying repositories
--------------------

Git repositories are normally totally self-sufficient, and it's worth noting
that unlike CVS, for example, there is no separate notion of
"repository" and "working tree". A git repository normally *is* the
working tree, with the local git information hidden in the `.git`
subdirectory. There is nothing else. What you see is what you got.

[NOTE]
You can tell git to split the git internal information from
the directory that it tracks, but we'll ignore that for now: it's not
how normal projects work, and it's really only meant for special uses.
So the mental model of "the git information is always tied directly to
the working tree that it describes" may not be technically 100%
accurate, but it's a good model for all normal use.

This has two implications:

 - if you grow bored with the tutorial repository you created (or you've
   made a mistake and want to start all over), you can just do a simple

        rm -rf git-tutorial
+
and it will be gone. There's no external repository, and there's no
history outside the project you created.

 - if you want to move or duplicate a git repository, you can do so.
   There is a `git clone` command, but if all you want to do is just to
   create a copy of your repository (with all the full history that
   went along with it), you can do so with a regular
   `cp -a git-tutorial new-git-tutorial`.
+
Note that when you've moved or copied a git repository, your git index
file (which caches various information, notably some of the "stat"
information for the files involved) will likely need to be refreshed.
So after you do a `cp -a` to create a new copy, you'll want to do

        git-update-index --refresh
+
in the new repository to make sure that the index file is up-to-date.

Note that the second point is true even across machines. You can
duplicate a remote git repository with *any* regular copy mechanism, be it
`scp`, `rsync` or `wget`.

When copying a remote repository, you'll want to at a minimum update the
index cache when you do this, and especially with other people's
repositories you often want to make sure that the index cache is in some
known state (you don't know *what* they've done and not yet checked in),
so usually you'll precede the `git-update-index` with a `git-read-tree`:

        git-read-tree --reset HEAD
        git-update-index --refresh

which will force a total index re-build from the tree pointed to by `HEAD`.
It resets the index contents to `HEAD`, and then the `git-update-index`
makes sure to match up all index entries with the checked-out files.
If the original repository had uncommitted changes in its
working tree, `git-update-index --refresh` notices them and
tells you they need to be updated.

The above can also be written as simply

        git reset

and in fact a lot of the common git command combinations can be scripted
with the `git xyz` interfaces. You can learn things by just looking
at what the various git scripts do.
For example, `git reset` is the
above two lines implemented in `git-reset`, but some things like
`git status` and `git commit` are slightly more complex scripts around
the basic git commands.

Many (most?) public remote repositories will not contain any of
the checked out files or even an index file, and will *only* contain the
actual core git files. Such a repository usually doesn't even have the
`.git` subdirectory, but has all the git files directly in the
repository.

To create your own local live copy of such a "raw" git repository, you'd
first create your own subdirectory for the project, and then copy the
raw repository contents into the `.git` directory. For example, to
create your own copy of the git repository, you'd do the following

        mkdir my-git
        cd my-git
        rsync -rL rsync://rsync.kernel.org/pub/scm/git/git.git/ .git

followed by

        git-read-tree HEAD

to populate the index. However, now you have populated the index, and
you have all the git internal files, but you will notice that you don't
actually have any of the working tree files to work on. To get
those, you'd check them out with

        git-checkout-index -u -a

where the `-u` flag means that you want the checkout to keep the index
up-to-date (so that you don't have to refresh it afterward), and the
`-a` flag means "check out all files" (if you have a stale copy or an
older version of a checked out tree you may also need to add the `-f`
flag first, to tell `git-checkout-index` to *force* overwriting of any old
files).

Again, this can all be simplified with

        git clone rsync://rsync.kernel.org/pub/scm/git/git.git/ my-git
        cd my-git
        git checkout

which will end up doing all of the above for you.

You have now successfully copied somebody else's (mine) remote
repository, and checked it out.
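
The purely local route from earlier in this section (`cp -a` followed by an index refresh) can be checked end to end in a scratch directory. This is a sketch under assumptions: the `original` and `copy` directory names, the variable names, the placeholder identity, `git rev-parse`, the `--quiet` flag to `git diff-files`, and the space-separated command spelling are all chosen for the example rather than taken from the text above.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Placeholder identity so the commit can be created on any machine.
export GIT_AUTHOR_NAME="A U Thor" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="A U Thor" GIT_COMMITTER_EMAIL="author@example.com"

# A tiny repository with one commit.
mkdir original
cd original
git init -q
echo "Hello World" >hello
git update-index --add hello
commit=$(echo 'Initial commit' | git commit-tree "$(git write-tree)")
git update-ref HEAD "$commit"
cd ..

# A plain recursive copy carries the whole history along.
cp -a original copy
cd copy

# Bring the cached stat information in line with the copied files.
git update-index --refresh

copied_head=$(git rev-parse HEAD)

# diff-files --quiet exits 0 when index and working tree agree.
if git diff-files --quiet; then
    copy_state=clean
else
    copy_state=dirty
fi
echo "copy is at $copied_head and is $copy_state"
```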


Creating a new branch
---------------------

Branches in git are really nothing more than pointers into the git
object database from within the `.git/refs/` subdirectory, and as we
already discussed, the `HEAD` branch is nothing but a symlink to one of
these object pointers.

You can at any time create a new branch by just picking an arbitrary
point in the project history, and just writing the SHA1 name of that
object into a file under `.git/refs/heads/`. You can use any filename you
want (and indeed, subdirectories), but the convention is that the
"normal" branch is called `master`. That's just a convention, though,
and nothing enforces it.

To show that as an example, let's go back to the git-tutorial repository we
used earlier, and create a branch in it. You do that by simply just
saying that you want to check out a new branch:

------------
git checkout -b mybranch
------------

will create a new branch based at the current `HEAD` position, and switch
to it.

[NOTE]
================================================
If you make the decision to start your new branch at some
other point in the history than the current `HEAD`, you can do so by
just telling `git checkout` what the base of the checkout would be.
In other words, if you have an earlier tag or branch, you'd just do

------------
git checkout -b mybranch earlier-commit
------------

and it would create the new branch `mybranch` at the earlier commit,
and check out the state at that time.
================================================

You can always just jump back to your original `master` branch by doing

------------
git checkout master
------------

(or any other branch-name, for that matter) and if you forget which
branch you happen to be on, a simple

------------
ls -l .git/HEAD
------------

will tell you where it's pointing (note that on platforms with bad or no
symlink support, you have to execute

------------
cat .git/HEAD
------------

instead). To get the list of branches you have, you can say

------------
git branch
------------

which is nothing more than a simple script around `ls .git/refs/heads`.
There will be an asterisk in front of the branch you are currently on.

Sometimes you may wish to create a new branch _without_ actually
checking it out and switching to it. If so, just use the command

------------
git branch <branchname> [startingpoint]
------------

which will simply _create_ the branch, but will not do anything further.
You can then later -- once you decide that you want to actually develop
on that branch -- switch to that branch with a regular `git checkout`
with the branchname as the argument.


Merging two branches
--------------------

One of the ideas of having a branch is that you do some (possibly
experimental) work in it, and eventually merge it back to the main
branch. So assuming you created the above `mybranch` that started out
being the same as the original `master` branch, let's make sure we're in
that branch, and do some work there.

------------------------------------------------
git checkout mybranch
echo "Work, work, work" >>hello
git commit -m 'Some work.' hello
------------------------------------------------

Here, we just added another line to `hello`, and we used a shorthand for
doing both `git-update-index hello` and `git commit` by just giving the
filename directly to `git commit`. The `-m` flag is to give the
commit log message from the command line.

Now, to make it a bit more interesting, let's assume that somebody else
does some work in the original branch, and simulate that by going back
to the master branch, and editing the same file differently there:

------------
git checkout master
------------

Here, take a moment to look at the contents of `hello`, and notice how they
don't contain the work we just did in `mybranch` -- because that work
hasn't happened in the `master` branch at all. Then do

------------
echo "Play, play, play" >>hello
echo "Lots of fun" >>example
git commit -m 'Some fun.' hello example
------------

since the master branch is obviously in a much better mood.

Now, you've got two branches, and you decide that you want to merge the
work done. Before we do that, let's introduce a cool graphical tool that
helps you view what's going on:

        gitk --all

will show you graphically both of your branches (that's what the `\--all`
means: normally it will just show you your current `HEAD`) and their
histories. You can also see exactly how they came to be from a common
source.

Anyway, let's exit `gitk` (`^Q` or the File menu), and decide that we want
to merge the work we did on the `mybranch` branch into the `master`
branch (which is currently our `HEAD` too).
To do that, there's a nice
script called `git resolve`, which wants to know which branches you want
to resolve and what the merge is all about:

------------
git resolve HEAD mybranch "Merge work in mybranch"
------------

where the third argument is going to be used as the commit message if
the merge can be resolved automatically.

Now, in this case we've intentionally created a situation where the
merge will need to be fixed up by hand, though, so git will do as much
of it as it can automatically (which in this case is just merge the `example`
file, which had no differences in the `mybranch` branch), and say:

    Simple merge failed, trying Automatic merge
    Auto-merging hello.
    merge: warning: conflicts during merge
    ERROR: Merge conflict in hello.
    fatal: merge program failed
    Automatic merge failed, fix up by hand

which is way too verbose, but it basically tells you that it failed the
really trivial merge ("Simple merge") and did an "Automatic merge"
instead, but that too failed due to conflicts in `hello`.

Not to worry. It left the (trivial) conflict in `hello` in the same form you
should already be well used to if you've ever used CVS, so let's just
open `hello` in our editor (whatever that may be), and fix it up somehow.
I'd suggest just making it so that `hello` contains all four lines:

------------
Hello World
It's a new day for git
Play, play, play
Work, work, work
------------

and once you're happy with your manual merge, just do a

------------
git commit hello
------------

which will very loudly warn you that you're now committing a merge
(which is correct, so never mind), and you can write a small merge
message about your adventures in git-merge-land.

After you're done, start up `gitk \--all` to see graphically what the
history looks like.
Notice that `mybranch` still exists, and you can
switch to it, and continue to work with it if you want to. The
`mybranch` branch will not contain the merge, but next time you merge it
from the `master` branch, git will know how you merged it, so you'll not
have to do _that_ merge again.

Another useful tool, especially if you do not always work in an X-Window
environment, is `git show-branch`.

------------------------------------------------
$ git show-branch master mybranch
* [master] Merged "mybranch" changes.
 ! [mybranch] Some work.
--
+  [master] Merged "mybranch" changes.
+  [master~1] Some fun.
++ [mybranch] Some work.
------------------------------------------------

The first two lines indicate that it is showing the two branches
and the first line of the commit log message from their
top-of-the-tree commits. You are currently on the `master` branch
(notice the asterisk `*` character). In the later output lines,
the first column shows commits contained in the
`master` branch, and the second column those in the `mybranch`
branch. Three commits are shown along with their log messages.
All of them have plus `+` characters in the first column, which
means they are now part of the `master` branch. Only the "Some
work" commit has the plus `+` character in the second column,
because `mybranch` has not been merged to incorporate these
commits from the master branch. The string inside brackets
before the commit log message is a short name you can use to
name the commit. In the above example, 'master' and 'mybranch'
are branch heads. 'master~1' is the first parent of the 'master'
branch head. Please see the 'git-rev-parse' documentation if you
see more complex cases.

Now, let's pretend you are the one who did all the work in
`mybranch`, and the fruit of your hard work has finally been merged
to the `master` branch.
Let's go back to `mybranch`, and run
resolve to get the "upstream changes" back to your branch.

------------
git checkout mybranch
git resolve HEAD master "Merge upstream changes."
------------

This outputs something like this (the actual commit object names
would be different)

    Updating from ae3a2da... to a80b4aa....
     example |    1 +
     hello   |    1 +
     2 files changed, 2 insertions(+), 0 deletions(-)

Because your branch did not contain anything more than what is
already merged into the `master` branch, the resolve operation did
not actually do a merge. Instead, it just updated the top of
the tree of your branch to that of the `master` branch. This is
often called a 'fast forward' merge.

You can run `gitk \--all` again to see what the commit ancestry
looks like, or run `show-branch`, which tells you this.

------------------------------------------------
$ git show-branch master mybranch
! [master] Merged "mybranch" changes.
 * [mybranch] Merged "mybranch" changes.
--
++ [master] Merged "mybranch" changes.
------------------------------------------------


Merging external work
---------------------

It's usually much more common that you merge with somebody else than
merging with your own branches, so it's worth pointing out that git
makes that very easy too, and in fact, it's not that different from
doing a `git resolve`. In fact, a remote merge ends up being nothing
more than "fetch the work from a remote repository into a temporary tag"
followed by a `git resolve`.
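
To make that two-step shape concrete, here is a rough sketch of how it
might look with a present-day git. This is an assumption about a newer
installation, not the commands this tutorial was written against:
`git merge` stands in for `git resolve`, `.git/FETCH_HEAD` plays the part
of the "temporary tag", and the `upstream` and `mine` repositories (and
the committer identity) are throwaway names invented for the demonstration.

```shell
#!/bin/sh
# Sketch only. Assumes a present-day git, where `git merge` stands in for
# the `git resolve` script used in this tutorial. The repository names
# "upstream" and "mine" and the identity below are made up for the demo.
set -e

GIT_AUTHOR_NAME=Tutorial GIT_AUTHOR_EMAIL=tutorial@example.com
GIT_COMMITTER_NAME=Tutorial GIT_COMMITTER_EMAIL=tutorial@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# Work in a scratch directory so nothing real is touched.
cd "$(mktemp -d)"

# A stand-in for the remote repository, holding one commit.
git init -q upstream
(cd upstream && echo "Hello World" >hello && git add hello &&
    git commit -q -m 'Initial commit.')

# Our own repository starts out as a copy of it ...
git clone -q upstream mine

# ... and then somebody else does more work on the "remote" side.
(cd upstream && echo "Play, play, play" >>hello &&
    git commit -q -m 'Some fun.' hello)

cd mine
# Step 1: fetch. The fetched head is recorded in .git/FETCH_HEAD.
git fetch -q ../upstream
# Step 2: merge what was just fetched into the current branch.
git merge -q FETCH_HEAD

# The working tree now carries the remote work as well.
cat hello
```

After the second step, `hello` in `mine` contains both lines, and because
`mine` had no work of its own, the merge is a plain fast forward.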

Fetching from a remote repository is done by, unsurprisingly,
`git fetch`:

    git fetch <remote-repository>

One of the following transports can be used to name the
repository to download from:

Rsync::
	`rsync://remote.machine/path/to/repo.git/`
+
Rsync transport is usable for both uploading and downloading,
but is completely unaware of what git does, and can produce
unexpected results when you download from the public repository
while the repository owner is uploading into it via `rsync`
transport. Most notably, it could update the files under
`refs/`, which hold the object names of the topmost commits,
before uploading the files in `objects/` -- the downloader would
obtain the head commit object name while that object itself is still
not available in the repository. For this reason, it is
considered deprecated.

SSH::
	`remote.machine:/path/to/repo.git/` or
+
`ssh://remote.machine/path/to/repo.git/`
+
This transport can be used for both uploading and downloading,
and requires you to have a log-in privilege over `ssh` to the
remote machine. It finds out the set of objects the other side
lacks by exchanging the head commits both ends have and
transfers (close to) the minimum set of objects. It is by far the
most efficient way to exchange git objects between repositories.

Local directory::
	`/path/to/repo.git/`
+
This transport is the same as SSH transport but uses `sh` to run
both ends on the local machine instead of running the other end on
the remote machine via `ssh`.

GIT Native::
	`git://remote.machine/path/to/repo.git/`
+
This transport was designed for anonymous downloading. Like SSH
transport, it finds out the set of objects the downstream side
lacks and transfers (close to) the minimum set of objects.

HTTP(s)::
	`http://remote.machine/path/to/repo.git/`
+
HTTP and HTTPS transport are used only for downloading.
They
first obtain the topmost commit object name from the remote site
by looking at the `repo.git/info/refs` file, then try to obtain the
commit object by downloading from `repo.git/objects/xx/xxx\...`
using the object name of that commit object. They then read the
commit object to find out its parent commits and the associated
tree object, and repeat this process until they have all the
necessary objects. Because of this behaviour, they are
sometimes also called 'commit walkers'.
+
The 'commit walkers' are sometimes also called 'dumb
transports', because they do not require any GIT aware smart
server like GIT Native transport does. Any stock HTTP server
would suffice.
+
There are (confusingly enough) `git-ssh-fetch` and `git-ssh-upload`
programs, which are 'commit walkers'; they outlived their
usefulness when GIT Native and SSH transports were introduced,
and are not used by the `git pull` or `git push` scripts.

Once you fetch from the remote repository, you `resolve` that
with your current branch.

However -- it's such a common thing to `fetch` and then
immediately `resolve`, that it's called `git pull`, and you can
simply do

    git pull <remote-repository>

and optionally give a branch-name for the remote end as a second
argument.

[NOTE]
You could do without using any branches at all, by
keeping as many local repositories as you would like to have
branches, and merging between them with `git pull`, just like
you merge between branches. The advantage of this approach is
that it lets you keep a set of files for each `branch` checked
out and you may find it easier to switch back and forth if you
juggle multiple lines of development simultaneously. Of
course, you will pay the price of more disk usage to hold
multiple working trees, but disk space is cheap these days.

[NOTE]
You could even pull from your own repository by
giving '.' as the <remote-repository> parameter to `git pull`.

It is likely that you will be pulling from the same remote
repository from time to time. As a shorthand, you can store
the remote repository URL in a file under the `.git/remotes/`
directory, like this:

------------------------------------------------
mkdir -p .git/remotes/
cat >.git/remotes/linus <<\EOF
URL: http://www.kernel.org/pub/scm/git/git.git/
EOF
------------------------------------------------

and use the filename to `git pull` instead of the full URL.
The URL specified in such a file can even be a prefix
of a full URL, like this:

------------------------------------------------
cat >.git/remotes/jgarzik <<\EOF
URL: http://www.kernel.org/pub/scm/linux/git/jgarzik/
EOF
------------------------------------------------


Examples.

. `git pull linus`
. `git pull linus tag v0.99.1`
. `git pull jgarzik/netdev-2.6.git/ e100`

the above are equivalent to:

. `git pull http://www.kernel.org/pub/scm/git/git.git/ HEAD`
. `git pull http://www.kernel.org/pub/scm/git/git.git/ tag v0.99.1`
. `git pull http://www.kernel.org/pub/.../jgarzik/netdev-2.6.git e100`


Publishing your work
--------------------

So we can use somebody else's work from a remote repository; but
how can *you* prepare a repository to let other people pull from
it?

You do your real work in your working tree that has your
primary repository hanging under it as its `.git` subdirectory.
You *could* make that repository accessible remotely and ask
people to pull from it, but in practice that is not the way
things are usually done. A recommended way is to have a public
repository, make it reachable by other people, and when the
changes you made in your primary working tree are in good shape,
update the public repository from it.
This is often called
'pushing'.

[NOTE]
This public repository could further be mirrored, and that is
how git repositories at `kernel.org` are managed.

Publishing the changes from your local (private) repository to
your remote (public) repository requires a write privilege on
the remote machine. You need to have an SSH account there to
run a single command, `git-receive-pack`.

First, you need to create an empty repository on the remote
machine that will house your public repository. This empty
repository will be populated and be kept up-to-date by pushing
into it later. Obviously, this repository creation needs to be
done only once.

[NOTE]
`git push` uses a pair of programs,
`git-send-pack` on your local machine, and `git-receive-pack`
on the remote machine. The communication between the two over
the network internally uses an SSH connection.

Your private repository's GIT directory is usually `.git`, but
your public repository is often named after the project name,
i.e. `<project>.git`. Let's create such a public repository for
project `my-git`. After logging into the remote machine, create
an empty directory:

------------
mkdir my-git.git
------------

Then, make that directory into a GIT repository by running
`git init-db`, but this time, since its name is not the usual
`.git`, we do things slightly differently:

------------
GIT_DIR=my-git.git git-init-db
------------

Make sure this directory is available for others you want your
changes to be pulled by via the transport of your choice. Also
you need to make sure that you have the `git-receive-pack`
program on the `$PATH`.

[NOTE]
Many installations of sshd do not invoke your shell as the login
shell when you directly run programs; what this means is that if
your login shell is `bash`, only `.bashrc` is read and not
`.bash_profile`.
As a workaround, make sure `.bashrc` sets up
`$PATH` so that you can run the `git-receive-pack` program.

[NOTE]
If you plan to publish this repository to be accessed over http,
you should do `chmod +x my-git.git/hooks/post-update` at this
point. This makes sure that every time you push into this
repository, `git-update-server-info` is run.

Your "public repository" is now ready to accept your changes.
Come back to the machine that has your private repository. From
there, run this command:

------------
git push <public-host>:/path/to/my-git.git master
------------

This synchronizes your public repository to match the named
branch head (i.e. `master` in this case) and the objects reachable
from it in your current repository.

As a real example, this is how I update my public git
repository. The Kernel.org mirror network takes care of the
propagation to other publicly visible machines:

------------
git push master.kernel.org:/pub/scm/git/git.git/
------------


Packing your repository
-----------------------

Earlier, we saw that one file under the `.git/objects/??/` directory
is stored for each git object you create. This representation
is efficient to create atomically and safely, but
not so convenient to transport over the network. Since git objects are
immutable once they are created, there is a way to optimize the
storage by "packing them together". The command

------------
git repack
------------

will do it for you. If you followed the tutorial examples, you
would have accumulated about 17 objects in `.git/objects/??/`
directories by now. `git repack` tells you how many objects it
packed, and stores the packed file in the `.git/objects/pack`
directory.

[NOTE]
You will see two files, `pack-\*.pack` and `pack-\*.idx`,
in the `.git/objects/pack` directory.
They are closely related to
each other, and if you ever copy them by hand to a different
repository for whatever reason, you should make sure you copy
them together. The former holds all the data from the objects
in the pack, and the latter holds the index for random
access.

If you are paranoid, running `git-verify-pack` command would
detect if you have a corrupt pack, but do not worry too much.
Our programs are always perfect ;-).

Once you have packed objects, you do not need to leave the
unpacked objects that are contained in the pack file anymore.

------------
git prune-packed
------------

would remove them for you.

You can try running `find .git/objects -type f` before and after
you run `git prune-packed` if you are curious. Also
`git count-objects` would tell you how many unpacked objects are in
your repository and how much space they are consuming.

[NOTE]
`git pull` is slightly cumbersome for HTTP transport, as a
packed repository may contain relatively few objects in a
relatively large pack. If you expect many HTTP pulls from your
public repository you might want to repack & prune often, or
never.

If you run `git repack` again at this point, it will say
"Nothing to pack". Once you continue your development and
accumulate the changes, running `git repack` again will create a
new pack, that contains objects created since you packed your
repository the last time.
We recommend that you pack your project
soon after the initial import (unless you are starting your
project from scratch), and then run `git repack` every once in a
while, depending on how active your project is.

When a repository is synchronized via `git push` and `git pull`,
objects packed in the source repository are usually stored
unpacked in the destination, unless rsync transport is used.
While this allows you to use different packing strategies on
both ends, it also means you may need to repack both
repositories every once in a while.


Working with Others
-------------------

Although git is a truly distributed system, it is often
convenient to organize your project with an informal hierarchy
of developers. Linux kernel development is run this way. There
is a nice illustration (page 17, "Merges to Mainline") in Randy
Dunlap's presentation (`http://tinyurl.com/a2jdg`).

It should be stressed that this hierarchy is purely *informal*.
There is nothing fundamental in git that enforces the "chain of
patch flow" this hierarchy implies. You do not have to pull
from only one remote repository.

A recommended workflow for a "project lead" goes like this:

1. Prepare your primary repository on your local machine. Your
   work is done there.

2. Prepare a public repository accessible to others.
+
If other people are pulling from your repository over dumb
transport protocols, you need to keep this repository 'dumb
transport friendly'. After `git init-db`,
`$GIT_DIR/hooks/post-update` copied from the standard templates
would contain a call to `git-update-server-info` but the
`post-update` hook itself is disabled by default -- enable it
with `chmod +x post-update`.

3. Push into the public repository from your primary
   repository.

4. `git repack` the public repository. This establishes a big
   pack that contains the initial set of objects as the
   baseline. Possibly also run `git prune`, if the transport
   used for pulling from your repository supports packed
   repositories.

5. Keep working in your primary repository. Your changes
   include modifications of your own, patches you receive via
   e-mails, and merges resulting from pulling the "public"
   repositories of your "subsystem maintainers".
+
You can repack this private repository whenever you feel like.

6. Push your changes to the public repository, and announce it
   to the public.

7. Every once in a while, `git repack` the public repository.
   Go back to step 5. and continue working.


A recommended work cycle for a "subsystem maintainer" who works
on that project and has their own "public repository" goes like this:

1. Prepare your work repository, by running `git clone` on the public
   repository of the "project lead". The URL used for the
   initial cloning is stored in `.git/remotes/origin`.

2. Prepare a public repository accessible to others, just like
   the "project lead" person does.

3. Copy over the packed files from the "project lead" public
   repository to your public repository.

4. Push into the public repository from your primary
   repository. Run `git repack`, and possibly `git prune` if the
   transport used for pulling from your repository supports
   packed repositories.

5. Keep working in your primary repository. Your changes
   include modifications of your own, patches you receive via
   e-mails, and merges resulting from pulling the "public"
   repositories of your "project lead" and possibly your
   "sub-subsystem maintainers".
+
You can repack this private repository whenever you feel
like.

6. Push your changes to your public repository, and ask your
   "project lead" and possibly your "sub-subsystem
   maintainers" to pull from it.

7. Every once in a while, `git repack` the public repository.
   Go back to step 5. and continue working.


A recommended work cycle for an "individual developer" who does
not have a "public" repository is somewhat different. It goes
like this:

1. Prepare your work repository, by running `git clone` on the public
   repository of the "project lead" (or a "subsystem
   maintainer", if you work on a subsystem). The URL used for
   the initial cloning is stored in `.git/remotes/origin`.

2. Do your work in your repository on the 'master' branch.

3. Run `git fetch origin` from the public repository of your
   upstream every once in a while. This does only the first
   half of `git pull` but does not merge. The head of the
   public repository is stored in `.git/refs/heads/origin`.

4. Use `git cherry origin` to see which ones of your patches
   were accepted, and/or use `git rebase origin` to port your
   unmerged changes forward to the updated upstream.

5. Use `git format-patch origin` to prepare patches for e-mail
   submission to your upstream and send them out. Go back to
   step 2. and continue.


Working with Others, Shared Repository Style
--------------------------------------------

If you are coming from a CVS background, the style of cooperation
suggested in the previous section may be new to you. You do not
have to worry. git supports the "shared public repository" style of
cooperation you are probably more familiar with as well.

For this, set up a public repository on a machine that is
reachable via SSH by people with "commit privileges".
Put the
committers in the same user group and make the repository
writable by that group.

You, as an individual committer, then:

- First clone the shared repository to a local repository:
------------------------------------------------
$ git clone repo.shared.xz:/pub/scm/project.git/ my-project
$ cd my-project
$ hack away
------------------------------------------------

- Merge the work others might have done while you were hacking
  away:
------------------------------------------------
$ git pull origin
$ test the merge result
------------------------------------------------
[NOTE]
================================
The first `git clone` would have placed the following in
`my-project/.git/remotes/origin` file, and that's why this and
the next step work.
------------
URL: repo.shared.xz:/pub/scm/project.git/
Pull: master:origin
------------
================================

- Push your work as the new head of the shared
  repository.
------------------------------------------------
$ git push origin master
------------------------------------------------
If somebody else pushed into the same shared repository while
you were working locally, `git push` in the last step would
complain, telling you that the remote `master` head does not
fast forward. When that happens, you need to pull and merge those
other changes before you push your work.


Bundling your work together
---------------------------

It is likely that you will be working on more than one thing at
a time. It is easy to manage those more-or-less independent tasks
using branches with git.

We have already seen how branches work in a previous example,
with the "fun and work" example using two branches. The idea is the
same if there are more than two branches.
Let's say you started
out from "master" head, and have some new code in the "master"
branch, and two independent fixes in the "commit-fix" and
"diff-fix" branches:

------------
$ git show-branch
! [commit-fix] Fix commit message normalization.
 ! [diff-fix] Fix rename detection.
  * [master] Release candidate #1
---
 +  [diff-fix] Fix rename detection.
 +  [diff-fix~1] Better common substring algorithm.
+   [commit-fix] Fix commit message normalization.
  + [master] Release candidate #1
+++ [diff-fix~2] Pretty-print messages.
------------

Both fixes are tested well, and at this point, you want to merge
in both of them. You could merge in 'diff-fix' first and then
'commit-fix' next, like this:

------------
$ git resolve master diff-fix 'Merge fix in diff-fix'
$ git resolve master commit-fix 'Merge fix in commit-fix'
------------

Which would result in:

------------
$ git show-branch
! [commit-fix] Fix commit message normalization.
 ! [diff-fix] Fix rename detection.
  * [master] Merge fix in commit-fix
---
  + [master] Merge fix in commit-fix
+ + [commit-fix] Fix commit message normalization.
  + [master~1] Merge fix in diff-fix
 ++ [diff-fix] Fix rename detection.
 ++ [diff-fix~1] Better common substring algorithm.
  + [master~2] Release candidate #1
+++ [master~3] Pretty-print messages.
------------

However, there is no particular reason to merge in one branch
first and the other next, when what you have are a set of truly
independent changes (if the order mattered, then they are not
independent by definition). You could instead merge those two
branches into the current branch at once. First let's undo what
we just did and start over.
We would want to get the master
branch back to where it was before these two merges by resetting it
to 'master~2':

------------
$ git reset --hard master~2
------------

You can make sure 'git show-branch' matches the state before
those two 'git resolve' commands you just ran. Then, instead of running
two 'git resolve' commands in a row, you would pull these two
branch heads (this is known as 'making an Octopus'):

------------
$ git pull . commit-fix diff-fix
$ git show-branch
! [commit-fix] Fix commit message normalization.
 ! [diff-fix] Fix rename detection.
  * [master] Octopus merge of branches 'diff-fix' and 'commit-fix'
---
  + [master] Octopus merge of branches 'diff-fix' and 'commit-fix'
+ + [commit-fix] Fix commit message normalization.
 ++ [diff-fix] Fix rename detection.
 ++ [diff-fix~1] Better common substring algorithm.
  + [master~1] Release candidate #1
+++ [master~2] Pretty-print messages.
------------

Note that you should not do an Octopus just because you can. An octopus
is a valid thing to do and often makes it easier to view the
commit history if you are pulling more than two independent
changes at the same time. However, if you have merge conflicts
with any of the branches you are merging in and need to hand
resolve, that is an indication that the development that happened in
those branches was not independent after all, and you should
merge two at a time, documenting how you resolved the conflicts,
and the reason why you preferred changes made in one side over
the other. Otherwise it would make the project history harder
to follow, not easier.

[ to be continued.. cvsimports ]