A short git tutorial
====================
v0.99.5, Aug 2005

Introduction
------------

This is trying to be a short tutorial on setting up and using a git
repository, mainly because being hands-on and using explicit examples is
often the best way of explaining what is going on.

In normal life, most people wouldn't use the "core" git programs
directly, but rather script around them to make them more palatable.
Understanding the core git stuff may help some people get those scripts
done, though, and it may also be instructive in helping people
understand what it is that the higher-level helper scripts are actually
doing.

The core git is often called "plumbing", with the prettier user
interfaces on top of it called "porcelain". You may not want to use the
plumbing directly very often, but it can be good to know what the
plumbing does for when the porcelain isn't flushing...


Creating a git repository
-------------------------

Creating a new git repository couldn't be easier: all git repositories start
out empty, and the only thing you need to do is find yourself a
subdirectory that you want to use as a working tree - either an empty
one for a totally new project, or an existing working tree that you want
to import into git.

For our first example, we're going to start a totally new repository from
scratch, with no pre-existing files, and we'll call it `git-tutorial`.
To start up, create a subdirectory for it, change into that
subdirectory, and initialize the git infrastructure with `git-init-db`:

------------------------------------------------
mkdir git-tutorial
cd git-tutorial
git-init-db
------------------------------------------------

to which git will reply

	defaulting to local storage area

which is just git's way of saying that you haven't been doing anything
strange, and that it will have created a local `.git` directory setup for
your new project.
You will now have a `.git` directory, and you can
inspect that with `ls`. For your new empty project, it should show you
three entries, among other things:

 - a symlink called `HEAD`, pointing to `refs/heads/master`
+
Don't worry about the fact that the file that the `HEAD` link points to
doesn't even exist yet -- you haven't created the commit that will
start your `HEAD` development branch yet.

 - a subdirectory called `objects`, which will contain all the
   objects of your project. You should never have any real reason to
   look at the objects directly, but you might want to know that these
   objects are what contain all the real 'data' in your repository.

 - a subdirectory called `refs`, which contains references to objects.

In particular, the `refs` subdirectory will contain two other
subdirectories, named `heads` and `tags` respectively. They do
exactly what their names imply: they contain references to any number
of different 'heads' of development (aka 'branches'), and to any
'tags' that you have created to name specific versions in your
repository.

One note: the special `master` head is the default branch, which is
why the `.git/HEAD` file was created as a symlink to it even if it
doesn't yet exist. Basically, the `HEAD` link is supposed to always
point to the branch you are working on right now, and you always
start out expecting to work on the `master` branch.

However, this is only a convention, and you can name your branches
anything you want, and don't have to ever even 'have' a `master`
branch. A number of the git tools will assume that `.git/HEAD` is
valid, though.

[NOTE]
An 'object' is identified by its 160-bit SHA1 hash, aka 'object name',
and a reference to an object is always the 40-byte hex
representation of that SHA1 name.
The files in the `refs`
subdirectory are expected to contain these hex references
(usually with a final `\'\n\'` at the end), and you should thus
expect to see a number of 41-byte files containing these
references in these `refs` subdirectories when you actually start
populating your tree.

You have now created your first git repository. Of course, since it's
empty, that's not very useful, so let's start populating it with data.


Populating a git repository
---------------------------

We'll keep this simple and stupid, so we'll start off with populating a
few trivial files just to get a feel for it.

Start off with just creating any random files that you want to maintain
in your git repository. We'll start off with a few bad examples, just to
get a feel for how this works:

------------------------------------------------
echo "Hello World" >hello
echo "Silly example" >example
------------------------------------------------

you have now created two files in your working tree (aka 'working directory'), but to
actually check in your hard work, you will have to go through two steps:

 - fill in the 'index' file (aka 'cache') with the information about your
   working tree state.

 - commit that index file as an object.

The first step is trivial: when you want to tell git about any changes
to your working tree, you use the `git-update-cache` program. That
program normally just takes a list of filenames you want to update, but
to avoid trivial mistakes, it refuses to add new entries to the cache
(or remove existing ones) unless you explicitly tell it that you're
adding a new entry with the `\--add` flag (or removing an entry with the
`\--remove` flag).
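
As an aside on the object names mentioned in the note above, the
following is a minimal sketch of how such a name can be derived for a
file's contents without going through git at all. It rests on an
assumption that this tutorial does not spell out: that the name is the
SHA1 of a short header (`blob`, the size in bytes, and a NUL byte)
followed by the contents. `blob_name` is a hypothetical helper, not a git
command, and `sha1sum` is assumed to be installed.

```shell
#!/bin/sh
# Sketch only: derive the object name for a file's contents.
# Assumption (not stated in this tutorial): the name is the SHA1 of
# "blob <size-in-bytes>", a NUL byte, and then the file contents.
# blob_name is a hypothetical helper, not part of git.
blob_name() {
    size=$(wc -c <"$1" | tr -d ' ')
    { printf 'blob %s\000' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

tmp=$(mktemp)
echo "Hello World" >"$tmp"
blob_name "$tmp"    # prints 557db03de997c86a4a028e1ebd3a1ceb225be238
rm -f "$tmp"
```

The output is 40 hex digits plus a newline, which is where the 41-byte
size of the reference files comes from.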

So to populate the index with the two files you just created, you can do

------------------------------------------------
git-update-cache --add hello example
------------------------------------------------

and you have now told git to track those two files.

In fact, as you did that, if you now look into your object directory,
you'll notice that git will have added two new objects to the object
database. If you did exactly the steps above, you should now be able to do

	ls .git/objects/??/*

and see two files:

	.git/objects/55/7db03de997c86a4a028e1ebd3a1ceb225be238
	.git/objects/f2/4c74a2e500f5ee1332c86b94199f52b1d1d962

which correspond to the objects with names of 557db... and f24c7...
respectively.

If you want to, you can use `git-cat-file` to look at those objects, but
you'll have to use the object name, not the filename of the object:

	git-cat-file -t 557db03de997c86a4a028e1ebd3a1ceb225be238

where the `-t` tells `git-cat-file` to tell you what the "type" of the
object is. Git will tell you that you have a "blob" object (i.e. just a
regular file), and you can see the contents with

	git-cat-file "blob" 557db03

which will print out "Hello World". The object 557db03 is nothing
more than the contents of your file `hello`.

[NOTE]
Don't confuse that object with the file `hello` itself. The
object is literally just those specific *contents* of the file, and
however much you later change the contents in file `hello`, the object
we just looked at will never change. Objects are immutable.

[NOTE]
The second example demonstrates that you can
abbreviate the object name to only the first several
hexadecimal digits in most places.

Anyway, as we mentioned previously, you normally never actually take a
look at the objects themselves, and typing long 40-character hex
names is not something you'd normally want to do.
The above digression
was just to show that `git-update-cache` did something magical, and
actually saved away the contents of your files into the git object
database.

Updating the cache did something else too: it created a `.git/index`
file. This is the index that describes your current working tree, and
something you should be very aware of. Again, you normally never worry
about the index file itself, but you should be aware of the fact that
you have not actually really "checked in" your files into git so far,
you've only *told* git about them.

However, since git knows about them, you can now start using some of the
most basic git commands to manipulate the files or look at their status.

In particular, let's not even check in the two files into git yet, we'll
start off by adding another line to `hello` first:

------------------------------------------------
echo "It's a new day for git" >>hello
------------------------------------------------

and you can now, since you told git about the previous state of `hello`, ask
git what has changed in the tree compared to your old index, using the
`git-diff-files` command:

------------
git-diff-files
------------

Oops. That wasn't very readable. It just spit out its own internal
version of a `diff`, but that internal version really just tells you
that it has noticed that "hello" has been modified, and that the old object
contents it had have been replaced with something else.

To make it readable, we can tell git-diff-files to output the
differences as a patch, using the `-p` flag:

------------
git-diff-files -p
------------

which will spit out

------------
diff --git a/hello b/hello
--- a/hello
+++ b/hello
@@ -1 +1,2 @@
 Hello World
+It's a new day for git
------------

i.e. the diff of the change we caused by adding another line to `hello`.
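
As a rough analogy that needs no git at all, the same hunk can be
produced by keeping a copy of the file and comparing it with plain
`diff -u` after appending the line. In this sketch the saved copy merely
stands in for the state recorded in the index; it is not how the index
stores anything. The names `show_change` and `hello.indexed` are
hypothetical.

```shell
#!/bin/sh
# Analogy only: a saved copy stands in for the state recorded in the
# index. show_change and hello.indexed are hypothetical names.
show_change() {
    work=$(mktemp -d) || return 1
    (
        cd "$work" || exit 1
        echo "Hello World" >hello
        cp hello hello.indexed                  # remember the current contents
        echo "It's a new day for git" >>hello   # keep editing the working copy
        # diff exits with status 1 when the files differ; expected here
        diff -u hello.indexed hello || true
    )
    rm -rf "$work"
}

show_change
```

Apart from the two header lines, which name the files being compared, the
hunk should match the one shown above.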

In other words, `git-diff-files` always shows us the difference between
what is recorded in the index, and what is currently in the working
tree. That's very useful.

A common shorthand for `git-diff-files -p` is to just write `git
diff`, which will do the same thing.


Committing git state
--------------------

Now, we want to go to the next stage in git, which is to take the files
that git knows about in the index, and commit them as a real tree. We do
that in two phases: creating a 'tree' object, and committing that 'tree'
object as a 'commit' object together with an explanation of what the
tree was all about, along with information of how we came to that state.

Creating a tree object is trivial, and is done with `git-write-tree`.
There are no options or other input: git-write-tree will take the
current index state, and write an object that describes that whole
index. In other words, we're now tying together all the different
filenames with their contents (and their permissions), and we're
creating the equivalent of a git "directory" object:

------------------------------------------------
git-write-tree
------------------------------------------------

and this will just output the name of the resulting tree, in this case
(if you have done exactly as I've described) it should be

	8988da15d077d4829fc51d8544c097def6644dbb

which is another incomprehensible object name. Again, if you want to,
you can use `git-cat-file -t 8988d\...` to see that this time the object
is not a "blob" object, but a "tree" object (you can also use
`git-cat-file` to actually output the raw object contents, but you'll see
mainly a binary mess, so that's less interesting).

However -- normally you'd never use `git-write-tree` on its own, because
normally you always commit a tree into a commit object using the
`git-commit-tree` command.
In fact, it's easier to not actually use
`git-write-tree` on its own at all, but to just pass its result in as an
argument to `git-commit-tree`.

`git-commit-tree` normally takes several arguments -- it wants to know
what the 'parent' of a commit was, but since this is the first commit
ever in this new repository, and it has no parents, we only need to pass in
the object name of the tree. However, `git-commit-tree`
also wants to get a commit message
on its standard input, and it will write out the resulting object name for the
commit to its standard output.

And this is where we start using the `.git/HEAD` file. The `HEAD` file is
supposed to contain the reference to the top-of-tree, and since that's
exactly what `git-commit-tree` spits out, we can do this all with a simple
shell pipeline:

------------------------------------------------
echo "Initial commit" | git-commit-tree $(git-write-tree) > .git/HEAD
------------------------------------------------

which will say:

	Committing initial tree 8988da15d077d4829fc51d8544c097def6644dbb

just to warn you about the fact that it created a totally new commit
that is not related to anything else. Normally you do this only *once*
for a project ever, and all later commits will be parented on top of an
earlier commit, and you'll never see this "Committing initial tree"
message ever again.

Again, normally you'd never actually do this by hand. There is a
helpful script called `git commit` that will do all of this for you. So
you could have just written `git commit`
instead, and it would have done the above magic scripting for you.


Making a change
---------------

Remember how we did the `git-update-cache` on file `hello` and then we
changed `hello` afterward, and could compare the new state of `hello` with the
state we saved in the index file?

Further, remember how I said that `git-write-tree` writes the contents
of the *index* file to the tree, and thus what we just committed was in
fact the *original* contents of the file `hello`, not the new ones. We did
that on purpose, to show the difference between the index state, and the
state in the working tree, and how they don't have to match, even
when we commit things.

As before, if we do `git-diff-files -p` in our git-tutorial project,
we'll still see the same difference we saw last time: the index file
hasn't changed by the act of committing anything. However, now that we
have committed something, we can also learn to use a new command:
`git-diff-cache`.

Unlike `git-diff-files`, which showed the difference between the index
file and the working tree, `git-diff-cache` shows the differences
between a committed *tree* and either the index file or the working
tree. In other words, `git-diff-cache` wants a tree to be diffed
against, and before we did the commit, we couldn't do that, because we
didn't have anything to diff against.

But now we can do

	git-diff-cache -p HEAD

(where `-p` has the same meaning as it did in `git-diff-files`), and it
will show us the same difference, but for a totally different reason.
Now we're comparing the working tree not against the index file,
but against the tree we just wrote. It just so happens that those two
are obviously the same, so we get the same result.

Again, because this is a common operation, you can also just shorthand
it with

	git diff HEAD

which ends up doing the above for you.

In other words, `git-diff-cache` normally compares a tree against the
working tree, but when given the `\--cached` flag, it is told to
instead compare against just the index cache contents, and ignore the
current working tree state entirely.
Since we just wrote the index
file to HEAD, doing `git-diff-cache \--cached -p HEAD` should thus return
an empty set of differences, and that's exactly what it does.

[NOTE]
================
`git-diff-cache` really always uses the index for its
comparisons, and saying that it compares a tree against the working
tree is thus not strictly accurate. In particular, the list of
files to compare (the "meta-data") *always* comes from the index file,
regardless of whether the `\--cached` flag is used or not. The `\--cached`
flag really only determines whether the file *contents* to be compared
come from the working tree or not.

This is not hard to understand, as soon as you realize that git simply
never knows (or cares) about files that it is not told about
explicitly. Git will never go *looking* for files to compare, it
expects you to tell it what the files are, and that's what the index
is there for.
================

However, our next step is to commit the *change* we did, and again, to
understand what's going on, keep in mind the difference between "working
tree contents", "index file" and "committed tree". We have changes
in the working tree that we want to commit, and we always have to
work through the index file, so the first thing we need to do is to
update the index cache:

------------------------------------------------
git-update-cache hello
------------------------------------------------

(note how we didn't need the `\--add` flag this time, since git knew
about the file already).

Note what happens to the different `git-diff-\*` versions here. After
we've updated `hello` in the index, `git-diff-files -p` now shows no
differences, but `git-diff-cache -p HEAD` still *does* show that the
current state is different from the state we committed.
In fact, now
`git-diff-cache` shows the same difference whether we use the `--cached`
flag or not, since now the index is coherent with the working tree.

Now, since we've updated `hello` in the index, we can commit the new
version. We could do it by writing the tree by hand again, and
committing the tree (this time we'd have to use the `-p HEAD` flag to
tell commit that the HEAD was the *parent* of the new commit, and that
this wasn't an initial commit any more), but you've done that once
already, so let's just use the helpful script this time:

------------------------------------------------
git commit
------------------------------------------------

which starts an editor for you to write the commit message and tells you
a bit about what you have done.

Write whatever message you want, and all the lines that start with '#'
will be pruned out, and the rest will be used as the commit message for
the change. If you decide you don't want to commit anything after all at
this point (you can continue to edit things and update the cache), you
can just leave an empty message. Otherwise `git commit` will commit
the change for you.

You've now made your first real git commit. And if you're interested in
looking at what `git commit` really does, feel free to investigate:
it's a few very simple shell scripts to generate the helpful (?) commit
message headers, and a few one-liners that actually do the
commit itself (`git-commit-script`).


Checking it out
---------------

While creating changes is useful, it's even more useful if you can tell
later what changed. The most useful command for this is another of the
`diff` family, namely `git-diff-tree`.

`git-diff-tree` can be given two arbitrary trees, and it will tell you the
differences between them.
Perhaps even more commonly, though, you can
give it just a single commit object, and it will figure out the parent
of that commit itself, and show the difference directly. Thus, to get
the same diff that we've already seen several times, we can now do

	git-diff-tree -p HEAD

(again, `-p` means to show the difference as a human-readable patch),
and it will show what the last commit (in `HEAD`) actually changed.

More interestingly, you can also give `git-diff-tree` the `-v` flag, which
tells it to also show the commit message and author and date of the
commit, and you can tell it to show a whole series of diffs.
Alternatively, you can tell it to be "silent", and not show the diffs at
all, but just show the actual commit message.

In fact, together with the `git-rev-list` program (which generates a
list of revisions), `git-diff-tree` ends up being a veritable fount of
changes. A trivial (but very useful) script called `git-whatchanged` is
included with git which does exactly this, and shows a log of recent
activities.

To see the whole history of our pitiful little git-tutorial project, you
can do

	git log

which shows just the log messages, or if we want to see the log together
with the associated patches use the more complex (and much more
powerful)

	git-whatchanged -p --root

and you will see exactly what has changed in the repository over its
short history.

[NOTE]
The `\--root` flag is a flag to `git-diff-tree` to tell it to
show the initial aka 'root' commit too. Normally you'd probably not
want to see the initial import diff, but since the tutorial project
was started from scratch and is so small, we use it to make the result
a bit more interesting.

With that, you should now have some inkling of what git does, and
can explore on your own.

[NOTE]
Most likely, you are not directly using the core
git Plumbing commands, but using Porcelain like Cogito on top
of it. Cogito works a bit differently and you usually do not
have to run `git-update-cache` yourself for changed files (you
do tell underlying git about additions and removals via
`cg-add` and `cg-rm` commands). Just before you make a commit
with `cg-commit`, Cogito figures out which files you modified,
and runs `git-update-cache` on them for you.


Tagging a version
-----------------

In git, there are two kinds of tags, a "light" one, and an "annotated tag".

A "light" tag is technically nothing more than a branch, except we put
it in the `.git/refs/tags/` subdirectory instead of calling it a `head`.
So the simplest form of tag involves nothing more than

------------------------------------------------
git tag my-first-tag
------------------------------------------------

which just writes the current `HEAD` into the `.git/refs/tags/my-first-tag`
file, after which point you can then use this symbolic name for that
particular state. You can, for example, do

	git diff my-first-tag

to diff your current state against that tag (which at this point will
obviously be an empty diff, but if you continue to develop and commit
stuff, you can use your tag as an "anchor-point" to see what has changed
since you tagged it).

An "annotated tag" is actually a real git object, and contains not only a
pointer to the state you want to tag, but also a small tag name and
message, along with optionally a PGP signature that says that yes,
you really did create
that tag.
You create these annotated tags with either the `-a` or
`-s` flag to `git tag`:

	git tag -s <tagname>

which will sign the current `HEAD` (but you can also give it another
argument that specifies the thing to tag, i.e. you could have tagged the
current `mybranch` point by using `git tag <tagname> mybranch`).

You normally only do signed tags for major releases or things
like that, while the light-weight tags are useful for any marking you
want to do -- any time you decide that you want to remember a certain
point, just create a private tag for it, and you have a nice symbolic
name for the state at that point.


Copying repositories
--------------------

Git repositories are normally totally self-sufficient, and it's worth noting
that unlike CVS, for example, there is no separate notion of
"repository" and "working tree". A git repository normally *is* the
working tree, with the local git information hidden in the `.git`
subdirectory. There is nothing else. What you see is what you got.

[NOTE]
You can tell git to split the git internal information from
the directory that it tracks, but we'll ignore that for now: it's not
how normal projects work, and it's really only meant for special uses.
So the mental model of "the git information is always tied directly to
the working tree that it describes" may not be technically 100%
accurate, but it's a good model for all normal use.

This has two implications:

 - if you grow bored with the tutorial repository you created (or you've
   made a mistake and want to start all over), you can just do a simple

	rm -rf git-tutorial
+
and it will be gone. There's no external repository, and there's no
history outside the project you created.

 - if you want to move or duplicate a git repository, you can do so.
   There is a `git clone` command, but if all you want to do is just to
   create a copy of your repository (with all the full history that
   went along with it), you can do so with a regular
   `cp -a git-tutorial new-git-tutorial`.
+
Note that when you've moved or copied a git repository, your git index
file (which caches various information, notably some of the "stat"
information for the files involved) will likely need to be refreshed.
So after you do a `cp -a` to create a new copy, you'll want to do

	git-update-cache --refresh
+
in the new repository to make sure that the index file is up-to-date.

Note that the second point is true even across machines. You can
duplicate a remote git repository with *any* regular copy mechanism, be it
`scp`, `rsync` or `wget`.

When copying a remote repository, you'll want to at a minimum update the
index cache when you do this, and especially with other people's
repositories you often want to make sure that the index cache is in some
known state (you don't know *what* they've done and not yet checked in),
so usually you'll precede the `git-update-cache` with a

	git-read-tree --reset HEAD
	git-update-cache --refresh

which will force a total index re-build from the tree pointed to by `HEAD`.
It resets the index contents to `HEAD`, and then the `git-update-cache`
makes sure to match up all index entries with the checked-out files.
If the original repository had uncommitted changes in its
working tree, `git-update-cache --refresh` notices them and
tells you they need to be updated.

The above can also be written as simply

	git reset

and in fact a lot of the common git command combinations can be scripted
with the `git xyz` interfaces, and you can learn things by just looking
at what the `git-*-script` scripts do (`git reset` is the above two lines
implemented in `git-reset-script`, but some things like `git status` and
`git commit` are slightly more complex scripts around the basic git
commands).

Many (most?) public remote repositories will not contain any of
the checked out files or even an index file, and will *only* contain the
actual core git files. Such a repository usually doesn't even have the
`.git` subdirectory, but has all the git files directly in the
repository.

To create your own local live copy of such a "raw" git repository, you'd
first create your own subdirectory for the project, and then copy the
raw repository contents into the `.git` directory. For example, to
create your own copy of the git repository, you'd do the following

	mkdir my-git
	cd my-git
	rsync -rL rsync://rsync.kernel.org/pub/scm/git/git.git/ .git

followed by

	git-read-tree HEAD

to populate the index. However, now you have populated the index, and
you have all the git internal files, but you will notice that you don't
actually have any of the working tree files to work on. To get
those, you'd check them out with

	git-checkout-cache -u -a

where the `-u` flag means that you want the checkout to keep the index
up-to-date (so that you don't have to refresh it afterward), and the
`-a` flag means "check out all files" (if you have a stale copy or an
older version of a checked out tree you may also need to add the `-f`
flag first, to tell git-checkout-cache to *force* overwriting of any old
files).

Again, this can all be simplified with

	git clone rsync://rsync.kernel.org/pub/scm/git/git.git/ my-git
	cd my-git
	git checkout

which will end up doing all of the above for you.

You have now successfully copied somebody else's (mine) remote
repository, and checked it out.


Creating a new branch
---------------------

Branches in git are really nothing more than pointers into the git
object database from within the `.git/refs/` subdirectory, and as we
already discussed, the `HEAD` branch is nothing but a symlink to one of
these object pointers.

You can at any time create a new branch by just picking an arbitrary
point in the project history, and just writing the SHA1 name of that
object into a file under `.git/refs/heads/`. You can use any filename you
want (and indeed, subdirectories), but the convention is that the
"normal" branch is called `master`. That's just a convention, though,
and nothing enforces it.

To show that as an example, let's go back to the git-tutorial repository we
used earlier, and create a branch in it. You do that by simply just
saying that you want to check out a new branch:

------------
git checkout -b mybranch
------------

will create a new branch based at the current `HEAD` position, and switch
to it.

[NOTE]
================================================
If you make the decision to start your new branch at some
other point in the history than the current `HEAD`, you can do so by
just telling `git checkout` what the base of the checkout would be.
In other words, if you have an earlier tag or branch, you'd just do

	git checkout -b mybranch earlier-commit

and it would create the new branch `mybranch` at the earlier commit,
and check out the state at that time.
694================================================ 695 696You can always just jump back to your original `master` branch by doing 697 698 git checkout master 699 700(or any other branch-name, for that matter) and if you forget which 701branch you happen to be on, a simple 702 703 ls -l .git/HEAD 704 705will tell you where it's pointing. To get the list of branches 706you have, you can say 707 708 git branch 709 710which is nothing more than a simple script around `ls .git/refs/heads`. 711There will be asterisk in front of the branch you are currently on. 712 713Sometimes you may wish to create a new branch _without_ actually 714checking it out and switching to it. If so, just use the command 715 716 git branch <branchname> [startingpoint] 717 718which will simply _create_ the branch, but will not do anything further. 719You can then later -- once you decide that you want to actually develop 720on that branch -- switch to that branch with a regular `git checkout` 721with the branchname as the argument. 722 723 724Merging two branches 725-------------------- 726 727One of the ideas of having a branch is that you do some (possibly 728experimental) work in it, and eventually merge it back to the main 729branch. So assuming you created the above `mybranch` that started out 730being the same as the original `master` branch, let's make sure we're in 731that branch, and do some work there. 732 733------------------------------------------------ 734git checkout mybranch 735echo "Work, work, work" >>hello 736git commit -m 'Some work.' hello 737------------------------------------------------ 738 739Here, we just added another line to `hello`, and we used a shorthand for 740both going a `git-update-cache hello` and `git commit` by just giving the 741filename directly to `git commit`. The `-m` flag is to give the 742commit log message from the command line. 

Now, to make it a bit more interesting, let's assume that somebody else
does some work in the original branch, and simulate that by going back
to the master branch, and editing the same file differently there:

------------
git checkout master
------------

Here, take a moment to look at the contents of `hello`, and notice how they
don't contain the work we just did in `mybranch` -- because that work
hasn't happened in the `master` branch at all. Then do

------------
echo "Play, play, play" >>hello
echo "Lots of fun" >>example
git commit -m 'Some fun.' hello example
------------

since the master branch is obviously in a much better mood.

Now, you've got two branches, and you decide that you want to merge the
work done. Before we do that, let's introduce a cool graphical tool that
helps you view what's going on:

        gitk --all

will show you graphically both of your branches (that's what the `\--all`
means: normally it will just show you your current `HEAD`) and their
histories. You can also see exactly how they came to be from a common
source.

Anyway, let's exit `gitk` (`^Q` or the File menu), and decide that we want
to merge the work we did on the `mybranch` branch into the `master`
branch (which is currently our `HEAD` too). To do that, there's a nice
script called `git resolve`, which wants to know which branches you want
to resolve and what the merge is all about:

------------
git resolve HEAD mybranch "Merge work in mybranch"
------------

where the third argument is going to be used as the commit message if
the merge can be resolved automatically.

Now, in this case we've intentionally created a situation where the
merge will need to be fixed up by hand, though, so git will do as much
of it as it can automatically (which in this case is just merging the
`example` file, which had no differences in the `mybranch` branch), and say:

        Simple merge failed, trying Automatic merge
        Auto-merging hello.
        merge: warning: conflicts during merge
        ERROR: Merge conflict in hello.
        fatal: merge program failed
        Automatic merge failed, fix up by hand

which is way too verbose, but it basically tells you that it failed the
really trivial merge ("Simple merge") and did an "Automatic merge"
instead, but that too failed due to conflicts in `hello`.

Not to worry. It left the (trivial) conflict in `hello` in the same form you
should already be well used to if you've ever used CVS, so let's just
open `hello` in our editor (whatever that may be), and fix it up somehow.
I'd suggest just making it so that `hello` contains all four lines:

------------
Hello World
It's a new day for git
Play, play, play
Work, work, work
------------

and once you're happy with your manual merge, just do a

------------
git commit hello
------------

which will very loudly warn you that you're now committing a merge
(which is correct, so never mind), and you can write a small merge
message about your adventures in git-merge-land.

After you're done, start up `gitk --all` to see graphically what the
history looks like. Notice that `mybranch` still exists, and you can
switch to it, and continue to work with it if you want to. The
`mybranch` branch will not contain the merge, but next time you merge it
from the `master` branch, git will know how you merged it, so you'll not
have to do _that_ merge again.

Another useful tool, especially if you do not always work in an X Window
environment, is `git show-branch`.

------------------------------------------------
$ git show-branch master mybranch
* [master] Merged "mybranch" changes.
 ! [mybranch] Some work.
--
+  [master] Merged "mybranch" changes.
+  [master~1] Some fun.
++ [mybranch] Some work.
------------------------------------------------

The first two lines indicate that it is showing the two branches
and the first line of the commit log message from their
top-of-the-tree commits. You are currently on the `master` branch
(notice the asterisk `*` character). In the later output lines, the
first column shows commits contained in the `master` branch, and the
second column shows commits contained in the `mybranch` branch.
Three commits are shown along with their log messages.
All of them have plus `+` characters in the first column, which
means they are now part of the `master` branch. Only the "Some
work" commit has the plus `+` character in the second column,
because `mybranch` has not been merged to incorporate the other
two commits from the master branch.

Now, let's pretend you are the one who did all the work in
`mybranch`, and the fruit of your hard work has finally been merged
to the `master` branch. Let's go back to `mybranch`, and run
resolve to get the "upstream changes" back to your branch.

        git checkout mybranch
        git resolve HEAD master "Merge upstream changes."

This outputs something like this (the actual commit object names
would be different)

        Updating from ae3a2da... to a80b4aa....
         example |    1 +
         hello   |    1 +
         2 files changed, 2 insertions(+), 0 deletions(-)

Because your branch did not contain anything more than what was
already merged into the `master` branch, the resolve operation did
not actually do a merge. Instead, it just updated the top of
the tree of your branch to that of the `master` branch. This is
often called a 'fast forward' merge.
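
If you want to see what a fast forward amounts to at the level of the
branch pointers, here is a rough sketch. It does not touch a real
repository: it mimics the `.git/refs/heads/` layout in a scratch
directory, and the two object names are made-up placeholders. It also
ignores the index and working tree, which the real `git resolve` updates
as well.

```shell
# Illustration only: a scratch directory standing in for a repository.
demo=$(mktemp -d)
mkdir -p "$demo/.git/refs/heads"

# Hypothetical 40-character object names (not real commits).
old_commit=$(printf '%040d' 1)     # where mybranch pointed before
merge_commit=$(printf '%040d' 2)   # the merge commit at the top of master

echo "$merge_commit" >"$demo/.git/refs/heads/master"
echo "$old_commit" >"$demo/.git/refs/heads/mybranch"

# The fast forward: no new commit is made, the branch head is simply
# moved to the commit that master already points at.
cat "$demo/.git/refs/heads/master" >"$demo/.git/refs/heads/mybranch"

if [ "$(cat "$demo/.git/refs/heads/mybranch")" = "$(cat "$demo/.git/refs/heads/master")" ]
then
        echo "mybranch now points at the same commit as master"
fi
```

The point of the sketch is only that both heads end up naming the same
commit, which is what the second `show-branch` run reports.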

You can run `gitk --all` again to see what the commit ancestry
looks like, or run `show-branch`, which tells you this.

------------------------------------------------
$ git show-branch master mybranch
! [master] Merged "mybranch" changes.
 * [mybranch] Merged "mybranch" changes.
--
++ [master] Merged "mybranch" changes.
------------------------------------------------


Merging external work
---------------------

It's usually much more common that you merge with somebody else than
merging with your own branches, so it's worth pointing out that git
makes that very easy too, and in fact, it's not that different from
doing a `git resolve`. Indeed, a remote merge ends up being nothing
more than "fetch the work from a remote repository into a temporary tag"
followed by a `git resolve`.

Fetching from a remote repository is done by, unsurprisingly,
`git fetch`:

        git fetch <remote-repository>

One of the following transports can be used to name the
repository to download from:

Rsync::
        `rsync://remote.machine/path/to/repo.git/`
+
Rsync transport is usable for both uploading and downloading,
but is completely unaware of what git does, and can produce
unexpected results when you download from the public repository
while the repository owner is uploading into it via `rsync`
transport. Most notably, it could update the files under
`refs/`, which hold the object names of the topmost commits,
before uploading the files in `objects/` -- the downloader would
obtain the head commit object name while that object itself is still
not available in the repository. For this reason, it is
considered deprecated.

SSH::
        `remote.machine:/path/to/repo.git/` or
+
`ssh://remote.machine/path/to/repo.git/`
+
This transport can be used for both uploading and downloading,
and requires you to have a log-in privilege over `ssh` to the
remote machine. It finds out the set of objects the other side
lacks by exchanging the head commits both ends have and
transfers (close to) the minimum set of objects. It is by far the
most efficient way to exchange git objects between repositories.

Local directory::
        `/path/to/repo.git/`
+
This transport is the same as SSH transport but uses `sh` to run
both ends on the local machine instead of running the other end on
the remote machine via `ssh`.

GIT Native::
        `git://remote.machine/path/to/repo.git/`
+
This transport was designed for anonymous downloading. Like SSH
transport, it finds out the set of objects the downstream side
lacks and transfers (close to) the minimum set of objects.

HTTP(s)::
        `http://remote.machine/path/to/repo.git/`
+
HTTP and HTTPS transport are used only for downloading. They
first obtain the topmost commit object name from the remote site
by looking at the `repo.git/info/refs` file, then try to obtain the
commit object by downloading from `repo.git/objects/xx/xxx\...`
using the object name of that commit object. Then they read the
commit object to find out its parent commits and the associated
tree object; they repeat this process until they get all the
necessary objects. Because of this behaviour, they are
sometimes also called 'commit walkers'.
+
The 'commit walkers' are sometimes also called 'dumb
transports', because they do not require any GIT aware smart
server like GIT Native transport does. Any stock HTTP server
would suffice.
+
There are (confusingly enough) `git-ssh-pull` and `git-ssh-push`
programs, which are 'commit walkers'; they outlived their
usefulness when GIT Native and SSH transports were introduced,
and are not used by `git pull` or `git push` scripts.

Once you fetch from the remote repository, you `resolve` that
with your current branch.

However -- it's such a common thing to `fetch` and then
immediately `resolve`, that it's called `git pull`, and you can
simply do

        git pull <remote-repository>

and optionally give a branch-name for the remote end as a second
argument.

[NOTE]
You could do without using any branches at all, by
keeping as many local repositories as you would like to have
branches, and merging between them with `git pull`, just like
you merge between branches. The advantage of this approach is
that it lets you keep a set of files for each `branch` checked
out and you may find it easier to switch back and forth if you
juggle multiple lines of development simultaneously. Of
course, you will pay the price of more disk usage to hold
multiple working trees, but disk space is cheap these days.

[NOTE]
You could even pull from your own repository by
giving '.' as the <remote-repository> parameter to `git pull`.

It is likely that you will be pulling from the same remote
repository from time to time.
As a shorthand, you can store
the remote repository URL in a file under the `.git/remotes/`
directory, like this:

------------------------------------------------
mkdir -p .git/remotes/
cat >.git/remotes/linus <<\EOF
URL: http://www.kernel.org/pub/scm/git/git.git/
EOF
------------------------------------------------

and use the filename to `git pull` instead of the full URL.
The URL specified in such a file can even be a prefix
of a full URL, like this:

------------------------------------------------
cat >.git/remotes/jgarzik <<\EOF
URL: http://www.kernel.org/pub/scm/linux/git/jgarzik/
EOF
------------------------------------------------


Examples.

. `git pull linus`
. `git pull linus tag v0.99.1`
. `git pull jgarzik/netdev-2.6.git/ e100`

The above are equivalent to:

. `git pull http://www.kernel.org/pub/scm/git/git.git/ HEAD`
. `git pull http://www.kernel.org/pub/scm/git/git.git/ tag v0.99.1`
. `git pull http://www.kernel.org/pub/.../jgarzik/netdev-2.6.git e100`


Publishing your work
--------------------

So we can use somebody else's work from a remote repository; but
how can *you* prepare a repository to let other people pull from
it?

You do your real work in your working tree that has your
primary repository hanging under it as its `.git` subdirectory.
You *could* make that repository accessible remotely and ask
people to pull from it, but in practice that is not the way
things are usually done. A recommended way is to have a public
repository, make it reachable by other people, and when the
changes you made in your primary working tree are in good shape,
update the public repository from it.
This is often called
'pushing'.

[NOTE]
This public repository could further be mirrored, and that is
how git repositories at `kernel.org` are managed.

Publishing the changes from your local (private) repository to
your remote (public) repository requires a write privilege on
the remote machine. You need to have an SSH account there to
run a single command, `git-receive-pack`.

First, you need to create an empty repository on the remote
machine that will house your public repository. This empty
repository will be populated and be kept up-to-date by pushing
into it later. Obviously, this repository creation needs to be
done only once.

[NOTE]
`git push` uses a pair of programs,
`git-send-pack` on your local machine, and `git-receive-pack`
on the remote machine. The communication between the two over
the network internally uses an SSH connection.

Your private repository's GIT directory is usually `.git`, but
your public repository is often named after the project name,
i.e. `<project>.git`. Let's create such a public repository for
project `my-git`. After logging into the remote machine, create
an empty directory:

        mkdir my-git.git

Then, make that directory into a GIT repository by running
`git init-db`, but this time, since its name is not the usual
`.git`, we do things slightly differently:

        GIT_DIR=my-git.git git-init-db

Make sure this directory is available, via the transport of your
choice, to the people you want to pull your changes. Also
you need to make sure that you have the `git-receive-pack`
program on the `$PATH`.

[NOTE]
Many installations of sshd do not invoke your shell as the login
shell when you directly run programs; what this means is that if
your login shell is `bash`, only `.bashrc` is read and not
`.bash_profile`.
As a workaround, make sure `.bashrc` sets up
`$PATH` so that you can run the `git-receive-pack` program.

Your "public repository" is now ready to accept your changes.
Come back to the machine where you have your private repository. From
there, run this command:

        git push <public-host>:/path/to/my-git.git master

This synchronizes your public repository to match the named
branch head (i.e. `master` in this case) and the objects reachable
from it in your current repository.

As a real example, this is how I update my public git
repository. The kernel.org mirror network takes care of the
propagation to other publicly visible machines:

        git push master.kernel.org:/pub/scm/git/git.git/


Packing your repository
-----------------------

Earlier, we saw that one file under the `.git/objects/??/` directory
is stored for each git object you create. This representation
is efficient to create atomically and safely, but
not so convenient to transport over the network. Since git objects are
immutable once they are created, there is a way to optimize the
storage by "packing them together". The command

        git repack

will do it for you. If you followed the tutorial examples, you
would have accumulated about 17 objects in `.git/objects/??/`
directories by now. `git repack` tells you how many objects it
packed, and stores the packed file in the `.git/objects/pack`
directory.

[NOTE]
You will see two files, `pack-\*.pack` and `pack-\*.idx`,
in the `.git/objects/pack` directory. They are closely related to
each other, and if you ever copy them by hand to a different
repository for whatever reason, you should make sure you copy
them together.
The former holds all the data from the objects
in the pack, and the latter holds the index for random
access.

If you are paranoid, running the `git-verify-pack` command would
detect if you have a corrupt pack, but do not worry too much.
Our programs are always perfect ;-).

Once you have packed objects, you do not need to leave the
unpacked objects that are contained in the pack file anymore.

        git prune-packed

would remove them for you.

You can try running `find .git/objects -type f` before and after
you run `git prune-packed` if you are curious. Also
`git count-objects` would tell you how many unpacked objects are in
your repository and how much space they are consuming.

[NOTE]
`git pull` is slightly cumbersome for HTTP transport, as a
packed repository may contain relatively few objects in a
relatively large pack. If you expect many HTTP pulls from your
public repository you might want to repack & prune often, or
never.

If you run `git repack` again at this point, it will say
"Nothing to pack". Once you continue your development and
accumulate the changes, running `git repack` again will create a
new pack that contains objects created since you packed your
repository the last time.
We recommend that you pack your project
soon after the initial import (unless you are starting your
project from scratch), and then run `git repack` every once in a
while, depending on how active your project is.

When a repository is synchronized via `git push` and `git pull`,
objects packed in the source repository are usually stored
unpacked in the destination, unless rsync transport is used.
While this allows you to use different packing strategies on
both ends, it also means you may need to repack both
repositories every once in a while.


Working with Others
-------------------

Although git is a truly distributed system, it is often
convenient to organize your project with an informal hierarchy
of developers. Linux kernel development is run this way. There
is a nice illustration (page 17, "Merges to Mainline") in Randy
Dunlap's presentation (`http://tinyurl.com/a2jdg`).

It should be stressed that this hierarchy is purely *informal*.
There is nothing fundamental in git that enforces the "chain of
patch flow" this hierarchy implies. You do not have to pull
from only one remote repository.

A recommended workflow for a "project lead" goes like this:

1. Prepare your primary repository on your local machine. Your
   work is done there.

2. Prepare a public repository accessible to others.
+
If other people are pulling from your repository over dumb
transport protocols, you need to keep this repository 'dumb
transport friendly'. After `git init-db`,
`$GIT_DIR/hooks/post-update` copied from the standard templates
would contain a call to `git-update-server-info` but the
`post-update` hook itself is disabled by default -- enable it
with `chmod +x post-update`.

3. Push into the public repository from your primary
   repository.

4. `git repack` the public repository.
   This establishes a big
   pack that contains the initial set of objects as the
   baseline. Possibly also run `git prune` if the transport
   used for pulling from your repository supports packed
   repositories.

5. Keep working in your primary repository. Your changes
   include modifications of your own, patches you receive via
   e-mails, and merges resulting from pulling the "public"
   repositories of your "subsystem maintainers".
+
You can repack this private repository whenever you feel like.

6. Push your changes to the public repository, and announce it
   to the public.

7. Every once in a while, `git repack` the public repository.
   Go back to step 5. and continue working.


A recommended work cycle for a "subsystem maintainer" who works
on that project and has their own "public repository" goes like this:

1. Prepare your work repository, by running `git clone` on the public
   repository of the "project lead". The URL used for the
   initial cloning is stored in `.git/remotes/origin`.

2. Prepare a public repository accessible to others, just like
   the "project lead" person does.

3. Copy over the packed files from the "project lead" public
   repository to your public repository.

4. Push into the public repository from your primary
   repository. Run `git repack`, and possibly `git prune` if the
   transport used for pulling from your repository supports
   packed repositories.

5. Keep working in your primary repository. Your changes
   include modifications of your own, patches you receive via
   e-mails, and merges resulting from pulling the "public"
   repositories of your "project lead" and possibly your
   "sub-subsystem maintainers".
+
You can repack this private repository whenever you feel
like.

6. Push your changes to your public repository, and ask your
   "project lead" and possibly your "sub-subsystem
   maintainers" to pull from it.

7. Every once in a while, `git repack` the public repository.
   Go back to step 5. and continue working.


A recommended work cycle for an "individual developer" who does
not have a "public" repository is somewhat different. It goes
like this:

1. Prepare your work repository, by running `git clone` on the public
   repository of the "project lead" (or a "subsystem
   maintainer", if you work on a subsystem). The URL used for
   the initial cloning is stored in `.git/remotes/origin`.

2. Do your work in your repository on the 'master' branch.

3. Run `git fetch origin` to fetch from the public repository of your
   upstream every once in a while. This does only the first
   half of `git pull` but does not merge. The head of the
   public repository is stored in `.git/refs/heads/origin`.

4. Use `git cherry origin` to see which ones of your patches
   were accepted, and/or use `git rebase origin` to port your
   unmerged changes forward to the updated upstream.

5. Use `git format-patch origin` to prepare patches for e-mail
   submission to your upstream and send them out. Go back to
   step 2. and continue.


Working with Others, Shared Repository Style
--------------------------------------------

If you are coming from a CVS background, the style of cooperation
suggested in the previous section may be new to you. You do not
have to worry. git also supports the "shared public repository" style
of cooperation you are probably more familiar with.

For this, set up a public repository on a machine that is
reachable via SSH by people with "commit privileges".
Put the
committers in the same user group and make the repository
writable by that group.

You, as an individual committer, then:

- First clone the shared repository to a local repository:
------------------------------------------------
$ git clone repo.shared.xz:/pub/scm/project.git/ my-project
$ cd my-project
$ hack away
------------------------------------------------

- Merge the work others might have done while you were hacking
  away:
------------------------------------------------
$ git pull origin
$ test the merge result
------------------------------------------------
[NOTE]
================================
The first `git clone` would have placed the following in the
`my-project/.git/remotes/origin` file, and that's why this and
the next step work.
------------
URL: repo.shared.xz:/pub/scm/project.git/
Pull: master:origin
------------
================================

- Push your work as the new head of the shared
  repository.
------------------------------------------------
$ git push origin master
------------------------------------------------
If somebody else pushed into the same shared repository while
you were working locally, `git push` in the last step would
complain, telling you that the remote `master` head does not
fast forward. When that happens, you need to pull and merge those
other changes back before you push your work.


[ to be continued.. cvsimports ]