A short git tutorial
====================
v0.99.5, Aug 2005

Introduction
------------

This is trying to be a short tutorial on setting up and using a git
repository, mainly because being hands-on and using explicit examples is
often the best way of explaining what is going on.

In normal life, most people wouldn't use the "core" git programs
directly, but rather script around them to make them more palatable.
Understanding the core git stuff may help some people get those scripts
done, though, and it may also be instructive in helping people
understand what it is that the higher-level helper scripts are actually
doing.

The core git is often called "plumbing", with the prettier user
interfaces on top of it called "porcelain". You may not want to use the
plumbing directly very often, but it can be good to know what the
plumbing does for when the porcelain isn't flushing...


Creating a git repository
-------------------------

Creating a new git repository couldn't be easier: all git repositories start
out empty, and the only thing you need to do is find yourself a
subdirectory that you want to use as a working tree - either an empty
one for a totally new project, or an existing working tree that you want
to import into git.

For our first example, we're going to start a totally new repository from
scratch, with no pre-existing files, and we'll call it `git-tutorial`.
To start up, create a subdirectory for it, change into that
subdirectory, and initialize the git infrastructure with `git-init-db`:

------------------------------------------------
mkdir git-tutorial
cd git-tutorial
git-init-db
------------------------------------------------

to which git will reply

    defaulting to local storage area

which is just git's way of saying that you haven't been doing anything
strange, and that it will have created a local .git directory setup for
your new project.
You will now have a `.git` directory, and you can
inspect that with `ls`. For your new empty project, it should show you
three entries, among other things:

 - a symlink called `HEAD`, pointing to `refs/heads/master`
+
Don't worry about the fact that the file that the `HEAD` link points to
doesn't even exist yet - you haven't created the commit that will
start your `HEAD` development branch yet.

 - a subdirectory called `objects`, which will contain all the
   objects of your project. You should never have any real reason to
   look at the objects directly, but you might want to know that these
   objects are what contains all the real 'data' in your repository.

 - a subdirectory called `refs`, which contains references to objects.

In particular, the `refs` subdirectory will contain two other
subdirectories, named `heads` and `tags` respectively. They do
exactly what their names imply: they contain references to any number
of different 'heads' of development (aka 'branches'), and to any
'tags' that you have created to name specific versions in your
repository.

One note: the special `master` head is the default branch, which is
why the `.git/HEAD` file was created as a symlink to it even if it
doesn't yet exist. Basically, the `HEAD` link is supposed to always
point to the branch you are working on right now, and you always
start out expecting to work on the `master` branch.

However, this is only a convention, and you can name your branches
anything you want, and don't have to ever even 'have' a `master`
branch. A number of the git tools will assume that `.git/HEAD` is
valid, though.

[NOTE]
An "object" is identified by its 160-bit SHA1 hash, aka "name",
and a reference to an object is always the 40-byte hex
representation of that SHA1 name.
The files in the "refs"
subdirectory are expected to contain these hex references
(usually with a final '\n' at the end), and you should thus
expect to see a number of 41-byte files containing these
references in these refs subdirectories when you actually start
populating your tree.

You have now created your first git repository. Of course, since it's
empty, that's not very useful, so let's start populating it with data.


Populating a git repository
---------------------------

We'll keep this simple and stupid, so we'll start off with populating a
few trivial files just to get a feel for it.

Start off with just creating any random files that you want to maintain
in your git repository. We'll start off with a few bad examples, just to
get a feel for how this works:

------------------------------------------------
echo "Hello World" >hello
echo "Silly example" >example
------------------------------------------------

You have now created two files in your working tree (aka "working directory"), but to
actually check in your hard work, you will have to go through two steps:

 - fill in the "index" file (aka "cache") with the information about your
   working tree state.

 - commit that index file as an object.

The first step is trivial: when you want to tell git about any changes
to your working tree, you use the `git-update-cache` program. That
program normally just takes a list of filenames you want to update, but
to avoid trivial mistakes, it refuses to add new entries to the cache
(or remove existing ones) unless you explicitly tell it that you're
adding a new entry with the `--add` flag (or removing an entry with the
`--remove` flag).
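
That refusal can be tried out safely in a throwaway directory. The
following is only a sketch: it assumes a later git release, in which
`git-init-db` is also available as `git init` and `git-update-cache` was
renamed to `git update-index`. The scratch directory and the file name
`newfile` are made up for illustration.

```shell
# Work in a scratch repository so git-tutorial is left untouched.
scratch=$(mktemp -d)
cd "$scratch"
git init -q

echo "Hello World" >newfile

# Without --add, a path the index has never seen is rejected.
if git update-index newfile 2>/dev/null; then
    added_without_flag=yes
else
    added_without_flag=no
fi

# With --add, the same path is accepted into the index.
git update-index --add newfile
tracked=$(git ls-files)

echo "accepted without --add: $added_without_flag"
echo "tracked after --add: $tracked"
```

The first attempt should be rejected and only the second one should leave
`newfile` listed in the index.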

So to populate the index with the two files you just created, you can do

------------------------------------------------
git-update-cache --add hello example
------------------------------------------------

and you have now told git to track those two files.

In fact, as you did that, if you now look into your object directory,
you'll notice that git will have added two new objects to the object
database. If you did exactly the steps above, you should now be able to do

    ls .git/objects/??/*

and see two files:

    .git/objects/55/7db03de997c86a4a028e1ebd3a1ceb225be238
    .git/objects/f2/4c74a2e500f5ee1332c86b94199f52b1d1d962

which correspond with the objects with names of 557db... and f24c7..
respectively.

If you want to, you can use "git-cat-file" to look at those objects, but
you'll have to use the object name, not the filename of the object:

    git-cat-file -t 557db03de997c86a4a028e1ebd3a1ceb225be238

where the "-t" tells git-cat-file to tell you what the "type" of the
object is. Git will tell you that you have a "blob" object (ie just a
regular file), and you can see the contents with

    git-cat-file "blob" 557db03

which will print out "Hello World". The object 557db03 is nothing
more than the contents of your file "hello".

[NOTE]
Don't confuse that object with the file "hello" itself. The
object is literally just those specific _contents_ of the file, and
however much you later change the contents in file "hello", the object we
just looked at will never change. Objects are immutable.

[NOTE]
The second example demonstrates that you can
abbreviate the object name to only the first several
hexadecimal digits in most places.

Anyway, as we mentioned previously, you normally never actually take a
look at the objects themselves, and typing long 40-character hex
names is not something you'd normally want to do.
The above digression
was just to show that `git-update-cache` did something magical, and
actually saved away the contents of your files into the git object
database.

Updating the cache did something else too: it created a `.git/index`
file. This is the index that describes your current working tree, and
something you should be very aware of. Again, you normally never worry
about the index file itself, but you should be aware of the fact that
you have not actually really "checked in" your files into git so far,
you've only _told_ git about them.

However, since git knows about them, you can now start using some of the
most basic git commands to manipulate the files or look at their status.

In particular, let's not even check in the two files into git yet, we'll
start off by adding another line to "hello" first:

------------------------------------------------
echo "It's a new day for git" >>hello
------------------------------------------------

and you can now, since you told git about the previous state of "hello", ask
git what has changed in the tree compared to your old index, using the
"git-diff-files" command:

    git-diff-files

Oops. That wasn't very readable. It just spit out its own internal
version of a "diff", but that internal version really just tells you
that it has noticed that "hello" has been modified, and that the old object
contents it had have been replaced with something else.

To make it readable, we can tell git-diff-files to output the
differences as a patch, using the "-p" flag:

    git-diff-files -p

which will spit out

    diff --git a/hello b/hello
    --- a/hello
    +++ b/hello
    @@ -1 +1,2 @@
     Hello World
    +It's a new day for git

ie the diff of the change we caused by adding another line to "hello".
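
The index still refers to the object holding the old, one-line contents
of "hello". As a rough sketch of where such an object name comes from,
the snippet below assumes that a blob's name is the SHA1 of a short
`blob <size>` header, a NUL byte, and the file contents, and that a
`sha1sum` utility is installed (on some systems the equivalent tool has
a different name).

```shell
# Contents of "hello" as they were when the index was populated.
old_contents='Hello World'

# printf '%s\n' puts back the trailing newline that echo wrote.
size=$(printf '%s\n' "$old_contents" | wc -c | tr -d ' ')

# Hash "blob <size>", a NUL byte, then the contents themselves.
name=$( { printf 'blob %s\0' "$size"; printf '%s\n' "$old_contents"; } \
        | sha1sum | cut -d' ' -f1 )

echo "$name"
```

If the assumption holds, this prints the 557db03... name listed earlier.
Appending a line changes both the size and the contents, so the new
state of "hello" can never share a name with the old object.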

In other words, git-diff-files always shows us the difference between
what is recorded in the index, and what is currently in the working
tree. That's very useful.

A common shorthand for "git-diff-files -p" is to just write

    git diff

which will do the same thing.


Committing git state
--------------------

Now, we want to go to the next stage in git, which is to take the files
that git knows about in the index, and commit them as a real tree. We do
that in two phases: creating a "tree" object, and committing that "tree"
object as a "commit" object together with an explanation of what the
tree was all about, along with information of how we came to that state.

Creating a tree object is trivial, and is done with "git-write-tree".
There are no options or other input: git-write-tree will take the
current index state, and write an object that describes that whole
index. In other words, we're now tying together all the different
filenames with their contents (and their permissions), and we're
creating the equivalent of a git "directory" object:

------------------------------------------------
git-write-tree
------------------------------------------------

and this will just output the name of the resulting tree, in this case
(if you have done exactly as I've described) it should be

    8988da15d077d4829fc51d8544c097def6644dbb

which is another incomprehensible object name. Again, if you want to,
you can use "git-cat-file -t 8988d.." to see that this time the object
is not a "blob" object, but a "tree" object (you can also use
git-cat-file to actually output the raw object contents, but you'll see
mainly a binary mess, so that's less interesting).

However - normally you'd never use "git-write-tree" on its own, because
normally you always commit a tree into a commit object using the
"git-commit-tree" command.
In fact, it's easier to not actually use
git-write-tree on its own at all, but to just pass its result in as an
argument to "git-commit-tree".

"git-commit-tree" normally takes several arguments - it wants to know
what the _parent_ of a commit was, but since this is the first commit
ever in this new repository, and it has no parents, we only need to pass in
the object name of the tree. However, git-commit-tree also wants to get a commit message
on its standard input, and it will write out the resulting object name for the
commit to its standard output.

And this is where we start using the .git/HEAD file. The HEAD file is
supposed to contain the reference to the top-of-tree, and since that's
exactly what git-commit-tree spits out, we can do this all with a simple
shell pipeline:

------------------------------------------------
echo "Initial commit" | git-commit-tree $(git-write-tree) > .git/HEAD
------------------------------------------------

which will say:

    Committing initial tree 8988da15d077d4829fc51d8544c097def6644dbb

just to warn you about the fact that it created a totally new commit
that is not related to anything else. Normally you do this only _once_
for a project ever, and all later commits will be parented on top of an
earlier commit, and you'll never see this "Committing initial tree"
message ever again.

Again, normally you'd never actually do this by hand. There is a
helpful script called "git commit" that will do all of this for you. So
you could have just written

    git commit

instead, and it would have done the above magic scripting for you.


Making a change
---------------

Remember how we did the "git-update-cache" on file "hello" and then we
changed "hello" afterward, and could compare the new state of "hello" with the
state we saved in the index file?

Further, remember how I said that "git-write-tree" writes the contents
of the _index_ file to the tree, and thus what we just committed was in
fact the _original_ contents of the file "hello", not the new ones. We did
that on purpose, to show the difference between the index state, and the
state in the working tree, and how they don't have to match, even
when we commit things.

As before, if we do "git-diff-files -p" in our git-tutorial project,
we'll still see the same difference we saw last time: the index file
hasn't changed by the act of committing anything. However, now that we
have committed something, we can also learn to use a new command:
"git-diff-cache".

Unlike "git-diff-files", which showed the difference between the index
file and the working tree, "git-diff-cache" shows the differences
between a committed _tree_ and either the index file or the working
tree. In other words, git-diff-cache wants a tree to be diffed
against, and before we did the commit, we couldn't do that, because we
didn't have anything to diff against.

But now we can do

    git-diff-cache -p HEAD

(where "-p" has the same meaning as it did in git-diff-files), and it
will show us the same difference, but for a totally different reason.
Now we're comparing the working tree not against the index file,
but against the tree we just wrote. It just so happens that those two
are obviously the same, so we get the same result.

Again, because this is a common operation, you can also just shorthand
it with

    git diff HEAD

which ends up doing the above for you.

In other words, "git-diff-cache" normally compares a tree against the
working tree, but when given the "--cached" flag, it is told to
instead compare against just the index cache contents, and ignore the
current working tree state entirely.
Since we just wrote the index
file to HEAD, doing "git-diff-cache --cached -p HEAD" should thus return
an empty set of differences, and that's exactly what it does.

[NOTE]
"git-diff-cache" really always uses the index for its
comparisons, and saying that it compares a tree against the working
tree is thus not strictly accurate. In particular, the list of
files to compare (the "meta-data") _always_ comes from the index file,
regardless of whether the --cached flag is used or not. The --cached
flag really only determines whether the file _contents_ to be compared
come from the working tree or not.
+
This is not hard to understand, as soon as you realize that git simply
never knows (or cares) about files that it is not told about
explicitly. Git will never go _looking_ for files to compare, it
expects you to tell it what the files are, and that's what the index
is there for.

However, our next step is to commit the _change_ we did, and again, to
understand what's going on, keep in mind the difference between "working
tree contents", "index file" and "committed tree". We have changes
in the working tree that we want to commit, and we always have to
work through the index file, so the first thing we need to do is to
update the index cache:

------------------------------------------------
git-update-cache hello
------------------------------------------------

(note how we didn't need the "--add" flag this time, since git knew
about the file already).

Note what happens to the different git-diff-xxx versions here. After
we've updated "hello" in the index, "git-diff-files -p" now shows no
differences, but "git-diff-cache -p HEAD" still _does_ show that the
current state is different from the state we committed.
In fact, now
"git-diff-cache" shows the same difference whether we use the "--cached"
flag or not, since now the index is coherent with the working tree.

Now, since we've updated "hello" in the index, we can commit the new
version. We could do it by writing the tree by hand again, and
committing the tree (this time we'd have to use the "-p HEAD" flag to
tell commit that the HEAD was the _parent_ of the new commit, and that
this wasn't an initial commit any more), but you've done that once
already, so let's just use the helpful script this time:

------------------------------------------------
git commit
------------------------------------------------

which starts an editor for you to write the commit message and tells you
a bit about what you have done.

Write whatever message you want, and all the lines that start with '#'
will be pruned out, and the rest will be used as the commit message for
the change. If you decide you don't want to commit anything after all at
this point (you can continue to edit things and update the cache), you
can just leave an empty message. Otherwise git-commit-script will commit
the change for you.

You've now made your first real git commit. And if you're interested in
looking at what git-commit-script really does, feel free to investigate:
it's a few very simple shell scripts to generate the helpful (?) commit
message headers, and a few one-liners that actually do the commit itself.


Checking it out
---------------

While creating changes is useful, it's even more useful if you can tell
later what changed. The most useful command for this is another of the
"diff" family, namely "git-diff-tree".

git-diff-tree can be given two arbitrary trees, and it will tell you the
differences between them.
Perhaps even more commonly, though, you can
give it just a single commit object, and it will figure out the parent
of that commit itself, and show the difference directly. Thus, to get
the same diff that we've already seen several times, we can now do

    git-diff-tree -p HEAD

(again, "-p" means to show the difference as a human-readable patch),
and it will show what the last commit (in HEAD) actually changed.

More interestingly, you can also give git-diff-tree the "-v" flag, which
tells it to also show the commit message and author and date of the
commit, and you can tell it to show a whole series of diffs.
Alternatively, you can tell it to be "silent", and not show the diffs at
all, but just show the actual commit message.

In fact, together with the "git-rev-list" program (which generates a
list of revisions), git-diff-tree ends up being a veritable fount of
changes. A trivial (but very useful) script called "git-whatchanged" is
included with git which does exactly this, and shows a log of recent
activities.

To see the whole history of our pitiful little git-tutorial project, you
can do

    git log

which shows just the log messages, or if we want to see the log together
with the associated patches use the more complex (and much more
powerful)

    git-whatchanged -p --root

and you will see exactly what has changed in the repository over its
short history.

[NOTE]
The "--root" flag is a flag to git-diff-tree to tell it to
show the initial aka "root" commit too. Normally you'd probably not
want to see the initial import diff, but since the tutorial project
was started from scratch and is so small, we use it to make the result
a bit more interesting.

With that, you should now have some inkling of what git does, and
can explore on your own.
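
As a starting point for such exploring, the pairing of "git-rev-list"
with "git-diff-tree" can be approximated by hand. This is only a sketch
of the idea, not the actual "git-whatchanged" script. It assumes a later
git where the commands are spelled `git rev-list`, `git diff-tree` and
`git update-index`, and where diff-tree accepts `--stdin` to read commit
names from its standard input. The helper name `my_whatchanged`, the
scratch repository and the author identity are made up for illustration.

```shell
# Hypothetical helper, not the real git-whatchanged script.
my_whatchanged() {
    # rev-list prints commit names, newest first; diff-tree reads them
    # from stdin and shows each message (-v) and patch (-p), including
    # the parentless first commit (--root).
    git rev-list HEAD | git diff-tree --stdin -v -p --root
}

# Placeholder identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Tutorial GIT_AUTHOR_EMAIL=tutorial@example.com
export GIT_COMMITTER_NAME=Tutorial GIT_COMMITTER_EMAIL=tutorial@example.com

# Scratch repository with two small commits.
cd "$(mktemp -d)" && git init -q
echo "Hello World" >hello
git update-index --add hello
git commit -q -m 'Initial commit'
echo "It's a new day for git" >>hello
git commit -q -m 'Second line' hello

history=$(my_whatchanged)
echo "$history"
```

Each commit should appear with its log message followed by its patch,
the initial commit included.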

[NOTE]
Most likely, you are not directly using the core
git Plumbing commands, but using Porcelain like Cogito on top
of it. Cogito works a bit differently and you usually do not
have to run "git-update-cache" yourself for changed files (you
do tell underlying git about additions and removals via
"cg-add" and "cg-rm" commands). Just before you make a commit
with "cg-commit", Cogito figures out which files you modified,
and runs "git-update-cache" on them for you.


Tagging a version
-----------------

In git, there are two kinds of tags, a "light" one, and an "annotated tag".

A "light" tag is technically nothing more than a branch, except we put
it in the ".git/refs/tags/" subdirectory instead of calling it a "head".
So the simplest form of tag involves nothing more than

------------------------------------------------
git tag my-first-tag
------------------------------------------------

which just writes the current HEAD into the .git/refs/tags/my-first-tag
file, after which point you can then use this symbolic name for that
particular state. You can, for example, do

    git diff my-first-tag

to diff your current state against that tag (which at this point will
obviously be an empty diff, but if you continue to develop and commit
stuff, you can use your tag as an "anchor-point" to see what has changed
since you tagged it).

An "annotated tag" is actually a real git object, and contains not only a
pointer to the state you want to tag, but also a small tag name and
message, along with optionally a PGP signature that says that yes, you really did
that tag.
You create these annotated tags with either the "-a" or "-s" flag to "git tag":

    git tag -s <tagname>

which will sign the current HEAD (but you can also give it another
argument that specifies the thing to tag, ie you could have tagged the
current "mybranch" point by using "git tag <tagname> mybranch").

You normally only do signed tags for major releases or things
like that, while the light-weight tags are useful for any marking you
want to do - any time you decide that you want to remember a certain
point, just create a private tag for it, and you have a nice symbolic
name for the state at that point.


Copying repositories
--------------------

Git repositories are normally totally self-sufficient, and it's worth noting
that unlike CVS, for example, there is no separate notion of
"repository" and "working tree". A git repository normally _is_ the
working tree, with the local git information hidden in the ".git"
subdirectory. There is nothing else. What you see is what you got.

[NOTE]
You can tell git to split the git internal information from
the directory that it tracks, but we'll ignore that for now: it's not
how normal projects work, and it's really only meant for special uses.
So the mental model of "the git information is always tied directly to
the working tree that it describes" may not be technically 100%
accurate, but it's a good model for all normal use.

This has two implications:

 - if you grow bored with the tutorial repository you created (or you've
   made a mistake and want to start all over), you can just do a simple

    rm -rf git-tutorial
+
and it will be gone. There's no external repository, and there's no
history outside the project you created.

 - if you want to move or duplicate a git repository, you can do so.
   There is a "git clone" command, but if all you want to do is just to
   create a copy of your repository (with all the full history that
   went along with it), you can do so with a regular
   "cp -a git-tutorial new-git-tutorial".
+
Note that when you've moved or copied a git repository, your git index
file (which caches various information, notably some of the "stat"
information for the files involved) will likely need to be refreshed.
So after you do a "cp -a" to create a new copy, you'll want to do

    git-update-cache --refresh
+
in the new repository to make sure that the index file is up-to-date.

Note that the second point is true even across machines. You can
duplicate a remote git repository with _any_ regular copy mechanism, be it
"scp", "rsync" or "wget".

When copying a remote repository, you'll want to at a minimum update the
index cache when you do this, and especially with other people's
repositories you often want to make sure that the index cache is in some
known state (you don't know _what_ they've done and not yet checked in),
so usually you'll precede the "git-update-cache" with a

    git-read-tree --reset HEAD
    git-update-cache --refresh

which will force a total index re-build from the tree pointed to by HEAD.
It resets the index contents to HEAD, and then the git-update-cache
makes sure to match up all index entries with the checked-out files.
If the original repository had uncommitted changes in its
working tree, "git-update-cache --refresh" notices them and
tells you they need to be updated.

The above can also be written as simply

    git reset

and in fact a lot of the common git command combinations can be scripted
with the "git xyz" interfaces, and you can learn things by just looking
at what the `git-*-script` scripts do (`git reset` is the above two lines
implemented in `git-reset-script`, but some things like "git status" and
`git commit` are slightly more complex scripts around the basic git
commands).

Many (most?) public remote repositories will not contain any of
the checked out files or even an index file, and will 'only' contain the
actual core git files. Such a repository usually doesn't even have the
`.git` subdirectory, but has all the git files directly in the
repository.

To create your own local live copy of such a "raw" git repository, you'd
first create your own subdirectory for the project, and then copy the
raw repository contents into the ".git" directory. For example, to
create your own copy of the git repository, you'd do the following

    mkdir my-git
    cd my-git
    rsync -rL rsync://rsync.kernel.org/pub/scm/git/git.git/ .git

followed by

    git-read-tree HEAD

to populate the index. However, now you have populated the index, and
you have all the git internal files, but you will notice that you don't
actually have any of the working tree files to work on. To get
those, you'd check them out with

    git-checkout-cache -u -a

where the "-u" flag means that you want the checkout to keep the index
up-to-date (so that you don't have to refresh it afterward), and the
"-a" flag means "check out all files" (if you have a stale copy or an
older version of a checked out tree you may also need to add the "-f"
flag first, to tell git-checkout-cache to _force_ overwriting of any old
files).

Again, this can all be simplified with

    git clone rsync://rsync.kernel.org/pub/scm/git/git.git/ my-git
    cd my-git
    git checkout

which will end up doing all of the above for you.

You have now successfully copied somebody else's (mine) remote
repository, and checked it out.


Creating a new branch
---------------------

Branches in git are really nothing more than pointers into the git
object database from within the ".git/refs/" subdirectory, and as we
already discussed, the HEAD branch is nothing but a symlink to one of
these object pointers.

You can at any time create a new branch by just picking an arbitrary
point in the project history, and just writing the SHA1 name of that
object into a file under .git/refs/heads/. You can use any filename you
want (and indeed, subdirectories), but the convention is that the
"normal" branch is called "master". That's just a convention, though,
and nothing enforces it.

To show that as an example, let's go back to the git-tutorial repository we
used earlier, and create a branch in it. You do that by simply just
saying that you want to check out a new branch:

------------
git checkout -b mybranch
------------

will create a new branch based at the current HEAD position, and switch
to it.

[NOTE]
================================================
If you make the decision to start your new branch at some
other point in the history than the current HEAD, you can do so by
just telling "git checkout" what the base of the checkout would be.
In other words, if you have an earlier tag or branch, you'd just do

    git checkout -b mybranch earlier-commit

and it would create the new branch "mybranch" at the earlier commit,
and check out the state at that time.
================================================

You can always just jump back to your original "master" branch by doing

    git checkout master

(or any other branch-name, for that matter) and if you forget which
branch you happen to be on, a simple

    ls -l .git/HEAD

will tell you where it's pointing. To get the list of branches
you have, you can say

    git branch

which is nothing more than a simple script around `ls .git/refs/heads`.
There will be an asterisk in front of the branch you are currently on.

Sometimes you may wish to create a new branch _without_ actually
checking it out and switching to it. If so, just use the command

    git branch <branchname> [startingpoint]

which will simply _create_ the branch, but will not do anything further.
You can then later - once you decide that you want to actually develop
on that branch - switch to that branch with a regular "git checkout"
with the branchname as the argument.


Merging two branches
--------------------

One of the ideas of having a branch is that you do some (possibly
experimental) work in it, and eventually merge it back to the main
branch. So assuming you created the above "mybranch" that started out
being the same as the original "master" branch, let's make sure we're in
that branch, and do some work there.

------------
git checkout mybranch
echo "Work, work, work" >>hello
git commit -m 'Some work.' hello
------------

Here, we just added another line to "hello", and we used a shorthand for
both doing a "git-update-cache hello" and "git commit" by just giving the
filename directly to "git commit". The '-m' flag is to give the
commit log message from the command line.
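
Spelled out long-hand, that shorthand amounts to roughly the two steps
below. This is a sketch run in a scratch repository, assuming a later
git where "git-update-cache" is spelled `git update-index` and
"git-diff-cache" is spelled `git diff-index`. The author identity
exported at the top is a placeholder so that the commits succeed on an
unconfigured machine.

```shell
# Placeholder identity, only needed for this scratch repository.
export GIT_AUTHOR_NAME=Tutorial GIT_AUTHOR_EMAIL=tutorial@example.com
export GIT_COMMITTER_NAME=Tutorial GIT_COMMITTER_EMAIL=tutorial@example.com

# Scratch repository with one committed file.
cd "$(mktemp -d)" && git init -q
echo "Hello World" >hello
git update-index --add hello
git commit -q -m 'Initial commit'

# Long-hand form of: git commit -m 'Some work.' hello
echo "Work, work, work" >>hello
git update-index hello          # step 1: record the new contents in the index
git commit -q -m 'Some work.'   # step 2: commit what the index now holds

# The index matches the new HEAD, so nothing is left pending.
pending=$(git diff-index --cached HEAD)
commits=$(git rev-list HEAD | wc -l | tr -d ' ')
echo "pending changes: ${pending:-none}, commits so far: $commits"
```

When other files have changes recorded in the index as well, the
shorthand and the long-hand form may not commit exactly the same set of
paths, so treat the equivalence as holding for this simple case only.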

Now, to make it a bit more interesting, let's assume that somebody else
does some work in the original branch, and simulate that by going back
to the master branch, and editing the same file differently there:

------------
git checkout master
------------

Here, take a moment to look at the contents of "hello", and notice how they
don't contain the work we just did in "mybranch" - because that work
hasn't happened in the "master" branch at all. Then do

------------
echo "Play, play, play" >>hello
echo "Lots of fun" >>example
git commit -m 'Some fun.' hello example
------------

since the master branch is obviously in a much better mood.

Now, you've got two branches, and you decide that you want to merge the
work done. Before we do that, let's introduce a cool graphical tool that
helps you view what's going on:

    gitk --all

will show you graphically both of your branches (that's what the "--all"
means: normally it will just show you your current HEAD) and their
histories. You can also see exactly how they came to be from a common
source.

Anyway, let's exit gitk (^Q or the File menu), and decide that we want
to merge the work we did on the "mybranch" branch into the "master"
branch (which is currently our HEAD too). To do that, there's a nice
script called "git resolve", which wants to know which branches you want
to resolve and what the merge is all about:

------------
git resolve HEAD mybranch "Merge work in mybranch"
------------

where the third argument is going to be used as the commit message if
the merge can be resolved automatically.

Now, in this case we've intentionally created a situation where the
merge will need to be fixed up by hand, though, so git will do as much
of it as it can automatically (which in this case is just merge the "example"
file, which had no differences in the "mybranch" branch), and say:

	Simple merge failed, trying Automatic merge
	Auto-merging hello.
	merge: warning: conflicts during merge
	ERROR: Merge conflict in hello.
	fatal: merge program failed
	Automatic merge failed, fix up by hand

which is way too verbose, but it basically tells you that it failed the
really trivial merge ("Simple merge") and did an "Automatic merge"
instead, but that too failed due to conflicts in "hello".

Not to worry. It left the (trivial) conflict in "hello" in the same form you
should already be well used to if you've ever used CVS, so let's just
open "hello" in our editor (whatever that may be), and fix it up somehow.
I'd suggest just making it so that "hello" contains all four lines:

------------
Hello World
It's a new day for git
Play, play, play
Work, work, work
------------

and once you're happy with your manual merge, just do a

------------
git commit hello
------------

which will very loudly warn you that you're now committing a merge
(which is correct, so never mind), and you can write a small merge
message about your adventures in git-merge-land.

After you're done, start up "gitk --all" to see graphically what the
history looks like. Notice that "mybranch" still exists, and you can
switch to it, and continue to work with it if you want to. The
"mybranch" branch will not contain the merge, but next time you merge it
from the "master" branch, git will know how you merged it, so you'll not
have to do _that_ merge again.

Another useful tool, especially if you do not always work in an X-Window
environment, is "git show-branch".

------------------------------------------------
$ git show-branch master mybranch
* [master] Merged "mybranch" changes.
 ! [mybranch] Some work.
--
+  [master] Merged "mybranch" changes.
+  [master~1] Some fun.
++ [mybranch] Some work.
------------------------------------------------

The first two lines indicate that it is showing the two branches
and the first line of the commit log message from their
top-of-the-tree commits; you are currently on the "master" branch
(notice the asterisk "*" character). The first column of
the later output lines is used to show commits contained in the
"master" branch, and the second column for the "mybranch"
branch. Three commits are shown along with their log messages.
All of them have plus '+' characters in the first column, which
means they are now part of the "master" branch. Only the "Some
work" commit has the plus '+' character in the second column,
because "mybranch" has not been merged to incorporate these
commits from the master branch.

Now, let's pretend you are the one who did all the work in
mybranch, and the fruit of your hard work has finally been merged
to the master branch. Let's go back to "mybranch", and run
resolve to get the "upstream changes" back to your branch.

	git checkout mybranch
	git resolve HEAD master "Merge upstream changes."

This outputs something like this (the actual commit object names
would be different)

	Updating from ae3a2da... to a80b4aa....
	 example |    1 +
	 hello   |    1 +
	 2 files changed, 2 insertions(+), 0 deletions(-)

Because your branch did not contain anything more than what is
already merged into the master branch, the resolve operation did
not actually do a merge. Instead, it just updated the top of
the tree of your branch to that of the "master" branch. This is
often called a "fast forward" merge.
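
The condition for a fast forward can be checked by hand: it applies
when the head you are on is an ancestor of the head you are merging.
The sketch below illustrates this under some assumptions. It relies
on a current git, where "git merge" does the job of the "git resolve"
script used in this tutorial, and it builds a simplified scratch
history in a "mktemp" directory (here "mybranch" has no commits of
its own, rather than having them already merged into "master"). The
identity and the variable name `kind` are placeholders.

```shell
#!/bin/sh
# Sketch: detect and perform a "fast forward" merge.
# Assumption: a current git, with "git merge" standing in for "git resolve".
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Tutorial User"      # placeholder identity
git config user.email "user@example.com"  # placeholder identity

# One commit shared by both branches.
echo "Hello World" >hello
git add hello
git commit -q -m 'Initial commit'
git branch mybranch

# "master" moves ahead while "mybranch" stays where it was.
echo "Play, play, play" >>hello
git commit -q -m 'Some fun.' hello

# On "mybranch", ask whether our head is an ancestor of "master".
git checkout -q mybranch
if git merge-base --is-ancestor HEAD master
then
	kind="fast forward"
else
	kind="true merge"
fi
echo "merging master into mybranch would be a $kind"

# --ff-only refuses to do anything except a fast forward.
git merge -q --ff-only master
echo "mybranch: $(git rev-parse mybranch)"
echo "master:   $(git rev-parse master)"
```

After the merge both names should print the same commit, since no
merge commit is created: the "mybranch" head is simply moved up to
the "master" head.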

You can run "gitk --all" again to see what the commit ancestry
looks like, or run "show-branch", which tells you this.

------------------------------------------------
$ git show-branch master mybranch
! [master] Merged "mybranch" changes.
 * [mybranch] Merged "mybranch" changes.
--
++ [master] Merged "mybranch" changes.
------------------------------------------------


Merging external work
---------------------

It's usually much more common to merge with somebody else than
to merge with your own branches, so it's worth pointing out that git
makes that very easy too, and in fact, it's not that different from
doing a "git resolve". A remote merge ends up being nothing
more than "fetch the work from a remote repository into a temporary tag"
followed by a "git resolve".

It's such a common thing to do that it's called "git pull", and you can
simply do

	git pull <remote-repository>

and optionally give a branch-name for the remote end as a second
argument.

The "remote" repository can even be on the same machine. One of
the following notations can be used to name the repository to
pull from:

	Rsync URL
		rsync://remote.machine/path/to/repo.git/

	HTTP(s) URL
		http://remote.machine/path/to/repo.git/

	GIT URL
		git://remote.machine/path/to/repo.git/

	SSH URL
		remote.machine:/path/to/repo.git/

	Local directory
		/path/to/repo.git/

[NOTE]
You could do without using any branches at all, by
keeping as many local repositories as you would like to have
branches, and merging between them with "git pull", just like
you merge between branches. The advantage of this approach is
that it lets you keep a set of files for each "branch" checked
out and you may find it easier to switch back and forth if you
juggle multiple lines of development simultaneously.
Of course, you will pay the price of more disk usage to hold
multiple working trees, but disk space is cheap these days.

[NOTE]
You could even pull from your own repository by
giving '.' as the <remote-repository> parameter to "git pull".

It is likely that you will be pulling from the same remote
repository from time to time. As a shorthand, you can store
the remote repository URL in a file under the .git/remotes/
directory, like this:

------------------------------------------------
mkdir -p .git/remotes/
cat >.git/remotes/linus <<\EOF
URL: http://www.kernel.org/pub/scm/git/git.git/
EOF
------------------------------------------------

and use the filename to "git pull" instead of the full URL.
The URL specified in such a file can even be a prefix
of a full URL, like this:

------------------------------------------------
cat >.git/remotes/jgarzik <<\EOF
URL: http://www.kernel.org/pub/scm/linux/git/jgarzik/
EOF
------------------------------------------------


Examples.

. git pull linus
. git pull linus tag v0.99.1
. git pull jgarzik/netdev-2.6.git/ e100

The above are equivalent to:

. git pull http://www.kernel.org/pub/scm/git/git.git/ HEAD
. git pull http://www.kernel.org/pub/scm/git/git.git/ tag v0.99.1
. git pull http://www.kernel.org/pub/.../jgarzik/netdev-2.6.git e100


Publishing your work
--------------------

So we can use somebody else's work from a remote repository; but
how can _you_ prepare a repository to let other people pull from
it?

You do your real work in your working tree that has your
primary repository hanging under it as its ".git" subdirectory.
You _could_ make that repository accessible remotely and ask
people to pull from it, but in practice that is not the way
things are usually done.
A recommended way is to have a public
repository, make it reachable by other people, and when the
changes you made in your primary working tree are in good shape,
update the public repository from it. This is often called
"pushing".

[NOTE]
This public repository could further be mirrored,
and that is how kernel.org git repositories are done.

Publishing the changes from your local (private) repository to
your remote (public) repository requires write privilege on
the remote machine. You need to have an SSH account there to
run a single command, "git-receive-pack".

First, you need to create an empty repository on the remote
machine that will house your public repository. This empty
repository will be populated and be kept up-to-date by pushing
into it later. Obviously, this repository creation needs to be
done only once.

[NOTE]
"git push" uses a pair of programs,
"git-send-pack" on your local machine, and "git-receive-pack"
on the remote machine. The communication between the two over
the network internally uses an SSH connection.

Your private repository's GIT directory is usually .git, but
your public repository is often named after the project name,
i.e. "<project>.git". Let's create such a public repository for
project "my-git". After logging into the remote machine, create
an empty directory:

	mkdir my-git.git

Then, make that directory into a GIT repository by running
git-init-db, but this time, since its name is not the usual
".git", we do things slightly differently:

	GIT_DIR=my-git.git git-init-db

Make sure this directory is available, via the transport of your
choice, to the people you want to pull your changes.
Also, you need to make sure that you have the "git-receive-pack"
program on the $PATH.

[NOTE]
Many installations of sshd do not invoke your shell
as the login shell when you directly run programs; what this
means is that if your login shell is bash, only .bashrc is
read and not .bash_profile. As a workaround, make
sure .bashrc sets up $PATH so that you can run the 'git-receive-pack'
program.

Your "public repository" is now ready to accept your changes.
Come back to the machine where you have your private repository. From
there, run this command:

	git push <public-host>:/path/to/my-git.git master

This synchronizes your public repository to match the named
branch head (i.e. "master" in this case) and objects reachable
from it in your current repository.

As a real example, this is how I update my public git
repository. The kernel.org mirror network takes care of the
propagation to other publicly visible machines:

	git push master.kernel.org:/pub/scm/git/git.git/


Packing your repository
-----------------------

Earlier, we saw that one file under a .git/objects/??/ directory
is stored for each git object you create. This representation
is convenient and efficient to create atomically and safely, but
not so convenient to transport over the network. Since git objects are
immutable once they are created, there is a way to optimize the
storage by "packing them together". The command

	git repack

will do it for you. If you followed the tutorial examples, you
would have accumulated about 17 objects in .git/objects/??/
directories by now. "git repack" tells you how many objects it
packed, and stores the packed file in the .git/objects/pack
directory.

[NOTE]
You will see two files, pack-\*.pack and pack-\*.idx,
in the .git/objects/pack directory.
They are closely related to
each other, and if you ever copy them by hand to a different
repository for whatever reason, you should make sure you copy
them together. The former holds all the data from the objects
in the pack, and the latter holds the index for random
access.

If you are paranoid, running the "git-verify-pack" command would
detect if you have a corrupt pack, but do not worry too much.
Our programs are always perfect ;-).

Once you have packed objects, you do not need to leave the
unpacked objects that are contained in the pack file anymore.

	git prune-packed

would remove them for you.

You can try running "find .git/objects -type f" before and after
you run "git prune-packed" if you are curious.

[NOTE]
"git pull" is slightly cumbersome for HTTP transport,
as a packed repository may contain relatively few objects in a
relatively large pack. If you expect many HTTP pulls from your
public repository you might want to repack & prune often, or
never.

If you run "git repack" again at this point, it will say
"Nothing to pack". Once you continue your development and
accumulate the changes, running "git repack" again will create a
new pack that contains objects created since you packed your
repository the last time. We recommend that you pack your project
soon after the initial import (unless you are starting your
project from scratch), and then run "git repack" every once in a
while, depending on how active your project is.

When a repository is synchronized via "git push" and "git pull",
objects packed in the source repository are usually stored
unpacked in the destination, unless rsync transport is used.


Working with Others
-------------------

Although git is a truly distributed system, it is often
convenient to organize your project with an informal hierarchy
of developers.
Linux kernel development is run this way. There
is a nice illustration (page 17, "Merges to Mainline") in Randy
Dunlap's presentation (http://tinyurl.com/a2jdg).

It should be stressed that this hierarchy is purely "informal".
There is nothing fundamental in git that enforces the "chain of
patch flow" this hierarchy implies. You do not have to pull
from only one remote repository.


A recommended workflow for a "project lead" goes like this:

1. Prepare your primary repository on your local machine. Your
   work is done there.

2. Prepare a public repository accessible to others.

3. Push into the public repository from your primary
   repository.

4. "git repack" the public repository. This establishes a big
   pack that contains the initial set of objects as the
   baseline. Possibly also run "git prune-packed" if the transport
   used for pulling from your repository supports packed
   repositories.

5. Keep working in your primary repository. Your changes
   include modifications of your own, patches you receive via
   e-mails, and merges resulting from pulling the "public"
   repositories of your "subsystem maintainers".
+
You can repack this private repository whenever you feel like.

6. Push your changes to the public repository, and announce it
   to the public.

7. Every once in a while, "git repack" the public repository.
   Go back to step 5 and continue working.


A recommended work cycle for a "subsystem maintainer" who works
on that project and has their own "public repository" goes like this:

1. Prepare your work repository by running "git clone" on the public
   repository of the "project lead". The URL used for the
   initial cloning is stored in .git/branches/origin.

2. Prepare a public repository accessible to others.

3. Copy over the packed files from the "project lead" public
   repository to your public repository by hand; preferably
   use rsync for that task.

4. Push into the public repository from your primary
   repository. Run "git repack", and possibly "git
   prune-packed" if the transport used for pulling from your
   repository supports packed repositories.

5. Keep working in your primary repository. Your changes
   include modifications of your own, patches you receive via
   e-mails, and merges resulting from pulling the "public"
   repositories of your "project lead" and possibly your
   "sub-subsystem maintainers".
+
You can repack this private repository whenever you feel
like.

6. Push your changes to your public repository, and ask your
   "project lead" and possibly your "sub-subsystem
   maintainers" to pull from it.

7. Every once in a while, "git repack" the public repository.
   Go back to step 5 and continue working.


A recommended work cycle for an "individual developer" who does
not have a "public" repository is somewhat different. It goes
like this:

1. Prepare your work repository by running "git clone" on the public
   repository of the "project lead" (or a "subsystem
   maintainer", if you work on a subsystem). The URL used for
   the initial cloning is stored in .git/branches/origin.

2. Do your work there. Make commits.

3. Run "git fetch origin" from the public repository of your
   upstream every once in a while. This does only the first
   half of "git pull" and does not merge. The head of the
   public repository is stored in .git/refs/heads/origin.

4. Use "git cherry origin" to see which ones of your patches
   were accepted, and/or use "git rebase origin" to port your
   unmerged changes forward to the updated upstream.

5. Use "git format-patch origin" to prepare patches for e-mail
   submission to your upstream and send them out.
   Go back to step 2 and continue.


Working with Others, Shared Repository Style
--------------------------------------------

If you are coming from a CVS background, the style of cooperation
suggested in the previous section may be new to you. You do not
have to worry. git also supports the "shared public repository" style
of cooperation you are probably more familiar with.

For this, set up a public repository on a machine that is
reachable via SSH by people with "commit privileges". Put the
committers in the same user group and make the repository
writable by that group.

Each committer would then:

 - clone the shared repository to a local repository,

------------------------------------------------
$ git clone repo.shared.xz:/pub/scm/project.git/ my-project
$ cd my-project
$ hack away
------------------------------------------------

 - merge the work others might have done while you were
   hacking away.

------------------------------------------------
$ git pull origin
$ test the merge result
------------------------------------------------

 - push your work as the new head of the shared
   repository.

------------------------------------------------
$ git push origin master
------------------------------------------------

If somebody else pushed into the same shared repository while
you were working locally, the last step "git push" would
complain, telling you that the remote "master" head does not
fast forward. When that happens, you need to pull and merge those
other changes before you push your work.


[ to be continued.. cvsimports ]