A short git tutorial
====================
May 2005


Introduction
------------

This is trying to be a short tutorial on setting up and using a git
archive, mainly because being hands-on and using explicit examples is
often the best way of explaining what is going on.

In normal life, most people wouldn't use the "core" git programs
directly, but rather script around them to make them more palatable.
Understanding the core git stuff may help some people get those scripts
done, though, and it may also be instructive in helping people
understand what it is that the higher-level helper scripts are actually
doing.

The core git is often called "plumbing", with the prettier user
interfaces on top of it called "porcelain". You may not want to use the
plumbing directly very often, but it can be good to know what the
plumbing does for when the porcelain isn't flushing...


Creating a git archive
----------------------

Creating a new git archive couldn't be easier: all git archives start
out empty, and the only thing you need to do is find yourself a
subdirectory that you want to use as a working tree - either an empty
one for a totally new project, or an existing working tree that you want
to import into git.

For our first example, we're going to start a totally new archive from
scratch, with no pre-existing files, and we'll call it "git-tutorial".
To start up, create a subdirectory for it, change into that
subdirectory, and initialize the git infrastructure with "git-init-db":

        mkdir git-tutorial
        cd git-tutorial
        git-init-db

to which git will reply

        defaulting to local storage area

which is just git's way of saying that you haven't been doing anything
strange, and that it will have created a local .git directory setup for
your new project. You will now have a ".git" directory, and you can
inspect that with "ls".
For your new empty project, ls should show you
three entries:

 - a symlink called HEAD, pointing to "refs/heads/master"

   Don't worry about the fact that the file that the HEAD link points to
   doesn't even exist yet - you haven't created the commit that will
   start your HEAD development branch yet.

 - a subdirectory called "objects", which will contain all the git SHA1
   objects of your project. You should never have any real reason to
   look at the objects directly, but you might want to know that these
   objects are what contains all the real _data_ in your repository.

 - a subdirectory called "refs", which contains references to objects.

   In particular, the "refs" subdirectory will contain two other
   subdirectories, named "heads" and "tags" respectively. They do
   exactly what their names imply: they contain references to any number
   of different "heads" of development (aka "branches"), and to any
   "tags" that you have created to name specific versions of your
   repository.

   One note: the special "master" head is the default branch, which is
   why the .git/HEAD file was created as a symlink to it even if it
   doesn't yet exist. Basically, the HEAD link is supposed to always
   point to the branch you are working on right now, and you always
   start out expecting to work on the "master" branch.

   However, this is only a convention, and you can name your branches
   anything you want, and don't have to ever even _have_ a "master"
   branch. A number of the git tools will assume that .git/HEAD is
   valid, though.

   [ Implementation note: an "object" is identified by its 160-bit SHA1
   hash, aka "name", and a reference to an object is always the 40-byte
   hex representation of that SHA1 name.
   The files in the "refs"
   subdirectory are expected to contain these hex references (usually
   with a final '\n' at the end), and you should thus expect to see a
   number of 41-byte files containing these references in these refs
   subdirectories when you actually start populating your tree ]

You have now created your first git archive. Of course, since it's
empty, that's not very useful, so let's start populating it with data.


Populating a git archive
------------------------

We'll keep this simple and stupid, so we'll start off with populating a
few trivial files just to get a feel for it.

Start off with just creating any random files that you want to maintain
in your git archive. We'll start off with a few bad examples, just to
get a feel for how this works:

        echo "Hello World" >hello
        echo "Silly example" >example

You have now created two files in your working directory, but to
actually check in your hard work, you will have to go through two steps:

 - fill in the "cache" aka "index" file with the information about your
   working directory state

 - commit that index file as an object.

The first step is trivial: when you want to tell git about any changes
to your working directory, you use the "git-update-cache" program. That
program normally just takes a list of filenames you want to update, but
to avoid trivial mistakes, it refuses to add new entries to the cache
(or remove existing ones) unless you explicitly tell it that you're
adding a new entry with the "--add" flag (or removing an entry with the
"--remove" flag).

So to populate the index with the two files you just created, you can do

        git-update-cache --add hello example

and you have now told git to track those two files.

In fact, as you did that, if you now look into your object directory,
you'll notice that git will have added two new objects to the object
store.
If you did exactly the steps above, you should now be able to do

        ls .git/objects/??/*

and see two files:

        .git/objects/55/7db03de997c86a4a028e1ebd3a1ceb225be238
        .git/objects/f2/4c74a2e500f5ee1332c86b94199f52b1d1d962

which correspond to the objects with SHA1 names 557db... and f24c7...
respectively.

If you want to, you can use "git-cat-file" to look at those objects, but
you'll have to use the object name, not the filename of the object:

        git-cat-file -t 557db03de997c86a4a028e1ebd3a1ceb225be238

where the "-t" tells git-cat-file to tell you what the "type" of the
object is. Git will tell you that you have a "blob" object (ie just a
regular file), and you can see the contents with

        git-cat-file "blob" 557db03de997c86a4a028e1ebd3a1ceb225be238

which will print out "Hello World". The object 557db... is nothing
more than the contents of your file "hello".

[ Digression: don't confuse that object with the file "hello" itself. The
  object is literally just those specific _contents_ of the file, and
  however much you later change the contents in file "hello", the object we
  just looked at will never change. Objects are immutable. ]

Anyway, as we mentioned previously, you normally never actually take a
look at the objects themselves, and typing long 40-character hex SHA1
names is not something you'd normally want to do. The above digression
was just to show that "git-update-cache" did something magical, and
actually saved away the contents of your files into the git content
store.

Updating the cache did something else too: it created a ".git/index"
file. This is the index that describes your current working tree, and
something you should be very aware of.
Again, you normally never worry
about the index file itself, but you should be aware of the fact that
you have not actually really "checked in" your files into git so far,
you've only _told_ git about them.

However, since git knows about them, you can now start using some of the
most basic git commands to manipulate the files or look at their status.

In particular, let's not even check in the two files into git yet, we'll
start off by adding another line to "hello" first:

        echo "It's a new day for git" >>hello

and you can now, since you told git about the previous state of "hello", ask
git what has changed in the tree compared to your old index, using the
"git-diff-files" command:

        git-diff-files

Oops. That wasn't very readable. It just spat out its own internal
version of a "diff", but that internal version really just tells you
that it has noticed that "hello" has been modified, and that the old object
contents it had have been replaced with something else.

To make it readable, we can tell git-diff-files to output the
differences as a patch, using the "-p" flag:

        git-diff-files -p

which will spit out

        diff --git a/hello b/hello
        --- a/hello
        +++ b/hello
        @@ -1 +1,2 @@
         Hello World
        +It's a new day for git

ie the diff of the change we caused by adding another line to "hello".

In other words, git-diff-files always shows us the difference between
what is recorded in the index, and what is currently in the working
tree. That's very useful.

A common shorthand for "git-diff-files -p" is to just write

        git diff

which will do the same thing.


Committing git state
--------------------

Now, we want to go to the next stage in git, which is to take the files
that git knows about in the index, and commit them as a real tree.
We do
that in two phases: creating a "tree" object, and committing that "tree"
object as a "commit" object together with an explanation of what the
tree was all about, along with information of how we came to that state.

Creating a tree object is trivial, and is done with "git-write-tree".
There are no options or other input: git-write-tree will take the
current index state, and write an object that describes that whole
index. In other words, we're now tying together all the different
filenames with their contents (and their permissions), and we're
creating the equivalent of a git "directory" object:

        git-write-tree

and this will just output the name of the resulting tree, in this case
(if you have done exactly as I've described) it should be

        8988da15d077d4829fc51d8544c097def6644dbb

which is another incomprehensible object name. Again, if you want to,
you can use "git-cat-file -t 8988d.." to see that this time the object
is not a "blob" object, but a "tree" object (you can also use
git-cat-file to actually output the raw object contents, but you'll see
mainly a binary mess, so that's less interesting).

However - normally you'd never use "git-write-tree" on its own, because
normally you always commit a tree into a commit object using the
"git-commit-tree" command. In fact, it's easier to not actually use
git-write-tree on its own at all, but to just pass its result in as an
argument to "git-commit-tree".

"git-commit-tree" normally takes several arguments - it wants to know
what the _parent_ of a commit was, but since this is the first commit
ever in this new archive, and it has no parents, we only need to pass in
the tree ID. However, git-commit-tree also wants to get a commit message
on its standard input, and it will write out the resulting ID for the
commit to its standard output.

And this is where we start using the .git/HEAD file.
The HEAD file is
supposed to contain the reference to the top-of-tree, and since that's
exactly what git-commit-tree spits out, we can do this all with a simple
shell pipeline:

        echo "Initial commit" | git-commit-tree $(git-write-tree) > .git/HEAD

which will say:

        Committing initial tree 8988da15d077d4829fc51d8544c097def6644dbb

just to warn you about the fact that it created a totally new commit
that is not related to anything else. Normally you do this only _once_
for a project ever, and all later commits will be parented on top of an
earlier commit, and you'll never see this "Committing initial tree"
message ever again.

Again, normally you'd never actually do this by hand. There is a
helpful script called "git commit" that will do all of this for you. So
you could have just written

        git commit

instead, and it would have done the above magic scripting for you.


Making a change
---------------

Remember how we did the "git-update-cache" on file "hello" and then we
changed "hello" afterward, and could compare the new state of "hello" with the
state we saved in the index file?

Further, remember how I said that "git-write-tree" writes the contents
of the _index_ file to the tree, and thus what we just committed was in
fact the _original_ contents of the file "hello", not the new ones. We did
that on purpose, to show the difference between the index state, and the
state in the working directory, and how they don't have to match, even
when we commit things.

As before, if we do "git-diff-files -p" in our git-tutorial project,
we'll still see the same difference we saw last time: the index file
hasn't changed by the act of committing anything. However, now that we
have committed something, we can also learn to use a new command:
"git-diff-cache".
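
[ Side note: as a concrete check on the claim that the commit recorded
  the _original_ contents of "hello": an object name is just the SHA1 of
  a short header followed by the contents, so it can be recomputed
  without git at all. The sketch below is an illustration, not a part of
  git: the helper name "blob_name" is made up for this tutorial, and it
  assumes that the "sha1sum" utility is installed. ]

```sh
# blob_name FILE: print the name git would give FILE's contents as a blob.
# (Hypothetical helper for illustration; assumes "sha1sum" is available.)
blob_name() {
    size=$(wc -c <"$1" | tr -d ' ')
    # A blob is hashed as the header "blob <size>", a NUL byte, and then
    # the raw file contents.
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}
```

A file containing just the line "Hello World" gives
557db03de997c86a4a028e1ebd3a1ceb225be238, the object we looked at
earlier with git-cat-file. Running the helper on the current two-line
"hello" gives a different name, and that mismatch between index and
working directory is what git-diff-files reports.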

Unlike "git-diff-files", which showed the difference between the index
file and the working directory, "git-diff-cache" shows the differences
between a committed _tree_ and either the index file or the working
directory. In other words, git-diff-cache wants a tree to be diffed
against, and before we did the commit, we couldn't do that, because we
didn't have anything to diff against.

But now we can do

        git-diff-cache -p HEAD

(where "-p" has the same meaning as it did in git-diff-files), and it
will show us the same difference, but for a totally different reason.
Now we're comparing the working directory not against the index file,
but against the tree we just wrote. It just so happens that those two
are obviously the same, so we get the same result.

Again, because this is a common operation, you can also just shorthand
it with

        git diff HEAD

which ends up doing the above for you.

In other words, "git-diff-cache" normally compares a tree against the
working directory, but when given the "--cached" flag, it is told to
instead compare against just the index cache contents, and ignore the
current working directory state entirely. Since we just wrote the index
file to HEAD, doing "git-diff-cache --cached -p HEAD" should thus return
an empty set of differences, and that's exactly what it does.

[ Digression: "git-diff-cache" really always uses the index for its
  comparisons, and saying that it compares a tree against the working
  directory is thus not strictly accurate. In particular, the list of
  files to compare (the "meta-data") _always_ comes from the index file,
  regardless of whether the --cached flag is used or not. The --cached
  flag really only determines whether the file _contents_ to be compared
  come from the working directory or not.

  This is not hard to understand, as soon as you realize that git simply
  never knows (or cares) about files that it is not told about
  explicitly. Git will never go _looking_ for files to compare, it
  expects you to tell it what the files are, and that's what the index
  is there for. ]

However, our next step is to commit the _change_ we did, and again, to
understand what's going on, keep in mind the difference between "working
directory contents", "index file" and "committed tree". We have changes
in the working directory that we want to commit, and we always have to
work through the index file, so the first thing we need to do is to
update the index cache:

        git-update-cache hello

(note how we didn't need the "--add" flag this time, since git knew
about the file already).

Note what happens to the different git-diff-xxx versions here. After
we've updated "hello" in the index, "git-diff-files -p" now shows no
differences, but "git-diff-cache -p HEAD" still _does_ show that the
current state is different from the state we committed. In fact, now
"git-diff-cache" shows the same difference whether we use the "--cached"
flag or not, since now the index is coherent with the working directory.

Now, since we've updated "hello" in the index, we can commit the new
version. We could do it by writing the tree by hand again, and
committing the tree (this time we'd have to use the "-p HEAD" flag to
tell commit that the HEAD was the _parent_ of the new commit, and that
this wasn't an initial commit any more), but you've done that once
already, so let's just use the helpful script this time:

        git commit

which starts an editor for you to write the commit message and tells you
a bit about what you're doing.
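
[ Side note: for comparison, the by-hand route with "-p HEAD" could look
  roughly like the sketch below. It is only an illustration, under a few
  assumptions: the function name "commit_on_head" is made up, the
  commands are spelled "git write-tree" and so on (the form newer git
  versions use for the "git-write-tree" family), and "git update-ref",
  which the rest of this tutorial does not use, stands in for writing
  the new commit name into .git/HEAD. ]

```sh
# commit_on_head MESSAGE: commit the current index as a child of HEAD.
# (Hypothetical helper; a sketch of what "git commit" automates.)
commit_on_head() {
    tree=$(git write-tree) || return 1
    # Capture the new commit name in a variable first. Redirecting the
    # output straight into .git/HEAD, as we did for the initial commit,
    # would empty that file before "-p HEAD" had a chance to read it.
    commit=$(echo "$1" | git commit-tree "$tree" -p HEAD) || return 1
    # Point the current branch at the new commit.
    git update-ref HEAD "$commit"
}
```

You would call it as

        commit_on_head "Add a second line to hello"

after updating the index, in place of the "git commit" above.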

Write whatever message you want, and all the lines that start with '#'
will be pruned out, and the rest will be used as the commit message for
the change. If you decide you don't want to commit anything after all at
this point (you can continue to edit things and update the cache), you
can just leave an empty message. Otherwise git-commit-script will commit
the change for you.

You've now made your first real git commit. And if you're interested in
looking at what git-commit-script really does, feel free to investigate:
it's a few very simple shell scripts to generate the helpful (?) commit
message headers, and a few one-liners that actually do the commit itself.


Checking it out
---------------

While creating changes is useful, it's even more useful if you can tell
later what changed. The most useful command for this is another of the
"diff" family, namely "git-diff-tree".

git-diff-tree can be given two arbitrary trees, and it will tell you the
differences between them. Perhaps even more commonly, though, you can
give it just a single commit object, and it will figure out the parent
of that commit itself, and show the difference directly. Thus, to get
the same diff that we've already seen several times, we can now do

        git-diff-tree -p HEAD

(again, "-p" means to show the difference as a human-readable patch),
and it will show what the last commit (in HEAD) actually changed.

More interestingly, you can also give git-diff-tree the "-v" flag, which
tells it to also show the commit message and author and date of the
commit, and you can tell it to show a whole series of diffs.
Alternatively, you can tell it to be "silent", and not show the diffs at
all, but just show the actual commit message.

In fact, together with the "git-rev-list" program (which generates a
list of revisions), git-diff-tree ends up being a veritable fount of
changes.
A trivial (but very useful) script called "git-whatchanged" is
included with git which does exactly this, and shows a log of recent
activity.

To see the whole history of our pitiful little git-tutorial project, you
can do

        git log

which shows just the log messages, or if we want to see the log together
with the associated patches use the more complex (and much more
powerful)

        git-whatchanged -p --root

and you will see exactly what has changed in the repository over its
short history.

[ Side note: the "--root" flag is a flag to git-diff-tree to tell it to
  show the initial aka "root" commit too. Normally you'd probably not
  want to see the initial import diff, but since the tutorial project
  was started from scratch and is so small, we use it to make the result
  a bit more interesting ]

With that, you should now have some inkling of what git does, and
can explore on your own.


[ Side note: most likely, you are not directly using the core
  git Plumbing commands, but using Porcelain like Cogito on top
  of it. Cogito works a bit differently and you usually do not
  have to run "git-update-cache" yourself for changed files (you
  do tell underlying git about additions and removals via
  "cg-add" and "cg-rm" commands). Just before you make a commit
  with "cg-commit", Cogito figures out which files you modified,
  and runs "git-update-cache" on them for you. ]


Tagging a version
-----------------

In git, there are two kinds of tags, a "light" one, and a "signed tag".

A "light" tag is technically nothing more than a branch, except we put
it in the ".git/refs/tags/" subdirectory instead of calling it a "head".
So the simplest form of tag involves nothing more than

        git tag my-first-tag

which just writes the current HEAD into the .git/refs/tags/my-first-tag
file, after which point you can then use this symbolic name for that
particular state. You can, for example, do

        git diff my-first-tag

to diff your current state against that tag (which at this point will
obviously be an empty diff, but if you continue to develop and commit
stuff, you can use your tag as an "anchor-point" to see what has changed
since you tagged it).

A "signed tag" is actually a real git object, and contains not only a
pointer to the state you want to tag, but also a small tag name and
message, along with a PGP signature that says that yes, you really did
that tag. You create these signed tags with the "-s" flag to "git tag":

        git tag -s <tagname>

which will sign the current HEAD (but you can also give it another
argument that specifies the thing to tag, ie you could have tagged the
current "mybranch" point by using "git tag <tagname> mybranch").

You normally only do signed tags for major releases or things
like that, while the light-weight tags are useful for any marking you
want to do - any time you decide that you want to remember a certain
point, just create a private tag for it, and you have a nice symbolic
name for the state at that point.


Copying archives
----------------

Git archives are normally totally self-sufficient, and it's worth noting
that unlike CVS, for example, there is no separate notion of
"repository" and "working tree". A git repository normally _is_ the
working tree, with the local git information hidden in the ".git"
subdirectory. There is nothing else. What you see is what you got.

[ Side note: you can tell git to split the git internal information from
  the directory that it tracks, but we'll ignore that for now: it's not
  how normal projects work, and it's really only meant for special uses.
  So the mental model of "the git information is always tied directly to
  the working directory that it describes" may not be technically 100%
  accurate, but it's a good model for all normal use ]

This has two implications:

 - if you grow bored with the tutorial archive you created (or you've
   made a mistake and want to start all over), you can simply do

        rm -rf git-tutorial

   and it will be gone. There's no external repository, and there's no
   history outside of the project you created.

 - if you want to move or duplicate a git archive, you can do so. There
   is a "git clone" command, but if all you want to do is just to
   create a copy of your archive (with all the full history that
   went along with it), you can do so with a regular
   "cp -a git-tutorial new-git-tutorial".

   Note that when you've moved or copied a git archive, your git index
   file (which caches various information, notably some of the "stat"
   information for the files involved) will likely need to be refreshed.
   So after you do a "cp -a" to create a new copy, you'll want to do

        git-update-cache --refresh

   to make sure that the index file is up-to-date in the new one.

Note that the second point is true even across machines. You can
duplicate a remote git archive with _any_ regular copy mechanism, be it
"scp", "rsync" or "wget".
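
[ Side note: for the local case, the copy-then-refresh steps above could
  be wrapped up as in the sketch below. The function name "copy_archive"
  is made up for illustration, and the sketch assumes a git new enough
  to spell the refresh as "git update-index", the later name of
  "git-update-cache". ]

```sh
# copy_archive SRC DST: duplicate a git archive, history and all, and
# refresh the index of the copy. (Hypothetical helper for illustration.)
copy_archive() {
    cp -a "$1" "$2" || return 1
    # Re-read the "stat" information for the copied files. With -q the
    # command does not fail just because some file really differs from
    # what the index records.
    (cd "$2" && git update-index -q --refresh)
}
```

For the tutorial project this would be run as

        copy_archive git-tutorial new-git-tutorial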

When copying a remote repository, you'll want to at a minimum update the
index cache when you do this, and especially with other people's
repositories you often want to make sure that the index cache is in some
known state (you don't know _what_ they've done and not yet checked in),
so usually you'll precede the "git-update-cache" with a "git-read-tree":

        git-read-tree --reset HEAD
        git-update-cache --refresh

which will force a total index re-build from the tree pointed to by HEAD
(it resets the index contents to HEAD, and then the git-update-cache
makes sure to match up all index entries with the checked-out files).

The above can also be written as simply

        git reset

and in fact a lot of the common git command combinations can be scripted
with the "git xyz" interfaces, and you can learn things by just looking
at what the git-*-script scripts do ("git reset" is the above two lines
implemented in "git-reset-script", but some things like "git status" and
"git commit" are slightly more complex scripts around the basic git
commands).

NOTE! Many (most?) public remote repositories will not contain any of
the checked out files or even an index file, and will _only_ contain the
actual core git files. Such a repository usually doesn't even have the
".git" subdirectory, but has all the git files directly in the
repository.

To create your own local live copy of such a "raw" git repository, you'd
first create your own subdirectory for the project, and then copy the
raw repository contents into the ".git" directory. For example, to
create your own copy of the git repository, you'd do the following

        mkdir my-git
        cd my-git
        rsync -rL rsync://rsync.kernel.org/pub/scm/git/git.git/ .git

followed by

        git-read-tree HEAD

to populate the index.
However, now you have populated the index, and
you have all the git internal files, but you will notice that you don't
actually have any of the _working_directory_ files to work on. To get
those, you'd check them out with

        git-checkout-cache -u -a

where the "-u" flag means that you want the checkout to keep the index
up-to-date (so that you don't have to refresh it afterward), and the
"-a" flag means "check out all files" (if you have a stale copy or an
older version of a checked out tree you may also need to add the "-f"
flag first, to tell git-checkout-cache to _force_ overwriting of any old
files).

Again, this can all be simplified with

        git clone rsync://rsync.kernel.org/pub/scm/git/git.git/ my-git
        cd my-git
        git checkout

which will end up doing all of the above for you.

You have now successfully copied somebody else's (mine) remote
repository, and checked it out.


Creating a new branch
---------------------

Branches in git are really nothing more than pointers into the git
object space from within the ".git/refs/" subdirectory, and as we
already discussed, the HEAD branch is nothing but a symlink to one of
these object pointers.

You can at any time create a new branch by just picking an arbitrary
point in the project history, and just writing the SHA1 name of that
object into a file under .git/refs/heads/. You can use any filename you
want (and indeed, subdirectories), but the convention is that the
"normal" branch is called "master". That's just a convention, though,
and nothing enforces it.

To show that as an example, let's go back to the git-tutorial archive we
used earlier, and create a branch in it. You do that by simply just
saying that you want to check out a new branch:

        git checkout -b mybranch

will create a new branch based at the current HEAD position, and switch
to it.
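
[ Side note: the "writing the SHA1 name into a file" description above
  can be taken quite literally, as the sketch below tries to show. It is
  an illustration only: the function name "make_branch_by_hand" is made
  up, "git rev-parse" is not otherwise used in this tutorial, and the
  sketch assumes that it is run from the top of the working tree of a
  repository that keeps its branches as plain files under
  .git/refs/heads/. ]

```sh
# make_branch_by_hand NAME [START]: create branch NAME at START (default
# HEAD) by writing the ref file directly, without switching to it.
# (Hypothetical helper for illustration.)
make_branch_by_hand() {
    name=$1
    start=${2:-HEAD}
    # Turn the starting point into the 40-character name of a commit.
    sha1=$(git rev-parse --verify "${start}^{commit}") || return 1
    # Branch names may contain subdirectories, so create them if needed.
    mkdir -p "$(dirname ".git/refs/heads/$name")" || return 1
    echo "$sha1" >".git/refs/heads/$name"
}
```

Unlike "git checkout -b", this only creates the pointer: HEAD and the
working tree are left exactly as they were.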

[ Side note: if you make the decision to start your new branch at some
  other point in the history than the current HEAD, you can do so by
  just telling "git checkout" what the base of the checkout would be.
  In other words, if you have an earlier tag or branch, you'd just do

        git checkout -b mybranch earlier-branch

  and it would create the new branch "mybranch" at the earlier point,
  and check out the state at that time. ]

You can always just jump back to your original "master" branch by doing

        git checkout master

(or any other branch-name, for that matter) and if you forget which
branch you happen to be on, a simple

        ls -l .git/HEAD

will tell you where it's pointing.

NOTE! Sometimes you may wish to create a new branch _without_ actually
checking it out and switching to it. If so, just use the command

        git branch <branchname> [startingpoint]

which will simply _create_ the branch, but will not do anything further.
You can then later - once you decide that you want to actually develop
on that branch - switch to that branch with a regular "git checkout"
with the branchname as the argument.


Merging two branches
--------------------

One of the ideas of having a branch is that you do some (possibly
experimental) work in it, and eventually merge it back to the main
branch. So assuming you created the above "mybranch" that started out
being the same as the original "master" branch, let's make sure we're in
that branch, and do some work there.

        git checkout mybranch
        echo "Work, work, work" >>hello
        git commit hello

Here, we just added another line to "hello", and we used a shorthand for
both doing a "git-update-cache hello" and "git commit" by just giving the
filename directly to "git commit".

Now, to make it a bit more interesting, let's assume that somebody else
does some work in the original branch, and simulate that by going back
to the master branch, and editing the same file differently there:

        git checkout master

Here, take a moment to look at the contents of "hello", and notice how they
don't contain the work we just did in "mybranch" - because that work
hasn't happened in the "master" branch at all. Then do

        echo "Play, play, play" >>hello
        echo "Lots of fun" >>example
        git commit hello example

since the master branch is obviously in a much better mood.

Now, you've got two branches, and you decide that you want to merge the
work done. Before we do that, let's introduce a cool graphical tool that
helps you view what's going on:

        gitk --all

will show you graphically both of your branches (that's what the "--all"
means: normally it will just show you your current HEAD) and their
histories. You can also see exactly how they came to be from a common
source.

Anyway, let's exit gitk (^Q or the File menu), and decide that we want
to merge the work we did on the "mybranch" branch into the "master"
branch (which is currently our HEAD too). To do that, there's a nice
script called "git resolve", which wants to know which branches you want
to resolve and what the merge is all about:

        git resolve HEAD mybranch "Merge work in mybranch"

where the third argument is going to be used as the commit message if
the merge can be resolved automatically.

Now, in this case we've intentionally created a situation where the
merge will need to be fixed up by hand, though, so git will do as much
of it as it can automatically (which in this case is just merging the
"example" file, which had no differences in the "mybranch" branch), and
say:

        Simple merge failed, trying Automatic merge
        Auto-merging hello.
	merge: warning: conflicts during merge
	ERROR: Merge conflict in hello.
	fatal: merge program failed
	Automatic merge failed, fix up by hand

which is way too verbose, but it basically tells you that it failed the
really trivial merge ("Simple merge") and did an "Automatic merge"
instead, but that too failed due to conflicts in "hello".

Not to worry. It left the (trivial) conflict in "hello" in the same form you
should already be well used to if you've ever used CVS, so let's just
open "hello" in our editor (whatever that may be), and fix it up somehow.
I'd suggest just making it so that "hello" contains all four lines:

	Hello World
	It's a new day for git
	Play, play, play
	Work, work, work

and once you're happy with your manual merge, just do a

	git commit hello

which will very loudly warn you that you're now committing a merge
(which is correct, so never mind), and you can write a small merge
message about your adventures in git-merge-land.

After you're done, start up "gitk --all" to see graphically what the
history looks like. Notice that "mybranch" still exists, and you can
switch to it, and continue to work with it if you want to. The
"mybranch" branch will not contain the merge, but next time you merge it
from the "master" branch, git will know how you merged it, so you'll not
have to do _that_ merge again.


Merging external work
---------------------

It's usually much more common that you merge with somebody else than
merging with your own branches, so it's worth pointing out that git
makes that very easy too, and in fact, it's not that different from
doing a "git resolve". In fact, a remote merge ends up being nothing
more than "fetch the work from a remote repository into a temporary tag"
followed by a "git resolve".
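
To see the two halves of a remote merge separately, here is a small
sketch. It assumes a present-day git, where "git merge" plays the part
that "git resolve" plays in this text and where the fetched work is
recorded as FETCH_HEAD rather than in a temporary tag. The repository
names "upstream" and "mine", the file "notes" and the commit identities
are made up for illustration.

```shell
#!/bin/sh
# Sketch only: present-day git commands, hypothetical repository names.
set -e
top=$(mktemp -d)
cd "$top"

# "upstream" plays the part of somebody else's repository.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Upstream Author"
git config user.email "upstream@example.com"
echo "Hello World" >hello
git add hello
git commit -q -m "Initial commit"
cd ..

# "mine" starts out as a copy, then both sides do some work.
git clone -q upstream mine
cd upstream
echo "Lots of fun" >example
git add example
git commit -q -m "Add example"
cd ../mine
git config user.name "Tutorial User"
git config user.email "tutorial@example.com"
echo "My notes" >notes
git add notes
git commit -q -m "Add notes"

# First half: fetch the remote work; git records it as FETCH_HEAD.
git fetch -q ../upstream master
# Second half: merge what was fetched into the current branch.
git merge -q -m "Merge work from upstream" FETCH_HEAD

# A real merge commit has two parents.
parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents of the merge commit: $parents"
ls
```

Because the two sides touched different files, this merge resolves
automatically and no hand-fixing is needed.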

It's such a common thing to do that it's called "git pull", and you can
simply do

	git pull <remote-repository>

and optionally give a branch-name for the remote end as a second
argument.

The "remote" repository can even be on the same machine. One of
the following notations can be used to name the repository to
pull from:

	Rsync URL
		rsync://remote.machine/path/to/repo.git/

	HTTP(s) URL
		http://remote.machine/path/to/repo.git/

	GIT URL
		git://remote.machine/path/to/repo.git/

	SSH URL
		remote.machine:/path/to/repo.git/

	Local directory
		/path/to/repo.git/

[ Digression: you could do without using any branches at all, by
  keeping as many local repositories as you would like to have
  branches, and merging between them with "git pull", just like
  you merge between branches. The advantage of this approach is
  that it lets you keep a set of files for each "branch" checked
  out and you may find it easier to switch back and forth if you
  juggle multiple lines of development simultaneously. Of
  course, you will pay the price of more disk usage to hold
  multiple working trees, but disk space is cheap these days. ]

It is likely that you will be pulling from the same remote
repository from time to time. As a shorthand, you can store
the remote repository URL in a file under the .git/branches/
directory, like this:

	mkdir -p .git/branches
	echo rsync://kernel.org/pub/scm/git/git.git/ \
	    >.git/branches/linus

and use the filename to "git pull" instead of the full URL.
The contents of a file under .git/branches can even be a prefix
of a full URL, like this:

	echo rsync://kernel.org/pub/.../jgarzik/ \
	    >.git/branches/jgarzik

Examples.

	(1) git pull linus
	(2) git pull linus tag v0.99.1
	(3) git pull jgarzik/netdev-2.6.git/ e100

the above are equivalent to:

	(1) git pull rsync://kernel.org/pub/scm/git/git.git/ HEAD
	(2) git pull rsync://kernel.org/pub/scm/git/git.git/ tag v0.99.1
	(3) git pull rsync://kernel.org/pub/.../jgarzik/netdev-2.6.git e100


Publishing your work
--------------------

So we can use somebody else's work from a remote repository; but
how can _you_ prepare a repository to let other people pull from
it?

You do your real work in your working directory that has your
primary repository hanging under it as its ".git" subdirectory.
You _could_ make that repository accessible remotely and ask
people to pull from it, but in practice that is not the way
things are usually done. A recommended way is to have a public
repository, make it reachable by other people, and when the
changes you made in your primary working directory are in good
shape, update the public repository from it. This is often
called "pushing".

[ Side note: this public repository could further be mirrored,
  and that is how kernel.org git repositories are done. ]

Publishing the changes from your local (private) repository to
your remote (public) repository requires a write privilege on
the remote machine. You need to have an SSH account there to
run a single command, "git-receive-pack".

First, you need to create an empty repository on the remote
machine that will house your public repository. This empty
repository will be populated and be kept up-to-date by pushing
into it later. Obviously, this repository creation needs to be
done only once.

[ Digression: "git push" uses a pair of programs,
  "git-send-pack" on your local machine, and "git-receive-pack"
  on the remote machine. The communication between the two over
  the network internally uses an SSH connection.
  ]

Your private repository's GIT directory is usually .git, but
your public repository is often named after the project name,
i.e. "<project>.git". Let's create such a public repository for
project "my-git". After logging into the remote machine, create
an empty directory:

	mkdir my-git.git

Then, make that directory into a GIT repository by running
git-init-db, but this time, since its name is not the usual
".git", we do things slightly differently:

	GIT_DIR=my-git.git git-init-db

Make sure this directory is available to the people you want to
pull your changes, via the transport of your choice. Also
you need to make sure that you have the "git-receive-pack"
program on the $PATH.

[ Side note: many installations of sshd do not invoke your shell
  as the login shell when you directly run programs; what this
  means is that if your login shell is bash, only .bashrc is
  read and not .bash_profile. As a workaround, make sure
  .bashrc sets up $PATH so that you can run the 'git-receive-pack'
  program. ]

Your "public repository" is now ready to accept your changes.
Come back to the machine that has your private repository. From
there, run this command:

	git push <public-host>:/path/to/my-git.git master

This synchronizes your public repository to match the named
branch head (i.e. "master" in this case) and the objects reachable
from it in your current repository.

As a real example, this is how I update my public git
repository. Kernel.org mirror network takes care of the
propagation to other publicly visible machines:

	git push master.kernel.org:/pub/scm/git/git.git/


[ Digression: your GIT "public" repository people can pull from
  is different from a public CVS repository that gives read-write
  access to multiple developers.
  It is a copy of _your_ primary
  repository published for others to use, and you should not
  push into it from more than one repository (this means, not
  just disallowing other developers to push into it, but also
  you should push into it from a single repository of yours).
  Sharing the result of work done by multiple people is always
  done by pulling (i.e. fetching and merging) from public
  repositories of those people. Typically this is done by the
  "project lead" person, and the resulting repository is
  published as the public repository of the "project lead" for
  everybody to base further changes on. ]


Packing your repository
-----------------------

Earlier, we saw that one file under .git/objects/??/ directory
is stored for each git object you create. This representation
is convenient and efficient to create atomically and safely, but
not so convenient to transport over the network. Since git objects
are immutable once they are created, there is a way to optimize the
storage by "packing them together". The command

	git repack

will do it for you. If you followed the tutorial examples, you
would have accumulated about 17 objects in .git/objects/??/
directories by now. "git repack" tells you how many objects it
packed, and stores the packed file in .git/objects/pack
directory.

[ Side Note: you will see two files, pack-*.pack and pack-*.idx,
  in .git/objects/pack directory. They are closely related to
  each other, and if you ever copy them by hand to a different
  repository for whatever reason, you should make sure you copy
  them together. The former holds all the data from the objects
  in the pack, and the latter holds the index for random
  access. ]

If you are paranoid, running "git-verify-pack" command would
detect if you have a corrupt pack, but do not worry too much.
Our programs are always perfect ;-).
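
If you would rather watch the packing happen on a scratch repository
than on your real one, the following sketch does that. It assumes a
present-day git ("git init", "git add", "git count-objects" and
"git verify-pack" spelled with a space), and the scratch directory and
commit identity are made up for illustration. The object counts it
prints depend on what you commit, so do not expect the 17 mentioned
above.

```shell
#!/bin/sh
# Sketch only: present-day git commands in a throwaway repository.
set -e
repo=$(mktemp -d)                       # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Tutorial User"    # placeholder identity
git config user.email "tutorial@example.com"

echo "Hello World" >hello
git add hello
git commit -q -m "Initial commit"
echo "It's a new day for git" >>hello
git commit -q -m "Second commit" hello

# Loose objects are the individual files under .git/objects/??/.
loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')

# Pack everything that is not packed yet into .git/objects/pack/.
git repack -q

in_pack=$(git count-objects -v | awk '/^in-pack:/ {print $2}')
echo "loose objects before packing: $loose_before"
echo "objects in the pack:          $in_pack"

# Check the pack through its index file; this fails if the pack is corrupt.
git verify-pack .git/objects/pack/pack-*.idx && verified=yes
echo "pack verified: $verified"
```

Note that the loose object files are still there after "git repack";
the pack holds a second copy of them.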

Once you have packed objects, you do not need to leave the
unpacked objects that are contained in the pack file anymore.

	git prune-packed

would remove them for you.

You can try running "find .git/objects -type f" before and after
you run "git prune-packed" if you are curious.

[ Side Note: "git pull" is slightly cumbersome for HTTP transport,
  as a packed repository may contain relatively few objects in a
  relatively large pack. If you expect many HTTP pulls from your
  public repository you might want to repack & prune often, or
  never. ]

If you run "git repack" again at this point, it will say
"Nothing to pack". Once you continue your development and
accumulate the changes, running "git repack" again will create a
new pack, that contains objects created since you packed your
archive the last time. We recommend that you pack your project
soon after the initial import (unless you are starting your
project from scratch), and then run "git repack" every once in a
while, depending on how active your project is.

When a repository is synchronized via "git push" and "git pull",
objects packed in the source repository are usually stored
unpacked in the destination, unless rsync transport is used.


Working with Others
-------------------

Although git is a truly distributed system, it is often
convenient to organize your project with an informal hierarchy
of developers. Linux kernel development is run this way. There
is a nice illustration (page 17, "Merges to Mainline") in Randy
Dunlap's presentation (http://tinyurl.com/a2jdg).

It should be stressed that this hierarchy is purely "informal".
There is nothing fundamental in git that enforces the "chain of
patch flow" this hierarchy implies.
You do not have to pull
from only one remote repository.


A recommended workflow for a "project lead" goes like this:

 (1) Prepare your primary repository on your local machine. Your
     work is done there.

 (2) Prepare a public repository accessible to others.

 (3) Push into the public repository from your primary
     repository.

 (4) "git repack" the public repository, and possibly "git
     prune-packed" it if the transport used for pulling from
     your repository supports packed repositories. This
     establishes a big pack that contains the initial set of
     objects as the baseline.

 (5) Keep working in your primary repository. Your changes
     include modifications of your own, patches you receive via
     e-mails, and merges resulting from pulling the "public"
     repositories of your "subsystem maintainers".

     You can repack this private repository whenever you feel
     like.

 (6) Push your changes to the public repository, and announce it
     to the public.

 (7) Every once in a while, "git repack" the public repository.
     Go back to step (5) and continue working.


A recommended work cycle for a "subsystem maintainer" who works
on that project and has their own "public repository" goes like this:

 (1) Prepare your work repository, by running "git clone" on the
     public repository of the "project lead". The URL used for the
     initial cloning is stored in .git/branches/origin.

 (2) Prepare a public repository accessible to others.

 (3) Copy over the packed files from the "project lead" public
     repository to your public repository by hand; preferably
     use rsync for that task.

 (4) Push into the public repository from your primary
     repository. Run "git repack", and possibly "git
     prune-packed" if the transport used for pulling from your
     repository supports packed repositories.

 (5) Keep working in your primary repository.
     Your changes
     include modifications of your own, patches you receive via
     e-mails, and merges resulting from pulling the "public"
     repositories of your "project lead" and possibly your
     "sub-subsystem maintainers".

     You can repack this private repository whenever you feel
     like.

 (6) Push your changes to your public repository, and ask your
     "project lead" and possibly your "sub-subsystem
     maintainers" to pull from it.

 (7) Every once in a while, "git repack" the public repository.
     Go back to step (5) and continue working.


A recommended work cycle for an "individual developer" who does
not have a "public" repository is somewhat different. It goes
like this:

 (1) Prepare your work repository, by running "git clone" on the
     public repository of the "project lead" (or a "subsystem
     maintainer", if you work on a subsystem). The URL used for
     the initial cloning is stored in .git/branches/origin.

 (2) Do your work there. Make commits.

 (3) Run "git fetch origin" every once in a while to fetch from
     the public repository of your upstream. This does only the
     first half of "git pull" and does not merge. The head of the
     public repository is stored in .git/refs/heads/origin.

 (4) Use "git cherry origin" to see which ones of your patches
     were accepted, and/or use "git rebase origin" to port your
     unmerged changes forward to the updated upstream.

 (5) Use "git format-patch origin" to prepare patches for e-mail
     submission to your upstream and send them out. Go back to
     step (2) and continue.


[ to be continued.. cvsimports ]