did, and why.
Every commit has a 40-hexdigit id, sometimes called the "object name" or the
-"SHA1 id", shown on the first line of the "git-show" output. You can usually
+"SHA-1 id", shown on the first line of the "git show" output. You can usually
refer to a commit by a shorter name, such as a tag or a branch name, but this
longer name can also be useful. Most importantly, it is a globally unique
name for this commit: so if you tell somebody else the object name (for
Examining an old version without creating a new branch
------------------------------------------------------
-The git-checkout command normally expects a branch head, but will also
+The `git checkout` command normally expects a branch head, but will also
accept an arbitrary commit; for example, you can check out the commit
referenced by a tag:
HEAD is now at 427abfa... Linux v2.6.17
------------------------------------------------
-The HEAD then refers to the SHA1 of the commit instead of to a branch,
+The HEAD then refers to the SHA-1 of the commit instead of to a branch,
and git branch shows that you are no longer on a branch:
------------------------------------------------
REVISIONS" section of linkgit:git-rev-parse[1].
[[Updating-a-repository-With-git-fetch]]
-Updating a repository with git-fetch
+Updating a repository with git fetch
------------------------------------
Eventually the developer cloned from will do additional work in her
-------------------------------------------------
New remote-tracking branches will be stored under the shorthand name
-that you gave "git-remote add", in this case linux-nfs:
+that you gave "git remote add", in this case linux-nfs:
-------------------------------------------------
$ git branch -r
to return you to the branch you were on before.
-Note that the version which git-bisect checks out for you at each
+Note that the version which `git bisect` checks out for you at each
point is just a suggestion, and you're free to try a different
version if you think it would be a good idea. For example,
occasionally you may land on a commit that broke something unrelated;
commits:
Merges (to be discussed later), as well as operations such as
-git-reset, which change the currently checked-out commit, generally
+`git reset`, which change the currently checked-out commit, generally
set ORIG_HEAD to the value HEAD had before the current operation.
-The git-fetch operation always stores the head of the last fetched
-branch in FETCH_HEAD. For example, if you run git fetch without
+The `git fetch` operation always stores the head of the last fetched
+branch in FETCH_HEAD. For example, if you run `git fetch` without
specifying a local branch as the target of the operation
-------------------------------------------------
-------------------------------------------------
Alternatively, you may often see this sort of thing done with the
-lower-level command linkgit:git-rev-list[1], which just lists the SHA1's
+lower-level command linkgit:git-rev-list[1], which just lists the SHA-1's
of all the given commits:
-------------------------------------------------
If you have some initial content (say, a tarball):
-------------------------------------------------
-$ tar -xzvf project.tar.gz
+$ tar xzvf project.tar.gz
$ cd project
$ git init
$ git add . # include everything below ./ in the first commit:
shows the difference between the working tree and the index file.
-Note that "git-add" always adds just the current contents of a file
+Note that "git add" always adds just the current contents of a file
to the index; further changes to the same file will be ignored unless
-you run git-add on the file again.
+you run `git add` on the file again.
When you're ready, just run
A project will often generate files that you do 'not' want to track with git.
This typically includes files generated by a build process or temporary
backup files made by your editor. Of course, 'not' tracking files with git
-is just a matter of 'not' calling "`git-add`" on them. But it quickly becomes
+is just a matter of 'not' calling `git add` on them. But it quickly becomes
annoying to have these untracked files lying around; e.g. they make
-"`git add .`" practically useless, and they keep showing up in the output of
-"`git status`".
+`git add .` practically useless, and they keep showing up in the output of
+`git status`.
You can tell git to ignore certain files by creating a file called .gitignore
in the top level of your working directory, with contents such as:
-------------------------------------------------
the different stages of that file will be "collapsed", after which
-git-diff will (by default) no longer show diffs for that file.
+`git diff` will (by default) no longer show diffs for that file.
[[undoing-a-merge]]
Undoing a merge
If the problematic commit is the most recent commit, and you have not
yet made that commit public, then you may just
-<<undoing-a-merge,destroy it using git-reset>>.
+<<undoing-a-merge,destroy it using `git reset`>>.
Alternatively, you
can edit the working directory and update the index to fix your
In the process of undoing a previous bad change, you may find it
useful to check out an older version of a particular file using
-linkgit:git-checkout[1]. We've used git-checkout before to switch
+linkgit:git-checkout[1]. We've used `git checkout` before to switch
branches, but it has quite different behavior if it is given a path
name: the command
work-in-progress changes.
------------------------------------------------
-$ git stash "work in progress for foo feature"
+$ git stash save "work in progress for foo feature"
------------------------------------------------
This command will save your changes away to the `stash`, and
------------------------------------------------
After that, you can go back to what you were working on with
-`git stash apply`:
+`git stash pop`:
------------------------------------------------
-$ git stash apply
+$ git stash pop
------------------------------------------------
-------------------------------------------------
to recompress the archive. This can be very time-consuming, so
-you may prefer to run git-gc when you are not doing other work.
+you may prefer to run `git gc` when you are not doing other work.
[[ensuring-reliability]]
suppose you delete a branch, then realize you need the history it
contained. The reflog is also deleted; however, if you have not yet
pruned the repository, then you may still be able to find the lost
-commits in the dangling objects that git-fsck reports. See
+commits in the dangling objects that `git fsck` reports. See
<<dangling-objects>> for the details.
-------------------------------------------------
===============================
[[getting-updates-With-git-pull]]
-Getting updates with git-pull
+Getting updates with git pull
-----------------------------
After you clone a repository and make a few changes of your own, you
<<fast-forwards,fast forward>>; instead, your branch will just be
updated to point to the latest commit from the upstream branch.)
-The git-pull command can also be given "." as the "remote" repository,
+The `git pull` command can also be given "." as the "remote" repository,
in which case it just merges in a branch from the current repository; so
the commands
Another way to submit changes to a project is to tell the maintainer
of that project to pull the changes from your repository using
linkgit:git-pull[1]. In the section "<<getting-updates-With-git-pull,
-Getting updates with git-pull>>" we described this as a way to get
+Getting updates with `git pull`>>" we described this as a way to get
updates from the "main" repository, but it works just as well in the
other direction.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Assume your personal repository is in the directory ~/proj. We
-first create a new clone of the repository and tell git-daemon that it
+first create a new clone of the repository and tell `git daemon` that it
is meant to be public:
-------------------------------------------------
Otherwise, all you need to do is start linkgit:git-daemon[1]; it will
listen on port 9418. By default, it will allow access to any directory
that looks like a git directory and contains the magic file
-git-daemon-export-ok. Passing some directory paths as git-daemon
+git-daemon-export-ok. Passing some directory paths as `git daemon`
arguments will further restrict the exports to those paths.
-You can also run git-daemon as an inetd service; see the
+You can also run `git daemon` as an inetd service; see the
linkgit:git-daemon[1] man page for details. (See especially the
examples section.)
$ git push ssh://yourserver.com/~you/proj.git master
-------------------------------------------------
-As with git-fetch, git-push will complain if this does not result in a
+As with `git fetch`, `git push` will complain if this does not result in a
<<fast-forwards,fast forward>>; see the following section for details on
handling this case.
will not be updated by the push. This may lead to unexpected results if
the branch you push to is the currently checked-out branch!
-As with git-fetch, you may also set up configuration options to
+As with `git fetch`, you may also set up configuration options to
save typing; so, for example, after
-------------------------------------------------
This can happen, for example, if you:
- - use `git-reset --hard` to remove already-published commits, or
- - use `git-commit --amend` to replace already-published commits
+ - use `git reset --hard` to remove already-published commits, or
+ - use `git commit --amend` to replace already-published commits
(as in <<fixing-a-mistake-by-rewriting-history>>), or
- - use `git-rebase` to rebase any already-published commits (as
+ - use `git rebase` to rebase any already-published commits (as
in <<using-git-rebase>>).
-You may force git-push to perform the update anyway by preceding the
+You may force `git push` to perform the update anyway by preceding the
branch name with a plus sign:
-------------------------------------------------
- Git's ability to quickly import and merge patches allows a
single maintainer to process incoming changes even at very
- high rates. And when that becomes too much, git-pull provides
+ high rates. And when that becomes too much, `git pull` provides
an easy way for that maintainer to delegate this job to other
maintainers while still allowing optional review of incoming
changes.
you are rewriting history.
[[using-git-rebase]]
-Keeping a patch series up to date using git-rebase
+Keeping a patch series up to date using git rebase
--------------------------------------------------
Suppose that you create a branch "mywork" on a remote-tracking branch
................................................
In the process, it may discover conflicts. In that case it will stop
-and allow you to fix the conflicts; after fixing conflicts, use "git-add"
+and allow you to fix the conflicts; after fixing conflicts, use `git add`
to update the index with those contents, and then, instead of
-running git-commit, just run
+running `git commit`, just run
-------------------------------------------------
$ git rebase --continue
$ git tag bad mywork~5
-------------------------------------------------
-(Either gitk or git-log may be useful for finding the commit.)
+(Either gitk or `git log` may be useful for finding the commit.)
Then check out that commit, edit it, and rebase the rest of the series
on top of it (note that we could check out the commit on a temporary
and browse through the list of patches in the mywork branch using gitk,
applying them (possibly in a different order) to mywork-new using
-cherry-pick, and possibly modifying them as you go using `commit --amend`.
+cherry-pick, and possibly modifying them as you go using `git commit --amend`.
The linkgit:git-gui[1] command may also help as it allows you to
individually select diff hunks for inclusion in the index (by
right-clicking on the diff hunk and choosing "Stage Hunk for Commit").
-Another technique is to use git-format-patch to create a series of
+Another technique is to use `git format-patch` to create a series of
patches, then reset the state to before the patches:
-------------------------------------------------
linkgit:git-bisect[1] identifies C as the culprit, how will you
figure out that the problem is due to this change in semantics?
-When the result of a git-bisect is a non-merge commit, you should
+When the result of a `git bisect` is a non-merge commit, you should
normally be able to discover the problem by examining just that commit.
Developers can make this easy by breaking their changes into small
self-contained commits. That won't help in the case above, however,
git fetch and fast-forwards
---------------------------
-In the previous example, when updating an existing branch, "git-fetch"
+In the previous example, when updating an existing branch, "git fetch"
checks to make sure that the most recent commit on the remote
branch is a descendant of the most recent commit on your copy of the
branch before updating your copy of the branch to point at the new
o--o--o <-- new head of the branch
................................................
-In this case, "git-fetch" will fail, and print out a warning.
+In this case, "git fetch" will fail, and print out a warning.
In that case, you can still force git to update to the new head, as
described in the following section. However, note that in the
them.
[[forcing-fetch]]
-Forcing git-fetch to do non-fast-forward updates
+Forcing git fetch to do non-fast-forward updates
------------------------------------------------
If git fetch fails because the new head of a branch is not a
We already saw in <<understanding-commits>> that all commits are stored
under a 40-digit "object name". In fact, all the information needed to
represent the history of a project is stored in objects with such names.
-In each case the name is calculated by taking the SHA1 hash of the
-contents of the object. The SHA1 hash is a cryptographic hash function.
+In each case the name is calculated by taking the SHA-1 hash of the
+contents of the object. The SHA-1 hash is a cryptographic hash function.
What that means to us is that it is impossible to find two different
objects with the same name. This has a number of advantages; among
others:
same content stored in two repositories will always be stored under
the same name.
- Git can detect errors when it reads an object, by checking that the
- object's name is still the SHA1 hash of its contents.
+ object's name is still the SHA-1 hash of its contents.
(See <<object-details>> for the details of the object formatting and
-SHA1 calculation.)
+SHA-1 calculation.)
There are four different types of objects: "blob", "tree", "commit", and
"tag".
As you can see, a commit is defined by:
-- a tree: The SHA1 name of a tree object (as defined below), representing
+- a tree: The SHA-1 name of a tree object (as defined below), representing
the contents of a directory at a certain point in time.
-- parent(s): The SHA1 name of some number of commits which represent the
+- parent(s): The SHA-1 name of some number of commits which represent the
immediately previous step(s) in the history of the project. The
example above has one parent; merge commits may have more than
one. A commit with no parents is called a "root" commit, and
------------------------------------------------
As you can see, a tree object contains a list of entries, each with a
-mode, object type, SHA1 name, and name, sorted by name. It represents
+mode, object type, SHA-1 name, and name, sorted by name. It represents
the contents of a single directory tree.
The object type may be a blob, representing the contents of a file, or
another tree, representing the contents of a subdirectory. Since trees
-and blobs, like all other objects, are named by the SHA1 hash of their
-contents, two trees have the same SHA1 name if and only if their
+and blobs, like all other objects, are named by the SHA-1 hash of their
+contents, two trees have the same SHA-1 name if and only if their
contents (including, recursively, the contents of all subdirectories)
are identical. This allows git to quickly determine the differences
between two related tree objects, since it can ignore any entries with
Trust
~~~~~
-If you receive the SHA1 name of a blob from one source, and its contents
+If you receive the SHA-1 name of a blob from one source, and its contents
from another (possibly untrusted) source, you can still trust that those
-contents are correct as long as the SHA1 name agrees. This is because
-the SHA1 is designed so that it is infeasible to find different contents
+contents are correct as long as the SHA-1 name agrees. This is because
+the SHA-1 is designed so that it is infeasible to find different contents
that produce the same hash.
-Similarly, you need only trust the SHA1 name of a top-level tree object
+Similarly, you need only trust the SHA-1 name of a top-level tree object
to trust the contents of the entire directory that it refers to, and if
-you receive the SHA1 name of a commit from a trusted source, then you
+you receive the SHA-1 name of a commit from a trusted source, then you
can easily verify the entire history of commits reachable through
parents of that commit, and all of those contents of the trees referred
to by those commits.
commits tells others that they can trust the whole history.
In other words, you can easily validate a whole archive by just
-sending out a single email that tells the people the name (SHA1 hash)
+sending out a single email that tells the people the name (SHA-1 hash)
of the top commit, and digitally sign that email using something
like GPG/PGP.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Newly created objects are initially created in a file named after the
-object's SHA1 hash (stored in .git/objects).
+object's SHA-1 hash (stored in .git/objects).
Unfortunately this system becomes inefficient once a project has a
lot of objects. Try this on an old project:
to remove any of the "loose" objects that are now contained in the
pack. This will also remove any unreferenced objects (which may be
-created when, for example, you use "git-reset" to remove a commit).
+created when, for example, you use "git reset" to remove a commit).
You can verify that the loose objects are gone by looking at the
.git/objects directory or by running
pointer itself just doesn't, since you replaced it with another one.
There are also other situations that cause dangling objects. For
-example, a "dangling blob" may arise because you did a "git-add" of a
+example, a "dangling blob" may arise because you did a "git add" of a
file, but then, before you actually committed it and made it part of the
bigger picture, you changed something else in that file and committed
that *updated* thing--the old state that you added originally ends up
almost always the result of either being a half-way mergebase (the blob
will often even have the conflict markers from a merge in it, if you
have had conflicting merges that you fixed up by hand), or simply
-because you interrupted a "git-fetch" with ^C or something like that,
+because you interrupted a "git fetch" with ^C or something like that,
leaving _some_ of the new objects in the object database, but just
dangling and useless.
repository--it's kind of like doing a filesystem fsck recovery: you
don't want to do that while the filesystem is mounted.
-(The same is true of "git-fsck" itself, btw, but since
-git-fsck never actually *changes* the repository, it just reports
-on what it found, git-fsck itself is never "dangerous" to run.
+(The same is true of "git fsck" itself, btw, but since
+`git fsck` never actually *changes* the repository, it just reports
+on what it found, `git fsck` itself is never 'dangerous' to run.
Running it while somebody is actually changing the repository can cause
confusing and scary messages, but it won't actually do anything bad. In
contrast, running "git prune" while somebody is actively changing the
------------------------------------------------
which will create and store a blob object with the contents of
-somedirectory/myfile, and output the sha1 of that object. if you're
+somedirectory/myfile, and output the SHA-1 of that object. if you're
extremely lucky it might be 4b9458b3786228369c63936db65827de3cc06200, in
which case you've guessed right, and the corruption is fixed!
-----------
The index is a binary file (generally kept in .git/index) containing a
-sorted list of path names, each with permissions and the SHA1 of a blob
+sorted list of path names, each with permissions and the SHA-1 of a blob
object; linkgit:git-ls-files[1] can show you the contents of the index:
-------------------------------------------------
NOTE: Do not use local URLs here if you plan to publish your superproject!
-See what files `git-submodule` created:
+See what files `git submodule` created:
-------------------------------------------------
$ ls -a
. .. .git .gitmodules a b c d
-------------------------------------------------
-The `git-submodule add <repo> <path>` command does a couple of things:
+The `git submodule add <repo> <path>` command does a couple of things:
- It clones the submodule from <repo> to the given <path> under the
current directory and by default checks out the master branch.
$ git submodule init
-------------------------------------------------
-Now use `git-submodule update` to clone the repositories and check out the
+Now use `git submodule update` to clone the repositories and check out the
commits specified in the superproject:
-------------------------------------------------
. .. .git a.txt
-------------------------------------------------
-One major difference between `git-submodule update` and `git-submodule add` is
-that `git-submodule update` checks out a specific commit, rather than the tip
+One major difference between `git submodule update` and `git submodule add` is
+that `git submodule update` checks out a specific commit, rather than the tip
of a branch. It's like checking out a tag: the head is detached, so you're not
working on a branch.
index. Normal operation is just
-------------------------------------------------
-$ git read-tree <sha1 of tree>
+$ git read-tree <SHA-1 of tree>
-------------------------------------------------
and your index file will now be equivalent to the tree that you saved
files. This is not a very common operation, since normally you'd just
keep your files updated, and rather than write to your working
directory, you'd tell the index files about the changes in your
-working directory (i.e. `git-update-index`).
+working directory (i.e. `git update-index`).
However, if you decide to jump to a new version, or check out somebody
else's version, or just restore a previous tree, you'd populate your
or, if you want to check out all of the index, use `-a`.
-NOTE! git-checkout-index normally refuses to overwrite old files, so
+NOTE! `git checkout-index` normally refuses to overwrite old files, so
if you have an old version of the tree already checked out, you will
need to use the "-f" flag ('before' the "-a" flag or the filename) to
'force' the checkout.
and then giving the reason for the commit on stdin (either through
redirection from a pipe or file, or by just typing it at the tty).
-git-commit-tree will return the name of the object that represents
+`git commit-tree` will return the name of the object that represents
that commit, and you should save it away for later use. Normally,
you'd commit a new `HEAD` state, and while git doesn't care where you
save the note about that state, in practice we tend to just write the
to show its contents. NOTE! Trees have binary content, and as a result
there is a special helper for showing that content, called
-`git-ls-tree`, which turns the binary content into a more easily
+`git ls-tree`, which turns the binary content into a more easily
readable form.
It's especially instructive to look at "commit" objects, since those
------------------------------------------------
Each line of the `git ls-files --unmerged` output begins with
-the blob mode bits, blob SHA1, 'stage number', and the
+the blob mode bits, blob SHA-1, 'stage number', and the
filename. The 'stage number' is git's way to say which tree it
came from: stage 1 corresponds to `$orig` tree, stage 2 `HEAD`
tree, and stage3 `$target` tree.
Earlier we said that trivial merges are done inside
-`git-read-tree -m`. For example, if the file did not change
+`git read-tree -m`. For example, if the file did not change
from `$orig` to `HEAD` nor `$target`, or if the file changed
from `$orig` to `HEAD` and `$orig` to `$target` the same way,
obviously the final outcome is what is in `HEAD`. What the
$ git update-index hello.c
-------------------------------------------------
-When a path is in the "unmerged" state, running `git-update-index` for
+When a path is in the "unmerged" state, running `git update-index` for
that path tells git to mark the path resolved.
The above is the description of a git merge at the lowest level,
to help you understand what conceptually happens under the hood.
-In practice, nobody, not even git itself, runs `git-cat-file` three times
-for this. There is a `git-merge-index` program that extracts the
+In practice, nobody, not even git itself, runs `git cat-file` three times
+for this. There is a `git merge-index` program that extracts the
stages to temporary files and calls a "merge" script on it:
-------------------------------------------------
$ git merge-index git-merge-one-file hello.c
-------------------------------------------------
-and that is what higher level `git-merge -s resolve` is implemented with.
+and that is what higher level `git merge -s resolve` is implemented with.
[[hacking-git]]
Hacking git
Regardless of object type, all objects share the following
characteristics: they are all deflated with zlib, and have a header
that not only specifies their type, but also provides size information
-about the data in the object. It's worth noting that the SHA1 hash
+about the data in the object. It's worth noting that the SHA-1 hash
that is used to name the object is the hash of the original data
plus this header, so `sha1sum` 'file' does not match the object name
for 'file'.
(Historical note: in the dawn of the age of git the hash
-was the sha1 of the 'compressed' object.)
+was the SHA-1 of the 'compressed' object.)
As a result, the general consistency of an object can always be tested
independently of the contents or the type of the object: all objects can
The structured objects can further have their structure and
connectivity to other objects verified. This is generally done with
-the `git-fsck` program, which generates a full dependency graph
+the `git fsck` program, which generates a full dependency graph
of all objects, and verifies their internal consistency (in addition
to just verifying their superficial consistency through the hash).
This is just to get you into the groove for the most libified part of Git:
the revision walker.
-Basically, the initial version of `git-log` was a shell script:
+Basically, the initial version of `git log` was a shell script:
----------------------------------------------------------------
$ git-rev-list --pretty $(git-rev-parse --default HEAD "$@") | \
What does this mean?
-`git-rev-list` is the original version of the revision walker, which
+`git rev-list` is the original version of the revision walker, which
_always_ printed a list of revisions to stdout. It is still functional,
and needs to, since most new Git programs start out as scripts using
-`git-rev-list`.
+`git rev-list`.
-`git-rev-parse` is not as important any more; it was only used to filter out
+`git rev-parse` is not as important any more; it was only used to filter out
options that were relevant for the different plumbing commands that were
called by the script.
-Most of what `git-rev-list` did is contained in `revision.c` and
+Most of what `git rev-list` did is contained in `revision.c` and
`revision.h`. It wraps the options in a struct named `rev_info`, which
controls how and what revisions are walked, and more.
-The original job of `git-rev-parse` is now taken by the function
+The original job of `git rev-parse` is now taken by the function
`setup_revisions()`, which parses the revisions and the common command line
options for the revision walker. This information is stored in the struct
`rev_info` for later consumption. You can do your own command line option
`git show v1.3.0{tilde}155^2{tilde}4` and scroll down to that function (note that you
no longer need to call `setup_pager()` directly).
-Nowadays, `git-log` is a builtin, which means that it is _contained_ in the
+Nowadays, `git log` is a builtin, which means that it is _contained_ in the
command `git`. The source side of a builtin is
- a function called `cmd_<bla>`, typically defined in `builtin-<bla>.c`,
_not_ named like the `.c` file in which they live have to be listed in
`BUILT_INS` in the `Makefile`.
-`git-log` looks more complicated in C than it does in the original script,
+`git log` looks more complicated in C than it does in the original script,
but that allows for a much greater flexibility and performance.
Here again it is a good point to take a pause.
So, think about something which you are interested in, say, "how can I
access a blob just knowing the object name of it?". The first step is to
find a Git command with which you can do it. In this example, it is either
-`git-show` or `git-cat-file`.
+`git show` or `git cat-file`.
-For the sake of clarity, let's stay with `git-cat-file`, because it
+For the sake of clarity, let's stay with `git cat-file`, because it
- is plumbing, and
------------------------------------------------------------------
git_config(git_default_config);
if (argc != 3)
- usage("git-cat-file [-t|-s|-e|-p|<type>] <sha1>");
+ usage("git cat-file [-t|-s|-e|-p|<type>] <sha1>");
if (get_sha1(argv[2], sha1))
die("Not a valid object name %s", argv[2]);
------------------------------------------------------------------
-----------------------------------
Sometimes, you do not know where to look for a feature. In many such cases,
-it helps to search through the output of `git log`, and then `git-show` the
+it helps to search through the output of `git log`, and then `git show` the
corresponding commit.
-Example: If you know that there was some test case for `git-bundle`, but
+Example: If you know that there was some test case for `git bundle`, but
do not remember where it was (yes, you _could_ `git grep bundle t/`, but that
does not illustrate the point!):
- Whenever possible, section headings should clearly describe the task
they explain how to do, in language that requires no more knowledge
than necessary: for example, "importing patches into a project" rather
- than "the git-am command"
+ than "the `git am` command"
Think about how to create a clear chapter dependency graph that will
allow people to get to important topics without necessarily reading