If you run "git branch" at this point, you'll see that git has
temporarily moved you to a new branch named "bisect". This branch
points to a commit (with commit id 65934...) that is reachable from
-v2.6.19 but not from v2.6.18. Compile and test it, and see whether
+"master" but not from v2.6.18. Compile and test it, and see whether
it crashes. Assume it does crash. Then:
-------------------------------------------------
commit. You can find out with this:
-------------------------------------------------
-$ git log --raw --abbrev=40 --pretty=oneline -- filename |
+$ git log --raw --abbrev=40 --pretty=oneline |
grep -B 1 `git hash-object filename`
-------------------------------------------------
fundamentally different ways to fix the problem:
1. You can create a new commit that undoes whatever was done
- by the previous commit. This is the correct thing if your
+ by the old commit. This is the correct thing if your
mistake has already been made public.
2. You can go back and modify the old commit. You should
-------------------------
On large repositories, git depends on compression to keep the history
-information from taking up to much space on disk or in memory.
+information from taking up too much space on disk or in memory.
This compression is not performed automatically. Therefore you
should occasionally run gitlink:git-gc[1]:
Dangling objects are not a problem. At worst they may take up a little
extra disk space. They can sometimes provide a last-resort method for
recovering lost work--see <<dangling-objects>> for details. However, if
-you wish, you can remove them with gitlink:git-prune[1] or the --prune
+you wish, you can remove them with gitlink:git-prune[1] or the `--prune`
option to gitlink:git-gc[1]:
-------------------------------------------------
git-gc when run without any options), it is not safe to prune while
other git operations are in progress in the same repository.
+If gitlink:git-fsck[1] complains about sha1 mismatches or missing
+objects, you may have a much more serious problem; your best option is
+probably restoring from backups. See
+<<recovering-from-repository-corruption>> for a detailed discussion.
+
[[recovering-lost-changes]]
Recovering lost changes
~~~~~~~~~~~~~~~~~~~~~~~
Reflogs
^^^^^^^
-Say you modify a branch with gitlink:git-reset[1] --hard, and then
+Say you modify a branch with `gitlink:git-reset[1] --hard`, and then
realize that the branch was the only reference you had to that point in
history.
$ git log master@{1}
-------------------------------------------------
-This lists the commits reachable from the previous version of the head.
-This syntax can be used to with any git command that accepts a commit,
-not just with git log. Some other examples:
+This lists the commits reachable from the previous version of the
+"master" branch head. This syntax can be used with any git command
+that accepts a commit, not just with git log. Some other examples:
-------------------------------------------------
$ git show master@{2} # See where the branch pointed 2,
More generally, a branch that is created from a remote branch will pull
by default from that branch. See the descriptions of the
branch.<name>.remote and branch.<name>.merge options in
-gitlink:git-config[1], and the discussion of the --track option in
+gitlink:git-config[1], and the discussion of the `--track` option in
gitlink:git-checkout[1], to learn how to control these defaults.
In addition to saving you keystrokes, "git pull" also helps you by
$ git pull /path/to/other/repository
-------------------------------------------------
-or an ssh url:
+or an ssh URL:
-------------------------------------------------
$ git clone ssh://yourhost/~you/repository
This is the preferred method.
If someone else administers the server, they should tell you what
-directory to put the repository in, and what git:// url it will appear
+directory to put the repository in, and what git:// URL it will appear
at. You can then skip to the section
"<<pushing-changes-to-a-public-repository,Pushing changes to a public
repository>>", below.
gitlink:git-update-server-info[1], and the documentation
link:hooks.html[Hooks used by git].)
-Advertise the url of proj.git. Anybody else should then be able to
-clone or pull from that url, for example with a command line like:
+Advertise the URL of proj.git. Anybody else should then be able to
+clone or pull from that URL, for example with a command line like:
-------------------------------------------------
$ git clone http://yourserver.com/~you/proj.git
$ git branch --track release origin/master
-------------------------------------------------
-These can be easily kept up to date using gitlink:git-pull[1]
+These can be easily kept up to date using gitlink:git-pull[1].
-------------------------------------------------
$ git checkout test && git pull
$ git log linux..branchname | git-shortlog
-------------------------------------------------
-To see whether it has already been merged into the test or release branches
+To see whether it has already been merged into the test or release branches,
use:
-------------------------------------------------
$ git log release..branchname
-------------------------------------------------
-(If this branch has not yet been merged you will see some log entries.
+(If this branch has not yet been merged, you will see some log entries.
If it has been merged, then there will be no output.)
Once a patch completes the great cycle (moving from test to release,
then pulled by Linus, and finally coming back into your local
-"origin/master" branch) the branch for this change is no longer needed.
+"origin/master" branch), the branch for this change is no longer needed.
You detect this when the output from:
-------------------------------------------------
and git will continue applying the rest of the patches.
-At any point you may use the --abort option to abort this process and
+At any point you may use the `--abort` option to abort this process and
return mywork to the state it had before you started the rebase:
-------------------------------------------------
$ gitk origin..mywork &
-------------------------------------------------
-And browse through the list of patches in the mywork branch using gitk,
+and browse through the list of patches in the mywork branch using gitk,
applying them (possibly in a different order) to mywork-new using
-cherry-pick, and possibly modifying them as you go using commit --amend.
+cherry-pick, and possibly modifying them as you go using `commit --amend`.
The gitlink:git-gui[1] command may also help as it allows you to
individually select diff hunks for inclusion in the index (by
right-clicking on the diff hunk and choosing "Stage Hunk for Commit").
- Git can quickly determine whether two objects are identical or not,
just by comparing names.
-- Since object names are computed the same way in ever repository, the
+- Since object names are computed the same way in every repository, the
same content stored in two repositories will always be stored under
the same name.
- Git can detect errors when it reads an object, by checking that the
"blob" objects into a directory structure. In addition, a tree object
can refer to other tree objects, thus creating a directory hierarchy.
- A <<def_commit_object,"commit" object>> ties such directory hierarchies
- together into a <<def_DAG,directed acyclic graph>> of revisions - each
+ together into a <<def_DAG,directed acyclic graph>> of revisions--each
commit contains the object name of exactly one tree designating the
directory hierarchy at the time of the commit. In addition, a commit
refers to "parent" commit objects that describe the history of how we
example, a "dangling blob" may arise because you did a "git add" of a
file, but then, before you actually committed it and made it part of the
bigger picture, you changed something else in that file and committed
-that *updated* thing - the old state that you added originally ends up
+that *updated* thing--the old state that you added originally ends up
not being pointed to by any commit or tree, so it's now a dangling blob
object.
Generally, dangling objects aren't anything to worry about. They can
even be very useful: if you screw something up, the dangling objects can
be how you recover your old tree (say, you did a rebase, and realized
-that you really didn't want to - you can look at what dangling objects
+that you really didn't want to--you can look at what dangling objects
you have, and decide to reset your head to some old dangling state).
For commits, you can just use:
------------------------------------------------
and they'll be gone. But you should only run "git prune" on a quiescent
-repository - it's kind of like doing a filesystem fsck recovery: you
+repository--it's kind of like doing a filesystem fsck recovery: you
don't want to do that while the filesystem is mounted.
-(The same is true of "git-fsck" itself, btw - but since
+(The same is true of "git-fsck" itself, btw, but since
git-fsck never actually *changes* the repository, it just reports
on what it found, git-fsck itself is never "dangerous" to run.
Running it while somebody is actually changing the repository can cause
contrast, running "git prune" while somebody is actively changing the
repository is a *BAD* idea).
+[[recovering-from-repository-corruption]]
+Recovering from repository corruption
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+By design, git treats data trusted to it with caution. However, even in
+the absence of bugs in git itself, it is still possible that hardware or
+operating system errors could corrupt data.
+
+The first defense against such problems is backups. You can back up a
+git directory using clone, or just using cp, tar, or any other backup
+mechanism.
+
+As a last resort, you can search for the corrupted objects and attempt
+to replace them by hand. Back up your repository before attempting this
+in case you corrupt things even more in the process.
+
+We'll assume that the problem is a single missing or corrupted blob,
+which is sometimes a solveable problem. (Recovering missing trees and
+especially commits is *much* harder).
+
+Before starting, verify that there is corruption, and figure out where
+it is with gitlink:git-fsck[1]; this may be time-consuming.
+
+Assume the output looks like this:
+
+------------------------------------------------
+$ git-fsck --full
+broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
+ to blob 4b9458b3786228369c63936db65827de3cc06200
+missing blob 4b9458b3786228369c63936db65827de3cc06200
+------------------------------------------------
+
+(Typically there will be some "dangling object" messages too, but they
+aren't interesting.)
+
+Now you know that blob 4b9458b3 is missing, and that the tree 2d9263c6
+points to it. If you could find just one copy of that missing blob
+object, possibly in some other repository, you could move it into
+.git/objects/4b/9458b3... and be done. Suppose you can't. You can
+still examine the tree that pointed to it with gitlink:git-ls-tree[1],
+which might output something like:
+
+------------------------------------------------
+$ git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
+100644 blob 8d14531846b95bfa3564b58ccfb7913a034323b8 .gitignore
+100644 blob ebf9bf84da0aab5ed944264a5db2a65fe3a3e883 .mailmap
+100644 blob ca442d313d86dc67e0a2e5d584b465bd382cbf5c COPYING
+...
+100644 blob 4b9458b3786228369c63936db65827de3cc06200 myfile
+...
+------------------------------------------------
+
+So now you know that the missing blob was the data for a file named
+"myfile". And chances are you can also identify the directory--let's
+say it's in "somedirectory". If you're lucky the missing copy might be
+the same as the copy you have checked out in your working tree at
+"somedirectory/myfile"; you can test whether that's right with
+gitlink:git-hash-object[1]:
+
+------------------------------------------------
+$ git hash-object -w somedirectory/myfile
+------------------------------------------------
+
+which will create and store a blob object with the contents of
+somedirectory/myfile, and output the sha1 of that object. if you're
+extremely lucky it might be 4b9458b3786228369c63936db65827de3cc06200, in
+which case you've guessed right, and the corruption is fixed!
+
+Otherwise, you need more information. How do you tell which version of
+the file has been lost?
+
+The easiest way to do this is with:
+
+------------------------------------------------
+$ git log --raw --all --full-history -- somedirectory/myfile
+------------------------------------------------
+
+Because you're asking for raw output, you'll now get something like
+
+------------------------------------------------
+commit abc
+Author:
+Date:
+...
+:100644 100644 4b9458b... newsha... M somedirectory/myfile
+
+
+commit xyz
+Author:
+Date:
+
+...
+:100644 100644 oldsha... 4b9458b... M somedirectory/myfile
+------------------------------------------------
+
+This tells you that the immediately preceding version of the file was
+"newsha", and that the immediately following version was "oldsha".
+You also know the commit messages that went with the change from oldsha
+to 4b9458b and with the change from 4b9458b to newsha.
+
+If you've been committing small enough changes, you may now have a good
+shot at reconstructing the contents of the in-between state 4b9458b.
+
+If you can do that, you can now recreate the missing object with
+
+------------------------------------------------
+$ git hash-object -w <recreated-file>
+------------------------------------------------
+
+and your repository is good again!
+
+(Btw, you could have ignored the fsck, and started with doing a
+
+------------------------------------------------
+$ git log --raw --all
+------------------------------------------------
+
+and just looked for the sha of the missing object (4b9458b..) in that
+whole thing. It's up to you - git does *have* a lot of information, it is
+just missing one particular blob version.
+
[[the-index]]
The index
-----------
------------
High-level operations such as gitlink:git-commit[1],
-gitlink:git-checkout[1] and git-reset[1] work by moving data between the
-working tree, the index, and the object database. Git provides
-low-level operations which perform each of these steps individually.
+gitlink:git-checkout[1] and gitlink:git-reset[1] work by moving data
+between the working tree, the index, and the object database. Git
+provides low-level operations which perform each of these steps
+individually.
Generally, all "git" operations work on the index file. Some operations
work *purely* on the index file (showing the current state of the
NOTE! A `--remove` flag does 'not' mean that subsequent filenames will
necessarily be removed: if the files still exist in your directory
structure, the index will be updated with their new status, not
-removed. The only thing `--remove` means is that update-cache will be
+removed. The only thing `--remove` means is that update-index will be
considering a removed file to be a valid thing, and if the file really
does not exist any more, it will update the index accordingly.
$ git write-tree
-------------------------------------------------
-that doesn't come with any options - it will just write out the
+that doesn't come with any options--it will just write out the
current index into the set of tree objects that describe that state,
and it will return the name of the resulting top-level tree. You can
use that tree to re-generate the index at any time by going in the
~~~~~~~~~~~~~~~~~~~~~~~~
You read a "tree" file from the object database, and use that to
-populate (and overwrite - don't do this if your index contains any
+populate (and overwrite--don't do this if your index contains any
unsaved state that you might want to restore later!) your current
index. Normal operation is just
To commit a tree you have instantiated with "git-write-tree", you'd
create a "commit" object that refers to that tree and the history
-behind it - most notably the "parent" commits that preceded it in
+behind it--most notably the "parent" commits that preceded it in
history.
Normally a "commit" has one parent: the previous state of the tree
tree, aka the common tree, and the two "result" trees, aka the branches
you want to merge), you do a "merge" read into the index. This will
complain if it has to throw away your old index contents, so you should
-make sure that you've committed those - in fact you would normally
+make sure that you've committed those--in fact you would normally
always do a merge against your last commit (which should thus match what
you have in your current index anyway).
---------------------------------
Sadly, many merges aren't trivial. If there are files that have
-been added.moved or removed, or if both branches have modified the
+been added, moved or removed, or if both branches have modified the
same file, you will be left with an index tree that contains "merge
entries" in it. Such an index tree can 'NOT' be written out to a tree
object, and you will have to resolve any such merge clashes using
- `get_sha1()` returns 0 on _success_. This might surprise some new
Git hackers, but there is a long tradition in UNIX to return different
- negative numbers in case of different errors -- and 0 on success.
+ negative numbers in case of different errors--and 0 on success.
- the variable `sha1` in the function signature of `get_sha1()` is `unsigned
char \*`, but is actually expected to be a pointer to `unsigned
$ git branch -d new # delete branch "new"
-----------------------------------------------
-Instead of basing new branch on current HEAD (the default), use:
+Instead of basing a new branch on current HEAD (the default), use:
-----------------------------------------------
$ git branch new test # branch named "test"
Alternates, clone -reference, etc.
-git unpack-objects -r for recovery
+More on recovery from repository corruption. See:
+ http://marc.theaimsgroup.com/?l=git&m=117263864820799&w=2
+ http://marc.theaimsgroup.com/?l=git&m=117147855503798&w=2
+ http://marc.theaimsgroup.com/?l=git&m=117147855503798&w=2