Some instructions on dealing with corruption of the object database.
Most of this text is from an example by Linus, identified by Nicolas
Pitre <nico@cam.org> with a little further editing by me.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
@@ -1560,6 +1560,11 @@ This may be time-consuming. Unlike most other git operations (including
git-gc when run without any options), it is not safe to prune while
other git operations are in progress in the same repository.
If gitlink:git-fsck[1] complains about sha1 mismatches or missing
objects, you may have a much more serious problem; your best option is
probably restoring from backups. See
<<recovering-from-repository-corruption>> for a detailed discussion.
[[recovering-lost-changes]]
Recovering lost changes
~~~~~~~~~~~~~~~~~~~~~~~
@@ -3220,6 +3225,127 @@ confusing and scary messages, but it won't actually do anything bad. In
contrast, running "git prune" while somebody is actively changing the
repository is a *BAD* idea).
[[recovering-from-repository-corruption]]
Recovering from repository corruption
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
By design, git treats data trusted to it with caution. However, even in
the absence of bugs in git itself, it is still possible that hardware or
operating system errors could corrupt data.
The first defense against such problems is backups. You can back up a
git directory using clone, or just using cp, tar, or any other backup
mechanism.
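As a minimal sketch of both approaches (the repository name `project` and
the backup names are hypothetical, and the script builds a throwaway
repository in a temporary directory so that it can be run anywhere):

```shell
#!/bin/sh
set -e

# Build a small throwaway repository to back up.
work=$(mktemp -d)
cd "$work"
git init -q project
cd project
git config user.email "you@example.com"
git config user.name "Example"
echo hello > file.txt
git add file.txt
git commit -q -m "initial"
cd ..

# Backup 1: a bare mirror clone, which copies all refs and objects.
git clone -q --mirror project project-backup.git

# Backup 2: a plain archive of the .git directory.
tar -cf project-git.tar -C project .git

# Check that the mirror is internally consistent.
git -C project-backup.git fsck --full
```

A clone only copies objects reachable from refs, while cp or tar copy the
directory byte for byte, including any corruption already present; which
one suits you depends on what you want the backup to protect against.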
As a last resort, you can search for the corrupted objects and attempt
to replace them by hand. Back up your repository before attempting this
in case you corrupt things even more in the process.
We'll assume that the problem is a single missing or corrupted blob,
which is sometimes a solvable problem. (Recovering missing trees and
especially commits is *much* harder.)
Before starting, verify that there is corruption, and figure out where
it is with gitlink:git-fsck[1]; this may be time-consuming.
Assume the output looks like this:
------------------------------------------------
$ git-fsck --full
broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
------------------------------------------------
Because you're asking for raw output, you'll now get something like
------------------------------------------------
commit abc
Author:
Date:
...
:100644 100644 4b9458b... newsha... M somedirectory/myfile
commit xyz
Author:
Date:
...
:100644 100644 oldsha... 4b9458b... M somedirectory/myfile
------------------------------------------------
This tells you that the immediately following version of the file was
"newsha", and that the immediately preceding version was "oldsha".
You also know the commit messages that went with the change from oldsha
to 4b9458b and with the change from 4b9458b to newsha.
If you've been committing small enough changes, you may now have a good
shot at reconstructing the contents of the in-between state 4b9458b.
If you can do that, you can now recreate the missing object with
------------------------------------------------
$ git hash-object -w <recreated-file>
------------------------------------------------
and your repository is good again!
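The following sketch walks through that last step on a throwaway
repository. It simulates the damage by deleting the loose object file for
one blob, which is an assumption made for illustration; real corruption
may look different, and here the "recreated" file is simply the unchanged
working copy.

```shell
#!/bin/sh
set -e

# Throwaway repository with a single committed file.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example"
printf 'version one\n' > myfile
git add myfile
git commit -q -m "add myfile"

# Simulate corruption: remove the loose object that stores the blob.
blob=$(git rev-parse HEAD:myfile)
dir=$(printf '%s' "$blob" | cut -c1-2)
rest=$(printf '%s' "$blob" | cut -c3-)
rm -f ".git/objects/$dir/$rest"

# fsck should now complain about the missing blob and exit non-zero.
if git fsck --full >/dev/null 2>&1; then
    echo "unexpected: fsck found no problem"
fi

# Recreate the object from a file with exactly the same contents.
restored=$(git hash-object -w myfile)
[ "$restored" = "$blob" ] && echo "blob $blob restored"

# With the object back in place, fsck should pass again.
git fsck --full
```

The object is only repaired if the recreated file hashes to the same sha
as the missing blob, so compare the output of hash-object with the sha
that fsck reported before trusting the result.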
(Btw, you could have ignored the fsck, and started with doing a
------------------------------------------------
$ git log --raw --all
------------------------------------------------
and just looked for the sha of the missing object (4b9458b..) in that
whole thing. It's up to you - git does *have* a lot of information, it is
just missing one particular blob version.)
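A rough sketch of that search, again on a throwaway repository: the file
name `myfile` and its three versions are made up for the example, and
`--no-abbrev` is added so that the full sha can be matched with grep.

```shell
#!/bin/sh
set -e

# Throwaway repository with three successive versions of one file.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example"
for n in one two three; do
    echo "version $n" > myfile
    git add myfile
    git commit -q -m "myfile: version $n"
done

# Pretend the blob of the middle version is the one we are hunting for.
middle=$(git rev-parse HEAD~1:myfile)

# Show every raw diff line that mentions it, with full object names.
git log --raw --all --no-abbrev | grep "$middle"

# Count the matching lines.
matches=$(git log --raw --all --no-abbrev | grep -c "$middle")
echo "raw lines mentioning $middle: $matches"
```

In this setup the blob should show up on two raw lines: once as the new
side of the commit that introduced it, and once as the old side of the
commit that replaced it.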
[[the-index]]
The index
-----------
@@ -4429,4 +4555,7 @@ Write a chapter on using plumbing and writing scripts.