* nd/index-doc:
  doc: technical details about the index file format
maint
Junio C Hamano
14 years ago
1 changed files with 185 additions and 0 deletions
@@ -0,0 +1,185 @@
GIT index format
================

= The git index file has the following format

  All binary numbers are in network byte order. Version 2 is described
  here unless stated otherwise.

   - A 12-byte header consisting of

     4-byte signature:
       The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache")

     4-byte version number:
       The currently supported versions are 2 and 3.

     32-bit number of index entries.

   - A number of sorted index entries (see below).

   - Extensions

     Extensions are identified by signature. Optional extensions can
     be ignored if GIT does not understand them.

     GIT currently supports cached tree and resolve undo extensions.

     4-byte extension signature. If the first byte is 'A'..'Z', the
     extension is optional and can be ignored.

     32-bit size of the extension

     Extension data

   - 160-bit SHA-1 over the content of the index file before this
     checksum.

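As an illustration of the layout above, the following Python sketch reads the 12-byte header and checks the trailing SHA-1. It is a minimal sketch, not Git's implementation: the function names are made up for this example, and `build_empty_index` is a hypothetical helper that only exists to produce test input.

```python
import hashlib
import struct


def parse_header(data: bytes):
    """Return (version, entry_count) from the 12-byte index header."""
    # ">" selects network (big-endian) byte order with no padding.
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a git index file")
    if version not in (2, 3):
        raise ValueError(f"unsupported index version {version}")
    return version, entry_count


def checksum_ok(data: bytes) -> bool:
    """Compare the trailing 160-bit SHA-1 with a hash of everything before it."""
    return hashlib.sha1(data[:-20]).digest() == data[-20:]


def build_empty_index(version: int = 2) -> bytes:
    """Hypothetical helper: a header with zero entries followed by its checksum."""
    body = struct.pack(">4sII", b"DIRC", version, 0)
    return body + hashlib.sha1(body).digest()
```

With these definitions, `parse_header(build_empty_index())` yields version 2 and zero entries, and flipping any byte of the file makes `checksum_ok` return False.
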
== Index entry

  Index entries are sorted in ascending order on the name field,
  interpreted as a string of unsigned bytes (i.e. memcmp() order, no
  localization, no special casing of directory separator '/'). Entries
  with the same name are sorted by their stage field.

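The ordering rule can be sketched in Python, where `bytes` objects already compare as strings of unsigned bytes. The `(name, stage)` tuples and the function name are illustrative assumptions, not structures taken from Git.

```python
def index_sort_key(entry):
    """Sort key for (name, stage) pairs: raw name bytes first, then stage."""
    name, stage = entry
    return (name, stage)  # bytes compare byte by byte, shorter prefix first


entries = [(b"a/b", 0), (b"a.c", 0), (b"a", 3), (b"a", 1), (b"a-b", 0)]
sorted_entries = sorted(entries, key=index_sort_key)
```

Because '/' (0x2f) is compared like any other byte, "a-b" (0x2d) and "a.c" (0x2e) both sort before "a/b" here, and the two "a" entries are ordered by stage.
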
  32-bit ctime seconds, the last time a file's metadata changed
    this is stat(2) data

  32-bit ctime nanosecond fractions
    this is stat(2) data

  32-bit mtime seconds, the last time a file's data changed
    this is stat(2) data

  32-bit mtime nanosecond fractions
    this is stat(2) data

  32-bit dev
    this is stat(2) data

  32-bit ino
    this is stat(2) data

  32-bit mode, split into (high to low bits)

    4-bit object type
      valid values in binary are 1000 (regular file), 1010 (symbolic link)
      and 1110 (gitlink)

    3-bit unused

    9-bit unix permission. Only 0755 and 0644 are valid for regular files.
    Symbolic links and gitlinks have value 0 in this field.

  32-bit uid
    this is stat(2) data

  32-bit gid
    this is stat(2) data

  32-bit file size
    This is the on-disk size from stat(2), truncated to 32-bit.

  160-bit SHA-1 for the represented object

  A 16-bit 'flags' field split into (high to low bits)

    1-bit assume-valid flag

    1-bit extended flag (must be zero in version 2)

    2-bit stage (during merge)

    12-bit name length if the length is less than 0xFFF; otherwise 0xFFF
    is stored in this field.

  (Version 3) A 16-bit field, only applicable if the "extended flag"
  above is 1, split into (high to low bits).

    1-bit reserved for future use

    1-bit skip-worktree flag (used by sparse checkout)

    1-bit intent-to-add flag (used by "git add -N")

    13-bit unused, must be zero

  Entry path name (variable length) relative to top level directory
  (without leading slash). '/' is used as path separator. The special
  path components ".", ".." and ".git" (without quotes) are disallowed.
  Trailing slash is also disallowed.

  The exact encoding is undefined, but the '.' and '/' characters
  are encoded in 7-bit ASCII and the encoding cannot contain a NUL
  byte (in other words, this is a UNIX pathname).

  1-8 NUL bytes as necessary to pad the entry to a multiple of eight bytes
  while keeping the name NUL-terminated.

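Putting the fields above together, a single entry could be decoded roughly as follows. This is a sketch under assumptions: `parse_entry` and the dictionary keys are invented for the example, and the name length for names of 0xFFF bytes or more is recovered by scanning for the terminating NUL, which the text implies but does not spell out.

```python
import struct

# Ten 32-bit stat fields, the 160-bit SHA-1, and the 16-bit flags: 62 bytes.
FIXED = struct.Struct(">10I20sH")


def parse_entry(data: bytes, offset: int = 0):
    """Decode one index entry at offset; return (entry, next_offset)."""
    (ctime_s, ctime_ns, mtime_s, mtime_ns, dev, ino, mode,
     uid, gid, size, sha1, flags) = FIXED.unpack_from(data, offset)
    pos = offset + FIXED.size
    entry = {
        "ctime": (ctime_s, ctime_ns), "mtime": (mtime_s, mtime_ns),
        "dev": dev, "ino": ino, "uid": uid, "gid": gid, "size": size,
        "sha1": sha1.hex(),
        "object_type": (mode >> 12) & 0xF,   # 4-bit object type
        "permissions": mode & 0x1FF,         # 9-bit unix permission
        "assume_valid": bool(flags & 0x8000),
        "extended": bool(flags & 0x4000),
        "stage": (flags >> 12) & 0x3,
        "skip_worktree": False, "intent_to_add": False,
    }
    if entry["extended"]:                    # extra 16-bit field, version 3
        (ext,) = struct.unpack_from(">H", data, pos)
        pos += 2
        entry["skip_worktree"] = bool(ext & 0x4000)
        entry["intent_to_add"] = bool(ext & 0x2000)
    name_len = flags & 0xFFF
    if name_len == 0xFFF:                    # long name: scan for the NUL
        name_len = data.index(b"\0", pos) - pos
    entry["name"] = data[pos:pos + name_len]
    # Adding 8 and clearing the low three bits leaves room for 1-8 NUL
    # bytes and makes the total entry length a multiple of eight.
    entry_len = (pos - offset + name_len + 8) & ~7
    return entry, offset + entry_len
```

For a version 2 entry named "a.txt", the fixed part (62 bytes) plus the 5-byte name is 67 bytes, so 5 NUL bytes follow and the next entry starts at offset 72.
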
== Extensions

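Every extension shares the framing described in the file layout: a 4-byte signature, a 32-bit size, then the data. A reader might walk them as in this sketch, which assumes the caller already knows the offset where the index entries end; `iter_extensions` is a name made up for this example.

```python
import struct


def iter_extensions(data: bytes, offset: int):
    """Yield (signature, optional, payload) for each extension.

    offset is where the index entries end. The final 20 bytes of data
    hold the SHA-1 checksum and belong to no extension.
    """
    end = len(data) - 20
    while offset < end:
        if end - offset < 8:
            raise ValueError("truncated extension header")
        signature, size = struct.unpack_from(">4sI", data, offset)
        offset += 8
        if offset + size > end:
            raise ValueError("extension runs past the checksum")
        # A signature starting with 'A'..'Z' marks an optional extension.
        optional = b"A" <= signature[:1] <= b"Z"
        yield signature, optional, data[offset:offset + size]
        offset += size
```

A caller that sees an unknown signature with `optional` set can skip the payload; how to react to an unknown non-optional extension is left to the caller in this sketch.
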
=== Cached tree

  Cached tree extension contains pre-computed hashes for trees that can
  be derived from the index. It helps speed up tree object generation
  from index for a new commit.

  When a path is updated in index, the path must be invalidated and
  removed from tree cache.

  The signature for this extension is { 'T', 'R', 'E', 'E' }.

  A series of entries fills the entire extension; each entry
  consists of:

  - NUL-terminated path component (relative to its parent directory);

  - ASCII decimal number of entries in the index that are covered by the
    tree this entry represents (entry_count);

  - A space (ASCII 32);

  - ASCII decimal number that represents the number of subtrees this
    tree has;

  - A newline (ASCII 10); and

  - 160-bit object name for the object that would result from writing
    this span of index as a tree.

  An entry can be in an invalidated state and is represented by having -1
  in the entry_count field.

  The entries are written out in top-down, depth-first order. The
  first entry represents the root level of the repository, followed by the
  first subtree---let's call this A---of the root level (with its name
  relative to the root level), followed by the first subtree of A (with
  its name relative to A), ...

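Because the entries are stored depth-first and each one records its subtree count, the payload can be read back with a short recursive function. The sketch below rests on two assumptions that the text above does not state: an invalidated entry (entry_count of -1) is taken to carry no object name, and the root entry is taken to have an empty path component. The function name and dictionary keys are illustrative.

```python
def parse_tree_cache(payload: bytes, pos: int = 0):
    """Parse one cached-tree entry plus its subtrees; return (node, next_pos)."""
    nul = payload.index(b"\0", pos)
    name = payload[pos:nul]
    space = payload.index(b" ", nul + 1)
    newline = payload.index(b"\n", space + 1)
    entry_count = int(payload[nul + 1:space])        # ASCII decimal, may be -1
    subtree_count = int(payload[space + 1:newline])  # ASCII decimal
    pos = newline + 1
    object_name = None
    if entry_count >= 0:
        # Assumption: only valid entries are followed by a 160-bit name.
        object_name = payload[pos:pos + 20].hex()
        pos += 20
    children = []
    for _ in range(subtree_count):
        # Depth-first: each subtree is followed by its own subtrees.
        child, pos = parse_tree_cache(payload, pos)
        children.append(child)
    node = {"name": name, "entry_count": entry_count,
            "object_name": object_name, "children": children}
    return node, pos
```

Calling `parse_tree_cache(payload)` on the data of a TREE extension returns the root node, and the returned position should equal `len(payload)` when the extension is well formed.
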
=== Resolve undo

  A conflict is represented in the index as a set of higher stage entries.
  When a conflict is resolved (e.g. with "git add path"), these higher
  stage entries will be removed and a stage-0 entry with the proper
  resolution is added.

  When these higher stage entries are removed, they are saved in the
  resolve undo extension, so that conflicts can be recreated (e.g. with
  "git checkout -m"), in case users want to redo a conflict resolution
  from scratch.

  The signature for this extension is { 'R', 'E', 'U', 'C' }.

  A series of entries fills the entire extension; each entry
  consists of:

  - NUL-terminated pathname the entry describes (relative to the root of
    the repository, i.e. full pathname);

  - Three NUL-terminated ASCII octal numbers, the entry modes of the
    entries in stages 1 to 3 (a missing stage is represented by "0" in
    this field); and

  - At most three 160-bit object names of the entry in stages from 1 to 3
    (nothing is written for a missing stage).

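The entry layout above can be decoded as in the following sketch. The function name and the shape of the returned data are assumptions made for illustration; only the byte layout comes from the description.

```python
def parse_resolve_undo(payload: bytes):
    """Return a list of (path, {stage: (mode, object_name_hex)}) tuples."""
    entries, pos = [], 0
    while pos < len(payload):
        nul = payload.index(b"\0", pos)
        path, pos = payload[pos:nul], nul + 1
        modes = []
        for _ in range(3):                    # modes for stages 1, 2, 3
            nul = payload.index(b"\0", pos)
            modes.append(int(payload[pos:nul], 8))  # ASCII octal
            pos = nul + 1
        stages = {}
        for stage, mode in enumerate(modes, start=1):
            if mode == 0:                     # missing stage: no object name
                continue
            stages[stage] = (mode, payload[pos:pos + 20].hex())
            pos += 20
        entries.append((path, stages))
    return entries
```

For a path that had entries in stages 1 and 3 only, the result holds two object names for that path and the parser consumes 40 bytes after the three mode strings.
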