GIT pack format
===============

= pack-*.pack file has the following format:

   - The header appears at the beginning and consists of the following:

     4-byte signature:
         The signature is: {'P', 'A', 'C', 'K'}

     4-byte version number (network byte order):
         GIT currently accepts version number 2 or 3 but
         generates version 2 only.

     4-byte number of objects contained in the pack (network byte order)

     Observation: we cannot have more than 4G versions ;-) and
     more than 4G objects in a pack.

   - The header is followed by that many object entries, each of
     which looks like this:

     (undeltified representation)
     n-byte type and length (3-bit type, (n-1)*7+4-bit length)
     compressed data

     (deltified representation)
     n-byte type and length (3-bit type, (n-1)*7+4-bit length)
     20-byte base object name
     compressed delta data

     Observation: the length of each object is encoded in a
     variable-length format and is not constrained to 32 bits or
     any other fixed width.

   - The trailer records a 20-byte SHA1 checksum of all of the above.

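The header and trailer rules above can be checked with a few lines of
code. The following is a minimal sketch, not GIT's own implementation:
the function name read_pack_envelope and the empty test pack are
hypothetical, and the object entries between header and trailer are
not decoded.

```python
import hashlib
import struct


def read_pack_envelope(pack: bytes):
    """Return (version, object_count) after checking signature and trailer."""
    if len(pack) < 12 + 20:
        raise ValueError("too short for a 12-byte header and 20-byte trailer")
    # 4-byte signature, then two 4-byte integers in network byte order.
    signature, version, count = struct.unpack(">4sII", pack[:12])
    if signature != b"PACK":
        raise ValueError("missing PACK signature")
    if version not in (2, 3):
        raise ValueError(f"unsupported pack version {version}")
    # The trailer is the SHA1 of every byte that precedes it.
    if hashlib.sha1(pack[:-20]).digest() != pack[-20:]:
        raise ValueError("trailer checksum mismatch")
    return version, count


# Hypothetical test input: a version-2 pack that contains no objects.
header = b"PACK" + struct.pack(">II", 2, 0)
empty_pack = header + hashlib.sha1(header).digest()
print(read_pack_envelope(empty_pack))  # → (2, 0)
```

Reading the whole pack into memory keeps the sketch short; a real
reader would hash the file in chunks.
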
= pack-*.idx file has the following format:

  - The header consists of 256 4-byte network byte order
    integers.  N-th entry of this table records the number of
    objects in the corresponding pack, the first byte of whose
    object name is less than or equal to N.  This is called the
    'first-level fan-out' table.

    Observation: we would need to extend this to an array of
    8-byte integers to go beyond 4G objects per pack, but it is
    not strictly necessary.

  - The header is followed by sorted 24-byte entries, one entry
    per object in the pack.  Each entry is:

    4-byte network byte order integer, recording where the
    object is stored in the packfile as the offset from the
    beginning.

    20-byte object name.

    Observation: we would definitely need to extend this to
    8-byte integer plus 20-byte object name to handle a packfile
    that is larger than 4GB.

  - The file is concluded with a trailer:

    A copy of the 20-byte SHA1 checksum at the end of the
    corresponding packfile.

    20-byte SHA1 checksum of all of the above.

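The fan-out table narrows a lookup to the entries that share the first
byte of the object name, and a binary search finishes the job. The
sketch below assumes fanout[N] counts the objects whose first name byte
is at most N, so that fanout[N-1] and fanout[N] bound the candidate
range. The names build_idx and find_offset, the sample object names and
the sample offsets are hypothetical; build_idx exists only to produce a
small index to search.

```python
import hashlib
import struct

FANOUT_BYTES = 256 * 4   # 256 4-byte integers
ENTRY_BYTES = 4 + 20     # 4-byte offset plus 20-byte object name


def build_idx(entries, pack_checksum: bytes) -> bytes:
    """Lay out (offset, name) pairs as fan-out table, entries and trailer."""
    entries = sorted(entries, key=lambda entry: entry[1])
    counts = [0] * 256
    for _, name in entries:
        counts[name[0]] += 1
    fanout, total = [], 0
    for count in counts:
        total += count          # cumulative: first byte <= N
        fanout.append(total)
    body = struct.pack(">256I", *fanout)
    for offset, name in entries:
        body += struct.pack(">I", offset) + name
    body += pack_checksum
    return body + hashlib.sha1(body).digest()


def find_offset(idx: bytes, name: bytes):
    """Return the packfile offset recorded for name, or None."""
    fanout = struct.unpack(">256I", idx[:FANOUT_BYTES])
    first = name[0]
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    while lo < hi:
        mid = (lo + hi) // 2
        start = FANOUT_BYTES + ENTRY_BYTES * mid
        (offset,) = struct.unpack(">I", idx[start:start + 4])
        entry_name = idx[start + 4:start + ENTRY_BYTES]
        if entry_name == name:
            return offset
        if entry_name < name:
            lo = mid + 1
        else:
            hi = mid
    return None


# Hypothetical 20-byte object names and offsets.
names = [b"\x00" + b"a" * 19, b"\x00" + b"b" * 19,
         b"\x01" + b"c" * 19, b"\xff" + b"d" * 19]
idx = build_idx(list(zip([12, 40, 90, 150], names)), b"\x00" * 20)
print(find_offset(idx, names[2]))  # → 90
```
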
Pack Idx file:

        idx
            +--------------------------------+
            | fanout[0] = 2                  |-.
            +--------------------------------+ |
            | fanout[1]                      | |
            +--------------------------------+ |
            | fanout[2]                      | |
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
            | fanout[255]                    | |
            +--------------------------------+ |
main        | offset                         | |
index       | object name 00XXXXXXXXXXXXXXXX | |
table       +--------------------------------+ |
            | offset                         | |
            | object name 00XXXXXXXXXXXXXXXX | |
            +--------------------------------+ |
          .-| offset                         |<+
          | | object name 01XXXXXXXXXXXXXXXX |
          | +--------------------------------+
          | | offset                         |
          | | object name 01XXXXXXXXXXXXXXXX |
          | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          | | offset                         |
          | | object name FFXXXXXXXXXXXXXXXX |
          | +--------------------------------+
trailer   | | packfile checksum              |
          | +--------------------------------+
          | | idxfile checksum               |
          | +--------------------------------+
          .-------.
                  |
Pack file entry: <+

     packed object header:
        1-byte size extension bit (MSB)
               type (next 3 bits)
               size0 (lower 4 bits)
        n-byte sizeN (as long as MSB is set, 7 bits each)
                size0..sizeN form a 4+7+7+..+7 bit integer, size0
                is the least significant part, and sizeN is the
                most significant part.
     packed object data:
        If it is not DELTA, then deflated bytes (the size above
                is the size before compression).
        If it is DELTA, then
          20-byte base object name SHA1 (the size above is the
                size of the delta data that follows).
          delta data, deflated.
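
The packed object header described above can be decoded as follows.
This is a minimal sketch under the stated layout; the function name
decode_object_header and the two-byte sample input are hypothetical,
and the meaning of the individual type values is not covered here.

```python
def decode_object_header(data: bytes, pos: int = 0):
    """Return (type, size, next_pos) for the header starting at pos."""
    byte = data[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x07   # the 3 bits after the MSB
    size = byte & 0x0F              # size0: the lower 4 bits
    shift = 4
    # Each following byte adds 7 more significant bits for as long as
    # the previous byte had its MSB set.
    while byte & 0x80:
        byte = data[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    return obj_type, size, pos


# 0xBC = MSB set, type 3, size0 = 12; 0x12 = MSB clear, size1 = 18.
# size = 12 + (18 << 4) = 300
print(decode_object_header(b"\xbc\x12"))  # → (3, 300, 2)
```

Python integers are unbounded, so the decoded size is not limited to
32 bits, which matches the observation that the length encoding is not
constrained to any fixed width.
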