gitformat-loose(5)
==================

NAME
----
gitformat-loose - Git loose object format


SYNOPSIS
--------
[verse]
$GIT_DIR/objects/[0-9a-f][0-9a-f]/*
$GIT_DIR/objects/object-map/map-*.map

DESCRIPTION
-----------

Loose objects are how Git stores individual objects, where every object is
written as a separate file.

Over the lifetime of a repository, objects are usually written as loose objects
initially. Eventually, these loose objects will be compacted into packfiles
via repository maintenance to improve disk space usage and speed up the lookup
of these objects.

== Loose objects

Each loose object contains a prefix, followed immediately by the data of the
object. The prefix contains `<type> <size>\0`. `<type>` is one of `blob`,
`tree`, `commit`, or `tag` and `<size>` is the size of the data (without the
prefix) as a decimal integer expressed in ASCII.

The entire contents, prefix and data concatenated, are then compressed with zlib
and the compressed data is stored in the file. The object ID of the object is
the SHA-1 or SHA-256 (as appropriate) hash of the uncompressed contents (prefix
and data).

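The encoding described above can be sketched in Python as follows. This is a
minimal illustration rather than Git's implementation: the function name
`encode_loose_object` and its `algo` parameter are made up for this example,
and the standard `hashlib` and `zlib` modules stand in for Git's own hashing
and compression code.

```python
import hashlib
import zlib


def encode_loose_object(obj_type: str, data: bytes, algo: str = "sha1"):
    """Return (object ID in hex, zlib-compressed file contents).

    Hypothetical helper for illustration; `algo` is a hashlib
    algorithm name, either "sha1" or "sha256".
    """
    # Prefix: "<type> <size>\0", with the size as an ASCII decimal integer.
    prefix = f"{obj_type} {len(data)}".encode("ascii") + b"\0"
    uncompressed = prefix + data
    # The object ID is the hash of the uncompressed prefix and data.
    oid = hashlib.new(algo, uncompressed).hexdigest()
    # The file on disk holds the zlib-compressed prefix and data.
    return oid, zlib.compress(uncompressed)
```

Note that the hash is taken before compression, so the object ID does not
depend on the zlib compression level that a writer happens to use.
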
The file for the loose object is stored under the `objects` directory, with the
first two hex characters of the object ID being the directory and the remaining
characters being the file name. This is done to shard the data and avoid too
many files being in one directory, since some file systems perform poorly with
many items in a directory.

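As a small sketch of this layout, the path can be derived from the object ID
by splitting it after the first two hex characters. The helper name
`loose_object_path` is an assumption made for this example and is not part of
Git.

```python
from pathlib import PurePosixPath


def loose_object_path(git_dir: str, oid_hex: str) -> PurePosixPath:
    """Map a hex object ID to its loose object path (illustrative helper)."""
    # The first two hex characters name the directory; the rest name the file.
    return PurePosixPath(git_dir) / "objects" / oid_hex[:2] / oid_hex[2:]
```

Two hex characters give at most 256 directories, so the loose objects of a
repository are spread across up to 256 subdirectories of `objects`.
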
As an example, the empty tree contains the data (when uncompressed) `tree 0\0`
and, in a SHA-256 repository, would have the object ID
`6ef19b41225c5369f1c104d45d8d85efa9b057b53b14b4b9b939dd74decc5321` and would be
stored under
`$GIT_DIR/objects/6e/f19b41225c5369f1c104d45d8d85efa9b057b53b14b4b9b939dd74decc5321`.

Similarly, a blob containing the contents `abc` would have the uncompressed
data of `blob 3\0abc`.

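Going the other way, a reader can decompress the file and split the result at
the first NUL byte. The following Python sketch does that for the examples
above. The function name `decode_loose_object` and its validation checks are
assumptions made for illustration; Git's own reader is written in C and is
organized differently.

```python
import zlib


def decode_loose_object(compressed: bytes):
    """Split a loose object file into (type, data); illustrative only."""
    raw = zlib.decompress(compressed)
    # The prefix ends at the first NUL byte; everything after it is data.
    prefix, sep, data = raw.partition(b"\0")
    if not sep:
        raise ValueError("missing NUL terminator after the prefix")
    obj_type, _, size = prefix.partition(b" ")
    if obj_type not in (b"blob", b"tree", b"commit", b"tag"):
        raise ValueError(f"unknown object type: {obj_type!r}")
    # The size is an ASCII decimal integer and must match the data length.
    if not size.isdigit() or int(size) != len(data):
        raise ValueError("size in the prefix does not match the data length")
    return obj_type.decode("ascii"), data
```

For instance, `decode_loose_object(zlib.compress(b"blob 3\0abc"))` returns
`("blob", b"abc")`.
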
== Loose object mapping

When the `extensions.compatObjectFormat` option is used, Git needs to store a
mapping between the repository's main algorithm and the compatibility algorithm
for loose objects as well as some auxiliary information.

The mapping consists of a set of files under `$GIT_DIR/objects/object-map`
ending in `.map`. The portion of the filename before the extension is that of
the main hash checksum (that is, the one specified in
`extensions.objectFormat`) in hex format.

`git gc` will repack existing entries into one file, removing any unnecessary
objects, such as obsolete shallow entries or loose objects that have been
packed.

The file format is as follows. All values are in network byte order and all
4-byte and 8-byte values must be 4-byte aligned in the file, so NUL padding
may be required in some cases. Git always uses the smallest number of NUL
bytes (including zero) that is required for the padding in order to make
writing files deterministic.

- A header appears at the beginning and consists of the following:
  * A 4-byte mapping signature: `LMAP`
  * 4-byte version number: 1
  * 4-byte length of the header section (including reserved entries but
    excluding any NUL padding).
  * 4-byte number of objects declared in this map file.
  * 4-byte number of object formats declared in this map file.
  * For each object format:
    ** 4-byte format identifier (e.g., `sha1` for SHA-1)
    ** 4-byte length in bytes of shortened object names (that is, prefixes of
       the full object names). This is the shortest possible length needed to
       make names in the shortened object name table unambiguous.
    ** 8-byte integer, recording where tables relating to this format
       are stored in this map file, as an offset from the beginning.
  * 8-byte offset to the trailer from the beginning of this file.
  * The remainder of the header section is reserved for future use.
    Readers must ignore unrecognized data here.
- Zero or more NUL bytes. These are used to improve the alignment of the
  4-byte quantities below.
- Tables for the first object format:
  * A sorted table of shortened object names. These are prefixes of the names
    of all objects in this file, packed together to reduce the cache footprint
    of the binary search for a specific object name.
  * A sorted table of full object names.
  * A table of 4-byte metadata values.
- Zero or more NUL bytes.
- Tables for subsequent object formats:
  * A sorted table of shortened object names. These are prefixes of the names
    of all objects in this file, packed together without offset values to
    reduce the cache footprint of the binary search for a specific object name.
  * A table of full object names in the order specified by the first object format.
  * A table of 4-byte values mapping object name order to the order of the
    first object format. For an object in the table of sorted shortened object
    names, the value at the corresponding index in this table is the index in
    the previous table for that same object.
  * Zero or more NUL bytes.
- The trailer consists of the following:
  * Hash checksum of all of the above using the main hash.

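To make the header layout and the padding rule concrete, here is a rough
Python sketch of a header reader. It is not taken from Git: the names
`parse_map_header` and `nul_padding` and the returned dictionary keys are
invented for this example, and it assumes that the header length field counts
bytes from the start of the file.

```python
import struct

# Big-endian (network byte order) layouts without implicit padding.
HEADER_FIXED = struct.Struct(">4sIIII")  # signature, version, length, objects, formats
FORMAT_ENTRY = struct.Struct(">4sIQ")    # identifier, short-name length, tables offset


def nul_padding(offset: int) -> int:
    """Smallest number of NUL bytes that moves `offset` to a 4-byte boundary."""
    return -offset % 4


def parse_map_header(buf: bytes) -> dict:
    """Parse the header at the start of `buf` (illustrative sketch only)."""
    sig, version, header_len, nr_objects, nr_formats = HEADER_FIXED.unpack_from(buf, 0)
    if sig != b"LMAP":
        raise ValueError("not a loose object map file")
    if version != 1:
        raise ValueError(f"unsupported version: {version}")
    pos = HEADER_FIXED.size
    formats = []
    for _ in range(nr_formats):
        fmt_id, short_len, tables_offset = FORMAT_ENTRY.unpack_from(buf, pos)
        formats.append(
            {"id": fmt_id, "short_len": short_len, "tables_offset": tables_offset}
        )
        pos += FORMAT_ENTRY.size
    (trailer_offset,) = struct.unpack_from(">Q", buf, pos)
    pos += 8
    # Assumption: the header length is counted from the start of the file.
    if header_len < pos:
        raise ValueError("header length is shorter than the declared fields")
    # Bytes from `pos` up to `header_len` are reserved; this reader skips them.
    return {
        "header_len": header_len,
        "nr_objects": nr_objects,
        "formats": formats,
        "trailer_offset": trailer_offset,
    }
```

The fixed fields and per-format entries are all multiples of four bytes, so
padding is only needed when reserved data or a table leaves the offset
unaligned. For example, `nul_padding(63)` is 1, so a 63-byte header would be
followed by a single NUL byte.
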
The lower six bits of each value in the metadata table contain a type field
indicating the reason that this object is stored:

0::
	Reserved.
1::
	This object is stored as a loose object in the repository.
2::
	This object is a shallow entry. The mapping refers to a shallow value
	returned by a remote server.
3::
	This object is a submodule entry. The mapping refers to the commit stored
	to represent a submodule.

Other data may be stored in this field in the future. Bits that are not used
must be zero.

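Extracting the type field amounts to masking off the lower six bits. The
sketch below shows one strict way a reader might do this. The names
`TYPE_MASK`, `ENTRY_TYPES`, and `metadata_type` are hypothetical, and rejecting
values whose upper 26 bits are set is this example's interpretation of "bits
that are not used must be zero", not documented Git behavior.

```python
TYPE_MASK = 0x3F  # the lower six bits hold the type field

ENTRY_TYPES = {
    0: "reserved",
    1: "loose object",
    2: "shallow entry",
    3: "submodule entry",
}


def metadata_type(value: int) -> str:
    """Describe the type field of one 4-byte metadata value (sketch)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("metadata values are 4 bytes wide")
    # Assumption: the upper 26 bits are currently unused and must be zero.
    if value & ~TYPE_MASK:
        raise ValueError("unused bits must be zero")
    # Type values without a listed meaning are reported as unassigned.
    return ENTRY_TYPES.get(value & TYPE_MASK, "unassigned")
```

A value read from the table in network byte order, for example with
`int.from_bytes(raw, "big")`, can be passed to `metadata_type` directly.
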
GIT
---
Part of the linkgit:git[1] suite