Documentation/gitpacking.txt: describe pseudo-merge bitmaps
Add some details to the gitpacking(7) manual page which motivate and describe pseudo-merge bitmaps. The exact on-disk format and many of the configuration knobs will be described in subsequent commits.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

maint
parent 0074cc2994
commit 40864ac902
@@ -24,6 +24,78 @@ There are many aspects of packing in Git that are not covered in this
document that instead live in the aforementioned areas. Over time, those
scattered bits may coalesce into this document.

== Pseudo-merge bitmaps

NOTE: Pseudo-merge bitmaps are considered an experimental feature, so
the configuration and many of the ideas are subject to change.

=== Background

Reachability bitmaps are most efficient when we have on-disk stored
bitmaps for one or more of the starting points of a traversal. For this
reason, Git prefers storing bitmaps for commits at the tips of refs,
because traversals tend to start with those points.

But if you have a large number of refs, it's not feasible to store a
bitmap for _every_ ref tip. It takes up space, and just OR-ing all of
those bitmaps together is expensive.

One way we can deal with that is to create bitmaps that represent
_groups_ of refs. When a traversal asks about the entire group, then we
can use this single bitmap instead of considering each ref individually.
Because these bitmaps represent the set of objects which would be
reachable in a hypothetical merge of all of the commits, we call them
pseudo-merge bitmaps.
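
The idea of a group bitmap can be illustrated with a toy model. The sketch below is not Git's implementation: the commit graph, the `PARENTS` and `BIT` tables, and both function names are hypothetical, and Python integers stand in for EWAH-compressed bitmaps. It only shows that the bitmap for a group of tips equals the OR of the reachability bitmaps of each tip, which is what a hypothetical merge of those tips would reach.

```python
# Toy history (hypothetical): each commit maps to its list of parents.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "E": ["A"],
}

# One bit position per object. This toy model tracks commits only.
BIT = {name: i for i, name in enumerate(sorted(PARENTS))}


def reachable_bitmap(tip):
    """Return an int whose set bits mark every commit reachable from tip."""
    bits = 0
    stack = [tip]
    while stack:
        commit = stack.pop()
        if (bits >> BIT[commit]) & 1:
            continue  # already visited
        bits |= 1 << BIT[commit]
        stack.extend(PARENTS[commit])
    return bits


def group_bitmap(tips):
    """OR together the reachability bitmaps of every tip in the group."""
    bits = 0
    for tip in tips:
        bits |= reachable_bitmap(tip)
    return bits


# Computed once and stored, this single bitmap can later answer a
# traversal that asks about the whole group {C, D, E}.
merged = group_bitmap(["C", "D", "E"])
```

In this model the per-tip walks happen once, when the group bitmap is built; a later query over the whole group reads one bitmap instead of repeating them.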

=== Overview

A "pseudo-merge bitmap" is used to refer to a pair of bitmaps, as
follows:

Commit bitmap::

A bitmap whose set bits describe the set of commits included in the
pseudo-merge's "merge" bitmap (as below).

Merge bitmap::

A bitmap whose set bits describe the reachability closure over the set
of commits in the pseudo-merge's "commit" bitmap (as above). An
identical bitmap would be generated for an octopus merge with the same
set of parents as described in the commit bitmap.

Pseudo-merge bitmaps can accelerate bitmap traversals when all commits
for a given pseudo-merge are listed on either side of the traversal,
either directly (by explicitly asking for them as part of the `HAVES`
or `WANTS`) or indirectly (by encountering them during a fill-in
traversal).
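
The applicability rule above can be sketched as a subset test. This is a minimal illustration under stated assumptions, not Git's actual code: the `PseudoMerge` class and the `apply_pseudo_merges` function are hypothetical names, and plain Python integers stand in for the on-disk bitmaps. A pseudo-merge is used only when every bit of its commit bitmap is also set on the side of the traversal being computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PseudoMerge:
    commits: int  # bit i set: commit i belongs to this pseudo-merge
    merge: int    # bit i set: object i is reachable from those commits


def apply_pseudo_merges(side, result, pseudo_merges):
    """OR in each pseudo-merge whose commits all appear in `side`.

    side:   bitmap of commits known to be on this side of the traversal
    result: bitmap of objects accumulated so far for this side
    """
    for pm in pseudo_merges:
        if (pm.commits & side) == pm.commits:  # subset test
            result |= pm.merge
    return result


# Hypothetical numbering: bits 0..4 are commits A..E.
pm = PseudoMerge(commits=0b01100, merge=0b01111)  # groups C and D

# Both C and D are wanted, so the pseudo-merge applies.
full = apply_pseudo_merges(0b11100, 0, [pm])

# Only C is wanted, so the pseudo-merge is skipped.
partial = apply_pseudo_merges(0b00100, 0, [pm])
```

When the test fails, as in the second call, the objects for that side still have to come from individual commit bitmaps or a fill-in traversal.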

=== Use-cases

For example, suppose there exists a pseudo-merge bitmap with a large
number of commits, all of which are listed in the `WANTS` section of
some bitmap traversal query. When pseudo-merge bitmaps are enabled, the
bitmap machinery can quickly determine there is a pseudo-merge which
satisfies some subset of the wanted objects on either side of the query.
Then, we can inflate the EWAH-compressed bitmap, and `OR` it into the
resulting bitmap. By contrast, without pseudo-merge bitmaps, we would
have to repeat the decompression and `OR`-ing step over a potentially
large number of individual bitmaps, which can take proportionally more
time.

Another benefit of pseudo-merges arises when there is some combination
of (a) a large number of references, with (b) poor bitmap coverage, and
(c) deep, nested trees, making fill-in traversal relatively expensive.
For example, suppose that there are so many tags that bitmapping each
of them individually is infeasible. Without pseudo-merge bitmaps,
computing the result of, say, `git rev-list --use-bitmap-index --count
--objects --tags` would likely require a large amount of fill-in
traversal. But when a large number of those tags are stored together in
a pseudo-merge bitmap, the bitmap machinery can take advantage of the
fact that we only care about the union of objects reachable from all of
those tags, and answer the query much faster.

SEE ALSO
--------
linkgit:git-pack-objects[1]