pack-bitmap: sort bitmaps before XORing

Reachability bitmaps may be stored as XORs against nearby bitmaps, up to
10 away. However, when callers provide selected commits in an arbitrary
order, the writer may miss good ancestor/descendant pairs and produce
much larger bitmap files without changing query coverage.
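
To illustrate why input order matters, here is a minimal sketch (not the
writer's actual code). It assumes each bitmap fits in a single 64-bit word
and uses the number of set bits as a crude proxy for compressed size. The
names popcount64(), total_xor_cost(), and the window parameter are
hypothetical; the real writer uses a fixed search distance of 10.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count set bits; a crude stand-in for the compressed size of a bitmap. */
static unsigned popcount64(uint64_t x)
{
	unsigned n = 0;

	while (x) {
		x &= x - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Total cost of storing bitmaps[0..nr) in the given order, when each
 * entry may be stored either verbatim or XORed against one of the
 * `window` entries immediately before it, whichever is cheaper.
 */
unsigned total_xor_cost(const uint64_t *bitmaps, size_t nr, size_t window)
{
	unsigned total = 0;

	assert(bitmaps != NULL || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		unsigned best = popcount64(bitmaps[i]); /* verbatim cost */
		size_t lo = i > window ? i - window : 0;

		for (size_t j = lo; j < i; j++) {
			unsigned cost = popcount64(bitmaps[i] ^ bitmaps[j]);

			if (cost < best)
				best = cost;
		}
		total += best;
	}
	return total;
}
```

With a window of 1 and the nested bitmaps 0x1, 0x3, 0x7, 0xF (each a
superset of the last, as with an ancestor chain), the cost in that order
is 4. Reordered as 0x1, 0xF, 0x3, 0x7, the cost rises to 7, even though
the same four bitmaps are stored.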

Sort the selected bitmaps in date order (from oldest to newest) before
computing XOR offsets. Pseudo-merge bitmaps are left alone; they will be
dealt with separately in following commits.
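
The idea of sorting only the commit entries can be sketched as follows.
This is a simplified illustration, not the patch itself: struct entry and
sort_commit_prefix() are hypothetical names, and the sketch assumes that
pseudo-merge entries sit after the regular commits in the same array, so
that sorting a prefix leaves them in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a selected bitmap entry. */
struct entry {
	uint64_t date;    /* commit timestamp */
	int pseudo_merge; /* nonzero for pseudo-merge entries */
};

/*
 * Compare explicitly rather than subtracting: the difference of two
 * unsigned 64-bit timestamps does not fit in the int that qsort expects.
 */
static int entry_date_cmp(const void *va, const void *vb)
{
	const struct entry *a = va;
	const struct entry *b = vb;

	if (a->date < b->date)
		return -1;
	if (a->date > b->date)
		return 1;
	return 0;
}

/* Sort entries[0..nr_commits) oldest first; later entries are untouched. */
void sort_commit_prefix(struct entry *entries, size_t nr_commits)
{
	assert(entries != NULL || nr_commits == 0);

	if (nr_commits > 1)
		qsort(entries, nr_commits, sizeof(*entries), entry_date_cmp);
}
```

Note that qsort() moves the structs themselves, so any pointer taken into
the array before sorting may refer to a different entry afterwards. Any
lookup table that stores such pointers has to be refreshed after the sort.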

On the same testing repository used in previous commits, this change
shrank our selection of 1,261 bitmaps from ~635.46 MiB to ~176.4 MiB, a
~72.24% reduction in the on-disk size of our *.bitmap file. The time to
generate the smaller bitmap file decreased by ~3.69 seconds, though this
is likely mostly noise.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau 2026-05-27 15:56:05 -04:00 committed by Junio C Hamano
parent c720bbcc53
commit dcccd99746
1 changed file with 29 additions and 0 deletions

@@ -327,11 +327,40 @@ missing:
	return 0;
}

static int bitmapped_commit_date_cmp(const void *_a, const void *_b)
{
	const struct bitmapped_commit *a = _a;
	const struct bitmapped_commit *b = _b;

	if (a->commit->date < b->commit->date)
		return -1;
	if (a->commit->date > b->commit->date)
		return 1;
	return 0;
}

static void compute_xor_offsets(struct bitmap_writer *writer)
{
	static const int MAX_XOR_OFFSET_SEARCH = 10;

	int i, next = 0;
	int nr = bitmap_writer_nr_selected_commits(writer);

	if (nr > 1) {
		QSORT(writer->selected, nr, bitmapped_commit_date_cmp);

		for (i = 0; i < nr; i++) {
			struct bitmapped_commit *stored = &writer->selected[i];
			khiter_t hash_pos = kh_get_oid_map(writer->bitmaps,
							   stored->commit->object.oid);

			if (hash_pos == kh_end(writer->bitmaps))
				BUG("selected commit missing from bitmap map: %s",
				    oid_to_hex(&stored->commit->object.oid));

			kh_value(writer->bitmaps, hash_pos) = stored;
		}
	}

	while (next < writer->selected_nr) {
		struct bitmapped_commit *stored = &writer->selected[next];