git/xdiff
Phillip Wood dd2a4c0c7a diff --anchored: avoid checking unmatched lines
For a line to be an anchor it has to appear in each of the files being
diffed exactly once. With that in mind, let's delay checking whether
a line is an anchor until we know there is exactly one instance of
the line in each file. As each line is checked at most once, there
is no need to cache the result of is_anchor() and we can drop that
field from the hashmap entries. When diffing 5000 recent commits in
git.git this gives a modest speedup of ~2%. In the (rather extreme)
example below, which consists largely of deletions, the speedup is ~16%.

    seq 0 10000000 >old
    printf '%s\n' 300000 100000 200000 >new
    git diff --no-index --anchored=300000 old new
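
The idea of deferring the anchor test can be illustrated with a minimal
sketch. This is not the code in xpatience.c: the helper names
(count_occurrences, find_anchors), the linear-scan counting and the
prefix-match rule in is_anchor() are assumptions made for illustration,
and the real implementation uses a hashmap rather than repeated scans.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the anchor test: a line counts as an anchor
 * if it starts with one of the given anchor texts (assumed semantics).
 */
static int is_anchor(const char *line, const char **anchors, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (!strncmp(line, anchors[i], strlen(anchors[i])))
			return 1;
	return 0;
}

/* Count how many lines of `file` are identical to `line`. */
static size_t count_occurrences(const char *line, const char **file,
				size_t nr)
{
	size_t n = 0;

	for (size_t i = 0; i < nr; i++)
		if (!strcmp(line, file[i]))
			n++;
	return n;
}

/*
 * Return the number of anchors among the lines of `a`. is_anchor() is
 * only called for lines that occur exactly once in each file, so every
 * line is tested at most once and no result needs to be cached.
 * `*checks` reports how many times is_anchor() was called.
 */
static size_t find_anchors(const char **a, size_t na,
			   const char **b, size_t nb,
			   const char **anchors, size_t nr_anchors,
			   size_t *checks)
{
	size_t found = 0;

	*checks = 0;
	for (size_t i = 0; i < na; i++) {
		if (count_occurrences(a[i], a, na) != 1 ||
		    count_occurrences(a[i], b, nb) != 1)
			continue; /* not unique in both files: skip the test */
		(*checks)++;
		if (is_anchor(a[i], anchors, nr_anchors))
			found++;
	}
	return found;
}
```

With a five-line "old" file of which three lines survive in "new", only
those three lines reach is_anchor(); the two deleted lines are skipped
without being tested, which is where a deletion-heavy diff saves time.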

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-12 09:28:49 -08:00
xdiff.h xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK 2025-11-17 09:31:59 -08:00
xdiffi.c xdiff: rename rindex -> reference_index 2025-11-18 14:53:11 -08:00
xdiffi.h xdiff: delete struct diffdata_t 2025-09-30 14:12:46 -07:00
xemit.c xdiff: make xdfile_t.nrec a size_t instead of long 2025-11-18 14:53:10 -08:00
xemit.h
xhistogram.c xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash 2025-11-18 14:53:10 -08:00
xinclude.h xdiff: move sign comparison warning guard into each file 2025-02-12 09:41:15 -08:00
xmacros.h
xmerge.c xdiff: make xdfile_t.nrec a size_t instead of long 2025-11-18 14:53:10 -08:00
xpatience.c diff --anchored: avoid checking unmatched lines 2026-02-12 09:28:49 -08:00
xprepare.c xdiff: rename rindex -> reference_index 2025-11-18 14:53:11 -08:00
xprepare.h
xtypes.h xdiff: rename rindex -> reference_index 2025-11-18 14:53:11 -08:00
xutils.c xdiff: use unambiguous types in xdl_hash_record() 2025-11-18 14:53:10 -08:00
xutils.h xdiff: use unambiguous types in xdl_hash_record() 2025-11-18 14:53:10 -08:00