Browse Source
We find rename candidates by computing a fingerprint hash of each file, and then comparing those fingerprints. There are inherently O(n^2) comparisons, so it pays in CPU time to hoist the (rather expensive) computation of the fingerprint out of that loop (or to cache it once we have computed it once). Previously, we didn't keep the filespec information around because then we had the potential to consume a great deal of memory. However, instead of keeping all of the filespec data, we can instead just keep the fingerprint. This patch implements and uses diff_free_filespec_data_large to accomplish that goal. We also have to change estimate_similarity not to needlessly repopulate the filespec data when we already have the hash. Practical tests showed 4.5x speedup for a 10% memory usage increase. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>maint
Jeff King
18 years ago
committed by
Junio C Hamano
3 changed files with 11 additions and 4 deletions
Loading…
Reference in new issue