git/Documentation
Vicent Marti ae4f07fbcc pack-bitmap: implement optional name_hash cache
When we use pack bitmaps rather than walking the object
graph, we end up with the list of objects to include in the
packfile, but we do not know the path at which any tree or
blob objects would be found.

In a recently packed repository, this is fine. A fetch would
use the paths only as a heuristic in the delta compression
phase, and a fully packed repository should not need to do
much delta compression.

As time passes, though, we may acquire more objects on top
of our large bitmapped pack. If clients fetch frequently,
then they never even look at the bitmapped history, and all
works as usual. However, a client who has not fetched since
the last bitmap repack will have "have" tips in the
bitmapped history, but "want" newer objects.

The bitmaps themselves degrade gracefully in this
circumstance. We manually walk the more recent bits of
history, and then use bitmaps when we hit them.

But we would also like to perform delta compression between
the newer objects and the bitmapped objects (both to delta
against what we know the user already has, but also between
"new" and "old" objects that the user is fetching). The lack
of pathnames makes our delta heuristics much less effective.

This patch adds an optional cache of the 32-bit name_hash
values to the end of the bitmap file. If present, a reader
can use it to match bitmapped and non-bitmapped names during
delta compression.

Here are perf results for p5310:

Test                      origin/master       HEAD^                      HEAD
-------------------------------------------------------------------------------------------------
5310.2: repack to disk    36.81(37.82+1.43)   47.70(48.74+1.41) +29.6%   47.75(48.70+1.51) +29.7%
5310.3: simulated clone   30.78(29.70+2.14)   1.08(0.97+0.10) -96.5%     1.07(0.94+0.12) -96.5%
5310.4: simulated fetch   3.16(6.10+0.08)     3.54(10.65+0.06) +12.0%    1.70(3.07+0.06) -46.2%
5310.6: partial bitmap    36.76(43.19+1.81)   6.71(11.25+0.76) -81.7%    4.08(6.26+0.46) -88.9%

You can see that the time spent on an incremental fetch goes
down, as our delta heuristics are able to do their work.
And we save time on the partial bitmap clone for the same
reason.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:23 -08:00
..
RelNotes Update draft release notes to 1.8.5 2013-10-23 13:37:27 -07:00
howto Merge branch 'rj/doc-formatting-fix' 2013-10-14 11:07:50 -07:00
technical pack-bitmap: implement optional name_hash cache 2013-12-30 12:19:23 -08:00
.gitattributes
.gitignore
CodingGuidelines CodingGuidelines: style for multi-line comments 2013-10-14 12:48:06 -07:00
Makefile Documentation/Makefile: make AsciiDoc dblatex dir configurable 2013-10-03 12:21:19 -07:00
SubmittingPatches Provide some linguistic guidance for the documentation. 2013-08-01 13:13:52 -07:00
asciidoc.conf
blame-options.txt blame: document multiple -L support 2013-08-06 14:34:43 -07:00
build-docdep.perl
cat-texi.perl
cmd-list.perl
config.txt pack-bitmap: implement optional name_hash cache 2013-12-30 12:19:23 -08:00
date-formats.txt
diff-config.txt Improve documentation concerning the status.submodulesummary setting 2013-09-11 12:20:41 -07:00
diff-format.txt
diff-generate-patch.txt
diff-options.txt Merge branch 'mm/diff-no-patch-synonym-to-s' 2013-07-22 11:23:27 -07:00
docbook-xsl.css
docbook.xsl
everyday.txt Documentation: make AsciiDoc links always point to HTML files 2013-09-06 14:49:06 -07:00
fetch-options.txt Merge branch 'rr/maint-fetch-tag-doc-asterisks' into maint 2013-07-21 22:51:45 -07:00
fix-texi.perl
git-add.txt
git-am.txt am: replace uses of --resolved with --continue 2013-06-27 09:37:12 -07:00
git-annotate.txt
git-apply.txt
git-archimport.txt
git-archive.txt
git-bisect-lk2009.txt typofix: documentation 2013-07-22 16:06:48 -07:00
git-bisect.txt
git-blame.txt blame: document multiple -L support 2013-08-06 14:34:43 -07:00
git-branch.txt Refer to branch.<name>.remote/merge when documenting --track 2013-09-09 11:03:01 -07:00
git-bundle.txt
git-cat-file.txt Merge branch 'rh/ishes-doc' 2013-09-17 11:42:51 -07:00
git-check-attr.txt Merge branch 'jc/check-x-z' 2013-09-04 12:23:25 -07:00
git-check-ignore.txt check-ignore: Add option to ignore index contents 2013-09-12 15:40:29 -07:00
git-check-mailmap.txt builtin: add git-check-mailmap command 2013-07-13 10:19:37 -07:00
git-check-ref-format.txt Add new @ shortcut for HEAD 2013-09-12 14:39:34 -07:00
git-checkout-index.txt
git-checkout.txt checkout: update synopsys and documentation on detaching HEAD 2013-09-11 12:32:01 -07:00
git-cherry-pick.txt
git-cherry.txt doc: don't claim that cherry calls patch-id 2013-09-24 15:54:48 -07:00
git-citool.txt
git-clean.txt Documentation/git-clean: fix description for range 2013-07-24 19:16:13 -07:00
git-clone.txt Revert "git-clone.txt: remove the restriction on pushing from a shallow clone" 2013-07-15 08:35:32 -07:00
git-column.txt
git-commit-tree.txt
git-commit.txt
git-config.txt Merge branch 'jc/url-match' 2013-09-09 14:50:36 -07:00
git-count-objects.txt
git-credential-cache--daemon.txt
git-credential-cache.txt
git-credential-store.txt
git-credential.txt Documentation: make AsciiDoc links always point to HTML files 2013-09-06 14:49:06 -07:00
git-cvsexportcommit.txt
git-cvsimport.txt
git-cvsserver.txt
git-daemon.txt
git-describe.txt use 'commit-ish' instead of 'committish' 2013-09-04 15:03:03 -07:00
git-diff-files.txt
git-diff-index.txt
git-diff-tree.txt
git-diff.txt diff --no-index: describe in a separate paragraph 2013-08-28 15:17:18 -07:00
git-difftool.txt
git-fast-export.txt Documentation: Update 'linux-2.6.git' -> 'linux.git' 2013-06-22 23:36:48 -07:00
git-fast-import.txt Merge branch 'rh/ishes-doc' 2013-09-17 11:42:51 -07:00
git-fetch-pack.txt Merge branch 'nd/clone-connectivity-shortcut' 2013-09-09 14:30:01 -07:00
git-fetch.txt
git-filter-branch.txt
git-fmt-merge-msg.txt
git-for-each-ref.txt
git-format-patch.txt format-patch doc: Thunderbird wraps lines unless mailnews.wraplength=0 2013-10-14 16:20:01 -07:00
git-fsck-objects.txt
git-fsck.txt
git-gc.txt Merge branch 'nd/gc-lock-against-each-other' 2013-09-04 12:35:34 -07:00
git-get-tar-commit-id.txt
git-grep.txt Merge branch 'mg/more-textconv' 2013-10-23 13:21:31 -07:00
git-gui.txt
git-hash-object.txt
git-help.txt
git-http-backend.txt
git-http-fetch.txt
git-http-push.txt
git-imap-send.txt
git-index-pack.txt
git-init-db.txt
git-init.txt
git-instaweb.txt
git-log.txt line-range-format.txt: clarify -L:regex usage form 2013-08-06 14:26:26 -07:00
git-lost-found.txt
git-ls-files.txt
git-ls-remote.txt ls-remote doc: don't encourage use of branches-file 2013-06-23 00:33:58 -07:00
git-ls-tree.txt
git-mailinfo.txt
git-mailsplit.txt
git-merge-base.txt
git-merge-file.txt Documentation/git-merge-file: document option "--diff3" 2013-08-09 14:19:59 -07:00
git-merge-index.txt
git-merge-one-file.txt
git-merge-tree.txt use 'tree-ish' instead of 'treeish' 2013-09-04 15:02:56 -07:00
git-merge.txt git-merge: document the -S option 2013-10-18 12:47:33 -07:00
git-mergetool--lib.txt
git-mergetool.txt
git-mktag.txt
git-mktree.txt
git-mv.txt mv: update the path entry in .gitmodules for moved submodules 2013-08-06 14:10:35 -07:00
git-name-rev.txt use 'commit-ish' instead of 'committish' 2013-09-04 15:03:03 -07:00
git-notes.txt
git-p4.txt Change "remote tracking" to "remote-tracking" 2013-07-03 13:27:15 -07:00
git-pack-objects.txt
git-pack-redundant.txt
git-pack-refs.txt Documentation: remove --prune from pack-refs examples 2013-07-18 16:23:46 -07:00
git-parse-remote.txt
git-patch-id.txt
git-peek-remote.txt
git-prune-packed.txt git-prune-packed.txt: fix reference to GIT_OBJECT_DIRECTORY 2013-10-15 16:01:22 -07:00
git-prune.txt Documentation: fix git-prune example usage 2013-07-18 16:23:51 -07:00
git-pull.txt pull: allow pull to preserve merges when rebasing 2013-09-04 12:45:48 -07:00
git-push.txt Merge branch 'rh/ishes-doc' 2013-09-17 11:42:51 -07:00
git-quiltimport.txt
git-read-tree.txt
git-rebase.txt Documentation: make AsciiDoc links always point to HTML files 2013-09-06 14:49:06 -07:00
git-receive-pack.txt
git-reflog.txt
git-relink.txt
git-remote-ext.txt
git-remote-fd.txt
git-remote-helpers.txto
git-remote-testgit.txt
git-remote.txt remote doc: document long forms of set-head options 2013-09-27 16:49:18 -07:00
git-repack.txt repack: consider bitmaps when performing repacks 2013-12-30 12:19:23 -08:00
git-replace.txt Doc: 'replace' merge and non-merge commits 2013-09-09 08:16:30 -07:00
git-repo-config.txt
git-request-pull.txt
git-rerere.txt
git-reset.txt Documentation: "git reset <tree-ish> <pathspec>" takes a tree-ish, not tree-sh 2013-07-19 10:15:09 -07:00
git-rev-list.txt rev-list: add bitmap mode to speed up object lists 2013-12-30 12:19:22 -08:00
git-rev-parse.txt Merge branch 'rj/doc-rev-parse' 2013-08-30 10:08:13 -07:00
git-revert.txt Documentation: make AsciiDoc links always point to HTML files 2013-09-06 14:49:06 -07:00
git-rm.txt rm: delete .gitmodules entry of submodules removed from the work tree 2013-08-06 14:11:00 -07:00
git-send-email.txt send-email: be explicit with SSL certificate verification 2013-07-18 16:01:30 -07:00
git-send-pack.txt
git-sh-i18n--envsubst.txt
git-sh-i18n.txt
git-sh-setup.txt Merge branch 'jc/reflog-doc' 2013-10-18 13:50:12 -07:00
git-shell.txt
git-shortlog.txt
git-show-branch.txt
git-show-index.txt
git-show-ref.txt Merge branch 'db/show-ref-head' 2013-07-22 11:23:56 -07:00
git-show.txt Documentation/git-show.txt: include common diff options, like git-log.txt 2013-07-17 17:50:56 -07:00
git-stage.txt
git-stash.txt Revert "git stash: avoid data loss when "git stash save" kills a directory" 2013-08-14 09:53:43 -07:00
git-status.txt Improve documentation concerning the status.submodulesummary setting 2013-09-11 12:20:41 -07:00
git-stripspace.txt
git-submodule.txt Merge branch 'fg/submodule-clone-depth' 2013-07-15 10:28:48 -07:00
git-svn.txt git-svn: Warn about changing default for --prefix in Git v2.0 2013-10-12 22:30:53 +00:00
git-symbolic-ref.txt
git-tag.txt Merge branch 'ds/doc-two-kinds-of-tags' 2013-07-31 12:38:21 -07:00
git-tar-tree.txt
git-tools.txt
git-unpack-file.txt
git-unpack-objects.txt
git-update-index.txt
git-update-ref.txt update-ref: support multiple simultaneous updates 2013-09-09 09:54:37 -07:00
git-update-server-info.txt
git-upload-archive.txt
git-upload-pack.txt
git-var.txt
git-verify-pack.txt
git-verify-tag.txt
git-web--browse.txt web--browse: support /usr/bin/cygstart on Cygwin 2013-06-21 09:05:15 -07:00
git-whatchanged.txt whatchanged: document its historical nature 2013-08-13 09:01:54 -07:00
git-write-tree.txt
git.txt Merge branch 'jc/reflog-doc' 2013-10-18 13:50:12 -07:00
gitattributes.txt
gitcli.txt Merge branch 'po/dot-url' 2013-10-23 13:21:48 -07:00
gitcore-tutorial.txt core-tutorial: trim the section on Inspecting Changes 2013-08-13 09:01:52 -07:00
gitcredentials.txt
gitcvs-migration.txt Documentation: make AsciiDoc links always point to HTML files 2013-09-06 14:49:06 -07:00
gitdiffcore.txt
gitglossary.txt
githooks.txt
gitignore.txt
gitk.txt
gitmodules.txt Improve documentation concerning the status.submodulesummary setting 2013-09-11 12:20:41 -07:00
gitnamespaces.txt
gitremote-helpers.txt Merge branch 'mm/remote-helpers-doc' 2013-09-12 14:41:50 -07:00
gitrepository-layout.txt
gitrevisions.txt
gittutorial-2.txt
gittutorial.txt
gitweb.conf.txt gitweb: allow extra breadcrumbs to prefix the trail 2013-07-04 21:52:15 -07:00
gitweb.txt typofix: documentation 2013-07-22 16:06:48 -07:00
gitworkflows.txt
glossary-content.txt Merge branch 'rh/ishes-doc' 2013-09-17 11:42:51 -07:00
howto-index.sh
i18n.txt
install-doc-quick.sh
install-webdoc.sh
line-range-format.txt line-range: teach -L^:RE to search from start of file 2013-08-06 14:48:02 -07:00
mailmap.txt
manpage-1.72.xsl
manpage-base-url.xsl.in
manpage-base.xsl
manpage-bold-literal.xsl
manpage-normal.xsl
manpage-quote-apos.xsl
manpage-suppress-sp.xsl
merge-config.txt
merge-options.txt Merge branch 'rh/merge-options-doc-fix' into maint 2013-07-03 15:36:30 -07:00
merge-strategies.txt
pretty-formats.txt
pretty-options.txt log doc: the argument to --encoding is not optional 2013-08-05 08:19:47 -07:00
pull-fetch-param.txt
rev-list-options.txt rev-list: add bitmap mode to speed up object lists 2013-12-30 12:19:22 -08:00
revisions.txt Merge branch 'fc/at-head' 2013-09-20 12:38:10 -07:00
sequencer.txt
urls-remotes.txt
urls.txt Merge branch 'ft/doc-git-transport' into maint 2013-07-21 22:51:24 -07:00
user-manual.conf
user-manual.txt Merge branch 'ss/doclinks' 2013-09-17 11:42:54 -07:00