git/Documentation
Derrick Stolee 28cb5e66dd maintenance: add prefetch task
When working with very large repositories, an incremental 'git fetch'
command can download a large amount of data. If there are many other
users pushing to a common repo, then this data can rival the initial
pack-file size of a 'git clone' of a medium-size repo.

Users may want to keep the data on their local repos as close as
possible to the data on the remote repos by fetching periodically in
the background. This can break up a large daily fetch into several
smaller hourly fetches.

The task is called "prefetch" because it is work done in advance
of a foreground fetch to make that 'git fetch' command much faster.

However, if we simply ran 'git fetch <remote>' in the background,
then the user running a foreground 'git fetch <remote>' would lose
some important feedback when a new branch appears or an existing
branch updates. This is especially true if a remote branch is
force-updated and this isn't noticed by the user because it occurred
in the background. Further, the functionality of 'git push
--force-with-lease' becomes suspect.

When running 'git fetch <remote> <options>' in the background, use
the following options for careful updating:

1. --no-tags prevents the background fetch from downloading new
   tags, so that users still see new tags appear in their
   foreground fetches.

2. --refmap= removes the configured refspec which usually updates
   refs/remotes/<remote>/* with the refs advertised by the remote.
   While this looks confusing, this was documented and tested by
   b40a50264a (fetch: document and test --refmap="", 2020-01-21),
   including this sentence in the documentation:

	Providing an empty `<refspec>` to the `--refmap` option
	causes Git to ignore the configured refspecs and rely
	entirely on the refspecs supplied as command-line arguments.

3. By adding a new refspec "+refs/heads/*:refs/prefetch/<remote>/*"
   we can ensure that we actually load the new values somewhere in
   our refspace while not updating refs/heads or refs/remotes. By
   storing these refs here, the commit-graph job will update the
   commit-graph with the commits from these hidden refs.

4. --prune will delete the refs/prefetch/<remote> refs that no
   longer appear on the remote.

5. --no-write-fetch-head prevents updating FETCH_HEAD.
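
As a rough sketch, the five options above might combine into a single
background command per remote. The helper name `prefetch_remote` and
the `GIT` override used for dry runs are illustrative assumptions and
not part of Git; the actual maintenance task may pass additional
options.

```shell
# prefetch_remote REMOTE
# Fetch REMOTE's branches into refs/prefetch/REMOTE/* without touching
# refs/remotes, tags, or FETCH_HEAD. (Hypothetical helper for illustration.)
prefetch_remote() {
    remote=$1
    # ${GIT:-git} runs git by default; setting GIT=echo prints the
    # arguments instead of fetching (an assumption of this sketch).
    ${GIT:-git} fetch \
        --prune \
        --no-tags \
        --no-write-fetch-head \
        --refmap= \
        "$remote" \
        "+refs/heads/*:refs/prefetch/$remote/*"
}
```

Calling `prefetch_remote origin` would store the remote's branches
under refs/prefetch/origin/, where `git for-each-ref refs/prefetch/`
can list them. A later foreground 'git fetch origin' still reports
every ref update, since refs/remotes/origin/* was left untouched.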

We've been using this step as a critical background job in Scalar
[1] (and VFS for Git). This solved a pain point that was showing up
in user reports: fetching was a pain! Users do not like waiting to
download the data that was created while they were away from their
machines. After implementing background fetch, the foreground fetch
commands sped up significantly because they mostly just update refs
and download a small amount of new data. The effect is especially
dramatic when paired with --no-show-forced-updates (through
fetch.showForcedUpdates=false).

[1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/FetchStep.cs

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:53:04 -07:00
RelNotes Eighth batch 2020-08-17 17:02:50 -07:00
config maintenance: add auto condition for commit-graph task 2020-09-17 11:30:05 -07:00
howto Merge branch 'js/pu-to-seen' 2020-07-06 22:09:16 -07:00
technical Merge branch 'bc/sha-256-part-3' 2020-08-11 18:04:11 -07:00
.gitattributes
.gitignore
CodingGuidelines Merge branch 'dl/python-2.7-is-the-floor-version' 2020-06-17 21:54:05 -07:00
Makefile git.txt: add list of guides 2020-08-04 18:34:02 -07:00
MyFirstContribution.txt docs: adjust for the recent rename of `pu` to `seen` 2020-06-25 09:18:53 -07:00
MyFirstObjectWalk.txt MyFirstObjectWalk: remove unnecessary conditional statement 2020-03-30 11:16:41 -07:00
SubmittingPatches Merge branch 'js/pu-to-seen' 2020-07-06 22:09:16 -07:00
asciidoc.conf Doc: drop support for docbook-xsl before 1.72.0 2020-03-29 09:25:38 -07:00
asciidoctor-extensions.rb
blame-options.txt blame-options.txt: document --first-parent option 2020-08-06 14:08:10 -07:00
build-docdep.perl
cat-texi.perl
cmd-list.perl git.txt: add list of guides 2020-08-04 18:34:02 -07:00
config.txt maintenance: create maintenance.<task>.enabled config 2020-09-17 11:30:05 -07:00
date-formats.txt date-formats.txt: fix list continuation 2020-05-18 13:18:56 -07:00
diff-format.txt doc: indent multi-line items in list 2019-12-13 12:18:07 -08:00
diff-generate-patch.txt
diff-options.txt doc/git-log: move "-t" into diff-options list 2020-07-29 13:44:03 -07:00
doc-diff doc-diff: use single-colon rule in rendering Makefile 2020-02-18 13:53:30 -08:00
docbook-xsl.css
docbook.xsl
everyday.txto
fetch-options.txt maintenance: replace run_auto_gc() 2020-09-17 11:30:05 -07:00
fix-texi.perl
git-add.txt add: support the --pathspec-from-file option 2019-12-04 10:10:37 -08:00
git-am.txt Documentation: document am --no-gpg-sign 2020-04-03 11:37:22 -07:00
git-annotate.txt
git-apply.txt
git-archimport.txt
git-archive.txt
git-bisect-lk2009.txt Merge branch 'dl/lore-is-the-archive' 2019-12-06 15:09:24 -08:00
git-bisect.txt bisect: introduce first-parent flag 2020-08-07 15:13:03 -07:00
git-blame.txt
git-branch.txt docs: add missing diamond brackets 2020-06-24 09:14:21 -07:00
git-bugreport.txt Merge branch 'es/bugreport-shell' 2020-06-08 18:06:28 -07:00
git-bundle.txt bundle: add new version for use with SHA-256 2020-07-30 09:16:48 -07:00
git-cat-file.txt cat-file: add missing [=<format>] to usage/synopsis 2020-07-01 15:54:05 -07:00
git-check-attr.txt
git-check-ignore.txt Merge branch 'en/check-ignore' into maint 2020-03-17 15:02:23 -07:00
git-check-mailmap.txt
git-check-ref-format.txt
git-checkout-index.txt
git-checkout.txt doc: --recurse-submodules mostly applies to active submodules 2020-04-06 13:42:43 -07:00
git-cherry-pick.txt cherry-pick/revert: honour --no-gpg-sign in all case 2020-04-03 11:37:22 -07:00
git-cherry.txt
git-citool.txt
git-clean.txt
git-clone.txt maintenance: replace run_auto_gc() 2020-09-17 11:30:05 -07:00
git-column.txt
git-commit-graph.txt Merge branch 'ds/commit-graph-bloom-updates' into master 2020-07-30 13:20:31 -07:00
git-commit-tree.txt Documentation: merge commit-tree --[no-]gpg-sign 2020-04-03 11:37:22 -07:00
git-commit.txt Documentation: reword commit --no-gpg-sign 2020-04-03 11:37:22 -07:00
git-config.txt config: add '--show-scope' to print the scope of a config value 2020-02-10 10:49:12 -08:00
git-count-objects.txt
git-credential-cache--daemon.txt
git-credential-cache.txt
git-credential-store.txt Merge branch 'cb/credential-store-ignore-bogus-lines' 2020-05-08 14:25:01 -07:00
git-credential.txt git-credential.txt: use list continuation 2020-05-18 13:19:33 -07:00
git-cvsexportcommit.txt
git-cvsimport.txt
git-cvsserver.txt
git-daemon.txt
git-describe.txt
git-diff-files.txt
git-diff-index.txt
git-diff-tree.txt
git-diff.txt git-diff.txt: reorder possible usages 2020-07-13 12:47:38 -07:00
git-difftool.txt
git-fast-export.txt fast-export: allow seeding the anonymized mapping 2020-06-25 14:19:23 -07:00
git-fast-import.txt Merge branch 'en/fast-import-looser-date' 2020-06-02 13:35:05 -07:00
git-fetch-pack.txt
git-fetch.txt docs: adjust for the recent rename of `pu` to `seen` 2020-06-25 09:18:53 -07:00
git-filter-branch.txt git-filter-branch.txt: wrap "maths" notation in backticks 2020-02-04 12:17:18 -08:00
git-fmt-merge-msg.txt
git-for-each-ref.txt ref-filter: add support for %(contents:size) 2020-07-16 10:46:55 -07:00
git-format-patch.txt format-patch: teach --no-encode-email-headers 2020-04-07 22:37:18 -07:00
git-fsck-objects.txt
git-fsck.txt
git-gc.txt
git-get-tar-commit-id.txt
git-grep.txt Merge branch 'mt/grep-cquote-path' 2020-04-28 15:50:09 -07:00
git-gui.txt
git-hash-object.txt
git-help.txt help: drop usage of 'common' and 'useful' for guides 2020-08-04 18:34:01 -07:00
git-http-backend.txt
git-http-fetch.txt http-fetch: support fetching packfiles by URL 2020-06-10 18:06:34 -07:00
git-http-push.txt
git-imap-send.txt
git-index-pack.txt Merge branch 'jb/doc-packfile-name' into master 2020-07-30 21:34:32 -07:00
git-init-db.txt
git-init.txt init: allow specifying the initial branch name for the new repository 2020-06-24 09:14:21 -07:00
git-instaweb.txt
git-interpret-trailers.txt
git-log.txt Merge branch 'so/log-diff-merges-opt' 2020-08-17 17:02:50 -07:00
git-ls-files.txt doc: --recurse-submodules mostly applies to active submodules 2020-04-06 13:42:43 -07:00
git-ls-remote.txt docs: adjust for the recent rename of `pu` to `seen` 2020-06-25 09:18:53 -07:00
git-ls-tree.txt
git-mailinfo.txt
git-mailsplit.txt
git-maintenance.txt maintenance: add prefetch task 2020-09-25 10:53:04 -07:00
git-merge-base.txt
git-merge-file.txt
git-merge-index.txt
git-merge-one-file.txt
git-merge-tree.txt
git-merge.txt Doc: reference the "stash list" in autostash docs 2020-05-05 16:07:30 -07:00
git-mergetool--lib.txt
git-mergetool.txt
git-mktag.txt
git-mktree.txt
git-multi-pack-index.txt multi-pack-index: respect repack.packKeptObjects=false 2020-05-10 09:50:55 -07:00
git-mv.txt
git-name-rev.txt
git-notes.txt docs: improve the example that illustrates git-notes path names 2020-08-03 12:40:09 -07:00
git-p4.txt git-p4: add p4 submit hooks 2020-02-14 08:58:53 -08:00
git-pack-objects.txt pack-objects: no fetch when allow-{any,promisor} 2020-08-06 13:01:03 -07:00
git-pack-redundant.txt
git-pack-refs.txt
git-parse-remote.txt
git-patch-id.txt
git-prune-packed.txt
git-prune.txt
git-pull.txt Merge branch 'dl/merge-autostash' 2020-04-29 16:15:27 -07:00
git-push.txt
git-quiltimport.txt
git-range-diff.txt Merge branch 'dl/range-diff-with-notes' 2019-12-05 12:52:44 -08:00
git-read-tree.txt doc: --recurse-submodules mostly applies to active submodules 2020-04-06 13:42:43 -07:00
git-rebase.txt Merge branch 'ma/rebase-doc-typofix' into master 2020-07-09 14:00:45 -07:00
git-receive-pack.txt
git-reflog.txt
git-remote-ext.txt
git-remote-fd.txt
git-remote-helpers.txto
git-remote.txt
git-repack.txt
git-replace.txt
git-request-pull.txt
git-rerere.txt
git-reset.txt doc: document --recurse-submodules for reset and restore 2020-04-06 13:42:43 -07:00
git-restore.txt Merge branch 'es/restore-staged-from-head-by-default' 2020-05-08 14:25:08 -07:00
git-rev-list.txt git-log.txt: include rev-list-description.txt 2020-07-08 22:08:54 -07:00
git-rev-parse.txt rev-parse: make --show-toplevel without a worktree an error 2019-11-20 10:19:58 +09:00
git-revert.txt cherry-pick/revert: honour --no-gpg-sign in all case 2020-04-03 11:37:22 -07:00
git-rm.txt rm: support the --pathspec-from-file option 2020-02-19 10:56:49 -08:00
git-send-email.txt
git-send-pack.txt
git-sh-i18n--envsubst.txt
git-sh-i18n.txt
git-sh-setup.txt
git-shell.txt
git-shortlog.txt
git-show-branch.txt
git-show-index.txt builtin/show-index: provide options to determine hash algo 2020-05-27 10:07:07 -07:00
git-show-ref.txt
git-show.txt
git-sparse-checkout.txt Merge branch 'en/sparse-with-submodule-doc' 2020-06-22 15:55:03 -07:00
git-stage.txt
git-stash.txt stash push: support the --pathspec-from-file option 2020-02-19 10:56:49 -08:00
git-status.txt
git-stripspace.txt
git-submodule.txt submodule: fall back to remote's HEAD for missing remote.<name>.branch 2020-06-24 09:14:21 -07:00
git-svn.txt git svn: stop using `rebase --preserve-merges` 2019-11-23 09:49:23 +09:00
git-switch.txt doc: --recurse-submodules mostly applies to active submodules 2020-04-06 13:42:43 -07:00
git-symbolic-ref.txt
git-tag.txt
git-tools.txt
git-unpack-file.txt
git-unpack-objects.txt
git-update-index.txt doc: dissuade users from trying to ignore tracked files 2020-01-22 12:27:49 -08:00
git-update-ref.txt Modify pseudo refs through ref backend storage 2020-07-27 10:06:49 -07:00
git-update-server-info.txt
git-upload-archive.txt
git-upload-pack.txt
git-var.txt
git-verify-commit.txt
git-verify-pack.txt
git-verify-tag.txt
git-web--browse.txt
git-whatchanged.txt
git-worktree.txt git-worktree.txt: link to man pages when citing other Git commands 2020-08-03 21:32:41 -07:00
git-write-tree.txt
git.txt git.txt: add list of guides 2020-08-04 18:34:02 -07:00
gitattributes.txt userdiff: support Markdown 2020-05-02 18:04:12 -07:00
gitcli.txt Merge branch 'jc/doc-single-h-is-for-help' into maint 2020-03-17 15:02:24 -07:00
gitcore-tutorial.txt doc/gitcore-tutorial: fix prose to match example command 2020-01-08 08:56:40 -08:00
gitcredentials.txt command-list.txt: add missing 'gitcredentials' and 'gitremote-helpers' 2020-08-04 18:34:01 -07:00
gitcvs-migration.txt
gitdiffcore.txt
giteveryday.txt docs: adjust for the recent rename of `pu` to `seen` 2020-06-25 09:18:53 -07:00
gitfaq.txt Merge branch 'ss/faq-ignore' 2020-05-26 09:32:08 -07:00
gitglossary.txt
githooks.txt githooks.txt: use correct "reference-transaction" hook name 2020-07-24 13:53:58 -07:00
gitignore.txt
gitk.txt doc: log, gitk: line-log arguments must exist in starting revision 2019-12-26 11:00:15 -08:00
gitmodules.txt submodule: fall back to remote's HEAD for missing remote.<name>.branch 2020-06-24 09:14:21 -07:00
gitnamespaces.txt
gitremote-helpers.txt Merge branch 'bc/sha-256-part-2' 2020-07-06 22:09:13 -07:00
gitrepository-layout.txt
gitrevisions.txt
gitsubmodules.txt doc: list all commands affected by submodule.recurse 2020-04-06 13:42:43 -07:00
gittutorial-2.txt
gittutorial.txt
gitweb.conf.txt
gitweb.txt
gitworkflows.txt gitworkflows.txt: fix broken subsection underline 2020-07-18 13:43:34 -07:00
glossary-content.txt
howto-index.sh
i18n.txt
install-doc-quick.sh
install-webdoc.sh
line-range-format.txt
lint-gitlink.perl
mailmap.txt
manpage-base-url.xsl.in
manpage-bold-literal.xsl manpage-bold-literal.xsl: stop using git.docbook.backslash 2020-03-29 09:25:38 -07:00
manpage-normal.xsl manpage-normal.xsl: fold in manpage-base.xsl 2020-03-29 09:25:38 -07:00
manpage-quote-apos.xsl
manpage.xsl
merge-options.txt Merge branch 'dl/merge-autostash' 2020-04-29 16:15:27 -07:00
merge-strategies.txt
pretty-formats.txt Merge branch 'mk/pb-pretty-email-without-domain-part-fix' 2020-07-06 22:09:15 -07:00
pretty-options.txt Merge branch 'dl/pretty-reference' 2019-12-10 13:11:43 -08:00
pull-fetch-param.txt pull doc: refer to a specific section in 'fetch' doc 2020-04-05 15:00:03 -07:00
rev-list-description.txt git-log.txt: include rev-list-description.txt 2020-07-08 22:08:54 -07:00
rev-list-options.txt Merge branch 'jk/log-fp-implies-m' 2020-08-17 17:02:49 -07:00
revisions.txt revisions.txt: describe 'rev1 rev2 ...' meaning for ranges 2020-07-08 22:08:53 -07:00
sequencer.txt
texi.xsl
trace2-target-values.txt
transfer-data-leaks.txt
urls-remotes.txt
urls.txt
user-manual.conf user-manual.conf: don't specify [listingblock] 2020-03-31 16:08:02 -07:00
user-manual.txt docs: adjust for the recent rename of `pu` to `seen` 2020-06-25 09:18:53 -07:00