git/builtin
Patrick Steinhardt 835e0aaf6f builtin/pack-objects: reduce lock contention when writing packfile data
When running `git pack-objects --stdout` we feed the data through
`hashfd_ext()` with a progress meter and a smaller-than-usual buffer
length of 8kB so that we can track throughput more granularly. But as
packfiles tend to be on the larger side, this small buffer size may
cause a ton of write(3p) syscalls.

Originally, the buffer we used in `hashfd()` was 8kB for all use cases.
This was changed though in 2ca245f8be (csum-file.h: increase hashfile
buffer size, 2021-05-18) because we noticed that the number of writes
can have an impact on performance. So the buffer size was increased to
128kB, which improved performance a bit for some use cases.

But the commit didn't touch the buffer size for `hashfd_throughput()`.
The reasoning here was that callers expect the progress indicator to
update frequently, and a larger buffer size would of course reduce the
update frequency especially on slow networks.

While that is of course true, there was (and still is, even though it's
now a call to `hashfd_ext()`) only a single caller of this function in
git-pack-objects(1). This command is responsible for writing packfiles,
and those packfiles are often on the bigger side. So arguably:

  - The user won't care about increments of 8kB when packfiles tend to
    be megabytes or even gigabytes in size.

  - Reducing the number of syscalls would be even more valuable here
    than it would be for multi-pack indices, which was the benchmark
    done in the mentioned commit, as MIDXs are typically significantly
    smaller than packfiles.

  - Nowadays, many internet connections should be able to transfer data
    at a rate significantly higher than 8kB per second.

Update the buffer to instead have a size of `LARGE_PACKET_DATA_MAX - 1`,
which translates to ~64kB. This limit was chosen because `git
pack-objects --stdout` is most often used when sending packfiles via
git-upload-pack(1), where packfile data is chunked into pktlines when
using the sideband. Furthermore, most internet connections should have a
bandwidth significantly higher than 64kB/s, so we'd still be able to
observe progress updates at a rate of at least once per second.
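
The size arithmetic above can be sketched as follows. This is only an
illustration, not the code from the patch: the two constants are copied
from Git's pkt-line.h as an assumption about its current contents, and
`pack_stdout_buffer_len()` and `updates_per_second()` are hypothetical
helpers. The `- 1` presumably leaves room for the one-byte band
designator that prefixes sideband payloads.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to match Git's pkt-line.h; not taken from this commit. */
#define LARGE_PACKET_MAX 65520
#define LARGE_PACKET_DATA_MAX (LARGE_PACKET_MAX - 4)

/* Hypothetical helper: the buffer length described in the message. */
static size_t pack_stdout_buffer_len(void)
{
	return LARGE_PACKET_DATA_MAX - 1;
}

/*
 * Hypothetical helper: upper bound on progress updates per second,
 * assuming at most one update per flushed buffer.
 */
static double updates_per_second(double bytes_per_second, size_t buf_len)
{
	assert(buf_len > 0);
	return bytes_per_second / (double)buf_len;
}
```

Under these assumptions the buffer holds 65515 bytes, so any link faster
than that many bytes per second still allows at least one update per
second.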

This change significantly reduces the number of write(3p) syscalls from
355,000 to 44,000 when packing the Linux repository. On an
otherwise-unused system the resulting performance improvement is small
and mostly negligible. More importantly though, it will
reduce lock contention in the kernel on an extremely busy system where
we have many processes writing data at once.
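
As a rough cross-check of the order of magnitude, the number of writes
can be estimated by dividing the pack size by the buffer length. The
sketch below is a simplification that assumes every write flushes a
completely full buffer; `estimated_writes()` is a hypothetical helper
and is not part of Git.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of write calls needed to emit pack_size
 * bytes when each call flushes at most buf_len bytes (ceiling division).
 */
static uint64_t estimated_writes(uint64_t pack_size, uint64_t buf_len)
{
	assert(buf_len > 0);
	return pack_size / buf_len + (pack_size % buf_len != 0);
}
```

For a hypothetical pack of 2,900,000,000 bytes (an illustrative figure,
not a measurement), `estimated_writes()` yields 354,004 calls with an
8192-byte buffer and 44,265 calls with a 65515-byte buffer, which is in
the same range as the counts quoted above.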

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-13 08:54:15 -07:00
add.c Merge branch 'jt/odb-transaction' 2025-10-02 12:26:11 -07:00
am.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
annotate.c Merge branch 'jc/a-commands-without-the-repo' 2024-10-25 14:02:36 -04:00
apply.c builtin: use default hash when outside a repository 2025-07-01 14:58:24 -07:00
archive.c archive: remove the_repository global variable 2024-10-11 09:37:18 -07:00
backfill.c packfile: split up responsibilities of `reprepare_packed_git()` 2025-09-24 11:53:50 -07:00
bisect.c Merge branch 'ps/ref-peeled-tags' 2025-11-19 10:55:39 -08:00
blame.c Merge branch 'jc/optional-path' 2025-12-05 14:49:56 +09:00
branch.c branch: advice using git-help(1) instead of man(1) 2025-12-03 00:16:05 -08:00
bugreport.c object-file: move `safe_create_leading_directories()` into "path.c" 2025-04-15 08:24:35 -07:00
bundle.c Merge branch 'jt/bundle-fsck' 2024-12-13 07:33:36 -08:00
cat-file.c Merge branch 'ps/read-object-info-improvements' 2026-01-21 08:29:00 -08:00
check-attr.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
check-ignore.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
check-mailmap.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
check-ref-format.c builtin: send usage() help text to standard output 2025-01-17 13:30:03 -08:00
checkout--worker.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
checkout-index.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
checkout.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
clean.c Merge branch 'jk/color-variable-fixes' 2025-09-29 11:40:35 -07:00
clone.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
column.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
commit-graph.c commit-graph: add new config for changed-paths & recommend it in scalar 2025-10-22 10:40:11 -07:00
commit-tree.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
commit.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
config.c Merge branch 'rs/config-set-multi-error-message-fix' 2025-12-05 14:49:59 +09:00
count-objects.c packfile: introduce macro to iterate through packs 2025-10-16 14:42:39 -07:00
credential-cache--daemon.c config: drop `git_config_get_bool()` wrapper 2025-07-23 08:15:20 -07:00
credential-cache.c Merge branch 'rj/cygwin-exit' 2024-11-01 12:53:19 -04:00
credential-store.c config: drop `git_config_get_int()` wrapper 2025-07-23 08:15:20 -07:00
credential.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
describe.c tag: support arbitrary repositories in parse_tag() 2025-12-29 22:02:54 +09:00
diagnose.c object-file: move `safe_create_leading_directories()` into "path.c" 2025-04-15 08:24:35 -07:00
diff-files.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
diff-index.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
diff-pairs.c builtin/diff-pairs: allow explicit diff queue flush 2025-03-03 08:17:47 -08:00
diff-tree.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
diff.c diff: --no-index should ignore the worktree 2025-08-09 17:22:01 -07:00
difftool.c odb: rename `repo_read_object_file()` 2025-07-01 14:46:38 -07:00
fast-export.c Merge branch 'cc/fast-import-strip-if-invalid' 2025-12-05 14:49:58 +09:00
fast-import.c packfile: move packfile store into object source 2026-01-09 06:40:07 -08:00
fetch-pack.c builtin/fetch-pack: cleanup before return error 2025-06-04 08:52:25 -07:00
fetch.c Merge branch 'kn/fix-fetch-backfill-tag-with-batched-ref-updates' 2025-12-23 11:33:17 +09:00
fmt-merge-msg.c builtin/fmt-merge-msg: stop depending on 'the_repository' 2025-08-11 09:19:40 -07:00
for-each-ref.c Merge branch 'ms/refs-list' 2025-08-22 13:13:20 -07:00
for-each-repo.c global: trivial conversions to fix `-Wsign-compare` warnings 2024-12-06 20:20:04 +09:00
fsck.c Merge branch 'ps/ref-consistency-checks' 2026-01-21 08:28:58 -08:00
fsmonitor--daemon.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
gc.c builtin/gc: fix condition for whether to write commit graphs 2026-01-07 09:16:50 +09:00
get-tar-commit-id.c builtin: send usage() help text to standard output 2025-01-17 13:30:03 -08:00
grep.c packfile: only prepare owning store in `packfile_store_prepare()` 2026-01-09 06:40:07 -08:00
hash-object.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
help.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
hook.c Revert "Merge branch 'ar/run-command-hook'" 2026-01-15 13:02:38 -08:00
index-pack.c packfile: move packfile store into object source 2026-01-09 06:40:07 -08:00
init-db.c Merge branch 'ps/parse-options-integers' 2025-04-24 17:25:34 -07:00
interpret-trailers.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
last-modified.c Merge branch 'tc/memzero-array' 2025-12-23 11:33:16 +09:00
log.c log: use commit_stack 2025-12-25 08:29:27 +09:00
ls-files.c Merge branch 'ds/ls-files-lazy-unsparse' 2025-09-08 14:54:35 -07:00
ls-remote.c ref-filter: propagate peeled object ID 2025-11-04 07:32:25 -08:00
ls-tree.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
mailinfo.c mailinfo: stop using `the_repository` 2024-12-18 10:44:31 -08:00
mailsplit.c builtin: send usage() help text to standard output 2025-01-17 13:30:03 -08:00
merge-base.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
merge-file.c Merge branch 'ps/object-file-wo-the-repository' 2025-08-05 11:53:55 -07:00
merge-index.c builtin: send usage() help text to standard output 2025-01-17 13:30:03 -08:00
merge-ours.c builtin: send usage() help text to standard output 2025-01-17 13:30:03 -08:00
merge-recursive.c builtin: also setup gently for --help-all 2025-08-08 11:13:12 -07:00
merge-tree.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
merge.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
mktag.c Merge branch 'ps/object-file-wo-the-repository' 2025-08-05 11:53:55 -07:00
mktree.c odb: introduce `odb_write_object()` 2025-07-16 22:16:15 -07:00
multi-pack-index.c Merge branch 'ps/object-store-midx-dedup-info' 2025-09-12 10:41:18 -07:00
mv.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
name-rev.c name-rev: use commit_stack 2025-12-25 08:29:27 +09:00
notes.c Merge branch 'jc/strbuf-split' 2025-08-21 13:47:00 -07:00
pack-objects.c builtin/pack-objects: reduce lock contention when writing packfile data 2026-03-13 08:54:15 -07:00
pack-redundant.c Merge branch 'ps/remove-packfile-store-get-packs' 2025-10-30 08:00:19 -07:00
pack-refs.c builtin/pack-refs: factor out core logic into a shared library 2025-09-19 10:02:55 -07:00
patch-id.c patch-id: use “patch ID” throughout 2026-01-09 06:07:21 -08:00
prune-packed.c
prune.c Merge branch 'ps/object-file-wo-the-repository' 2025-08-05 11:53:55 -07:00
pull.c pull: move options[] array into function scope 2025-12-12 22:08:02 +09:00
push.c color: use git_colorbool enum type to store colorbools 2025-09-16 17:59:53 -07:00
range-diff.c Merge branch 'kh/format-patch-range-diff-notes' 2025-10-14 12:56:09 -07:00
read-tree.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
rebase.c Merge branch 'jk/setup-revisions-freefix' 2025-09-29 11:40:34 -07:00
receive-pack.c Revert "Merge branch 'ar/run-command-hook'" 2026-01-15 13:02:38 -08:00
reflog.c Merge branch 'ps/reflog-migrate-fixes' into maint-2.51 2025-10-15 10:29:28 -07:00
refs.c Merge branch 'ms/refs-optimize' 2025-10-02 12:26:12 -07:00
remote-ext.c builtin: send usage() help text to standard output 2025-01-17 13:30:03 -08:00
remote-fd.c builtin: send usage() help text to standard output 2025-01-17 13:30:03 -08:00
remote.c refs: introduce wrapper struct for `each_ref_fn` 2025-11-04 07:32:24 -08:00
repack.c builtin/repack: handle promisor packs with geometric repacking 2026-01-14 06:29:24 -08:00
replace.c refs: introduce wrapper struct for `each_ref_fn` 2025-11-04 07:32:24 -08:00
replay.c replay: die if we cannot parse object 2026-01-06 07:30:16 +09:00
repo.c Merge branch 'jt/repo-struct-more-objinfo' 2025-12-30 12:58:19 +09:00
rerere.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
reset.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
rev-list.c Merge branch 'ps/config-wo-the-repository' 2025-08-04 08:10:33 -07:00
rev-parse.c Merge branch 'ps/ref-peeled-tags' 2025-11-19 10:55:39 -08:00
revert.c Merge branch 'pw/3.0-commentchar-auto-deprecation' 2025-09-18 10:07:00 -07:00
rm.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
send-pack.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
shortlog.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
show-branch.c Merge branch 'rs/show-branch-prio-queue' 2026-01-06 16:33:52 +09:00
show-index.c builtin: use default hash when outside a repository 2025-07-01 14:58:24 -07:00
show-ref.c builtin/show-ref: convert to use `reference_get_peeled_oid()` 2025-11-04 07:32:25 -08:00
sparse-checkout.c sparse-checkout: add --verbose option to 'clean' 2025-09-15 12:10:56 -07:00
stash.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
stripspace.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
submodule--helper.c Merge branch 'jc/submodule-add' 2025-12-23 11:33:15 +09:00
symbolic-ref.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
tag.c tag: support arbitrary repositories in gpg_verify_tag() 2025-12-29 22:02:53 +09:00
unpack-file.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
unpack-objects.c object-file: refactor writing objects via a stream 2025-11-03 12:18:48 -08:00
update-index.c odb: add transaction interface 2025-09-16 11:37:06 -07:00
update-ref.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
update-server-info.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
upload-archive.c path: move `enter_repo()` into "setup.c" 2025-11-19 17:41:03 -08:00
upload-pack.c path: move `enter_repo()` into "setup.c" 2025-11-19 17:41:03 -08:00
var.c Merge branch 'jc/string-list-split' 2025-08-21 13:46:59 -07:00
verify-commit.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
verify-pack.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
verify-tag.c tag: support arbitrary repositories in gpg_verify_tag() 2025-12-29 22:02:53 +09:00
worktree.c Merge branch 'pw/worktree-list-display-width-fix' 2025-11-26 10:32:42 -08:00
write-tree.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00