git/builtin
Derrick Stolee 6d87f0e8a3 path-walk: support blobless filter
The 'git pack-objects' command can opt-in to using the path-walk API for
scanning the objects. Currently, this option is dynamically disabled if
combined with '--filter=<X>', even when using a simple filter such as
'blob:none' to signal a blobless packfile. This is a common scenario for
repos at scale, so is worth integrating.

Also, users can opt-in to the '--path-walk' option by default through
the pack.usePathWalk=true config option. When using that in a blobless
partial clone, the following warning can appear even though the user did
not specify either option directly:

  warning: cannot use --filter with --path-walk

Teach the path-walk API to handle the 'blob:none' object filter
natively. When revs->filter.choice is LOFC_BLOB_NONE, the path-walk
sets info->blobs to 0 (skipping all blob objects) and clears the
filter from revs so that prepare_revision_walk() does not reject the
configuration.

This check is implemented in the static prepare_filters() method, which
will simultaneously check if the input filters are compatible and will
make the appropriate mutations to the path_walk_info and filters if the
path_walk_info is non-NULL. This allows us to use this logic both in the
API method path_walk_filter_compatible() for use in
builtin/pack-objects.c and as a prep step in walk_objects_by_path().

Update the test helper (test-path-walk) to accept --filter=<spec>
as a test-tool option (before '--'), applying it to revs after
setup_revisions() to avoid the --objects requirement check. We can also
revert recent GIT_TEST_PACK_PATH_WALK overrides in t5620.

Also switch test-path-walk from REV_INFO_INIT with manual repo
assignment to repo_init_revisions(), which properly initializes
the filter_spec strbuf needed for filter parsing.

Add tests for blob:none with --all and with a single branch.

The performance test p5315 shows the impact of this change when using
blobless filters:

Test                                           HEAD~1     HEAD
---------------------------------------------------------------------
5315.6: repack (blob:none)                      13.53   13.87  +2.5%
5315.7: repack size (blob:none)                137.7M  137.8M  +0.1%
5315.8: repack (blob:none, --path-walk)         13.51   23.43 +73.4%
5315.9: repack size (blob:none, --path-walk)   137.7M  115.2M -16.3%

These performance tests were run on the Git repository. The --path-walk
feature shows meaningful space savings (16% smaller for blobless packs)
at the cost of increased computation time due to the two compression
passes. This data demonstrates that the feature is engaged and provides
real compression benefits when --no-reuse-delta forces fresh deltas.

Co-Authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-24 18:41:06 +09:00
..
add.c Merge branch 'ps/history-split' 2026-03-24 12:31:32 -07:00
am.c Merge branch 'vp/http-rate-limit-retries' 2026-04-01 10:28:18 -07:00
annotate.c
apply.c builtin: use default hash when outside a repository 2025-07-01 14:58:24 -07:00
archive.c
backfill.c backfill: default to grabbing edge blobs too 2026-04-15 20:32:29 -07:00
bisect.c refs: replace `refs_for_each_glob_ref_in()` 2026-02-23 13:21:19 -08:00
blame.c mailmap: stop using the_repository 2026-02-20 08:13:58 -08:00
branch.c object-name: turn INTERPRET_BRANCH_* constants into enum values 2026-03-18 12:52:29 -07:00
bugreport.c object-file: move `safe_create_leading_directories()` into "path.c" 2025-04-15 08:24:35 -07:00
bundle.c
cat-file.c odb: rename `odb_has_object()` flags 2026-03-31 20:43:14 -07:00
check-attr.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
check-ignore.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
check-mailmap.c mailmap: stop using the_repository 2026-02-20 08:13:58 -08:00
check-ref-format.c
checkout--worker.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
checkout-index.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
checkout.c Merge branch 'ps/history-split' 2026-03-24 12:31:32 -07:00
clean.c Merge branch 'jk/color-variable-fixes' 2025-09-29 11:40:35 -07:00
clone.c Merge branch 'ob/core-attributesfile-in-repository' 2026-03-05 10:04:49 -08:00
column.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
commit-graph.c commit-graph: add new config for changed-paths & recommend it in scalar 2025-10-22 10:40:11 -07:00
commit-tree.c commit: rename `free_commit_list()` to conform to coding guidelines 2026-01-15 05:32:31 -08:00
commit.c Merge branch 'ps/history-split' 2026-03-24 12:31:32 -07:00
config.c config: store allocated string in non-const pointer 2026-03-26 12:47:17 -07:00
count-objects.c packfile: introduce macro to iterate through packs 2025-10-16 14:42:39 -07:00
credential-cache--daemon.c config: drop `git_config_get_bool()` wrapper 2025-07-23 08:15:20 -07:00
credential-cache.c
credential-store.c builtin/credential-store: move is_rfc3986_unreserved to url.[ch] 2026-01-12 11:56:56 -08:00
credential.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
describe.c refs: replace `refs_for_each_rawref()` 2026-02-23 13:21:18 -08:00
diagnose.c object-file: move `safe_create_leading_directories()` into "path.c" 2025-04-15 08:24:35 -07:00
diff-files.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
diff-index.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
diff-pairs.c
diff-tree.c Merge branch 'ps/commit-list-functions-renamed' 2026-02-13 13:39:25 -08:00
diff.c diff: --no-index should ignore the worktree 2025-08-09 17:22:01 -07:00
difftool.c odb: rename `repo_read_object_file()` 2025-07-01 14:46:38 -07:00
fast-export.c fast-import: add 'abort-if-invalid' mode to '--signed-commits=<mode>' 2026-03-26 12:42:57 -07:00
fast-import.c Merge branch 'jt/fast-import-signed-modes' 2026-04-07 14:59:27 -07:00
fetch-pack.c builtin/fetch-pack: cleanup before return error 2025-06-04 08:52:25 -07:00
fetch.c odb: rename `odb_has_object()` flags 2026-03-31 20:43:14 -07:00
fmt-merge-msg.c builtin/fmt-merge-msg: stop depending on 'the_repository' 2025-08-11 09:19:40 -07:00
for-each-ref.c Merge branch 'ms/refs-list' 2025-08-22 13:13:20 -07:00
for-each-repo.c for-each-repo: simplify passing of parameters 2026-03-03 10:20:00 -08:00
fsck.c Merge branch 'ps/odb-cleanup' 2026-04-08 10:19:17 -07:00
fsmonitor--daemon.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
gc.c Merge branch 'ps/object-counting' 2026-03-25 12:58:05 -07:00
get-tar-commit-id.c
grep.c Merge branch 'ps/odb-sources' 2026-03-12 14:09:07 -07:00
hash-object.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
help.c Merge branch 'ac/help-sort-correctly' 2026-03-23 09:20:30 -07:00
history.c history: fix short help for argument of --update-refs 2026-04-06 10:17:36 -07:00
hook.c hook: reject unknown hook names in git-hook(1) 2026-03-25 14:00:48 -07:00
index-pack.c Merge branch 'ps/odb-cleanup' 2026-04-08 10:19:17 -07:00
init-db.c Merge branch 'ps/parse-options-integers' 2025-04-24 17:25:34 -07:00
interpret-trailers.c Merge branch 'kh/doc-interpret-trailers-1' 2026-03-27 11:00:02 -07:00
last-modified.c Merge branch 'tc/last-modified-not-a-tree' 2026-02-13 13:39:25 -08:00
log.c Merge branch 'mf/format-patch-cover-letter-format' 2026-04-03 13:01:08 -07:00
ls-files.c Merge branch 'ds/ls-files-lazy-unsparse' 2025-09-08 14:54:35 -07:00
ls-remote.c ref-filter: propagate peeled object ID 2025-11-04 07:32:25 -08:00
ls-tree.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
mailinfo.c
mailsplit.c
merge-base.c commit: rename `free_commit_list()` to conform to coding guidelines 2026-01-15 05:32:31 -08:00
merge-file.c Merge branch 'mr/merge-file-object-id-worktree-fix' 2026-03-27 11:00:01 -07:00
merge-index.c
merge-ours.c merge-ours: integrate with sparse-index 2026-02-06 11:45:33 -08:00
merge-recursive.c builtin: also setup gently for --help-all 2025-08-08 11:13:12 -07:00
merge-tree.c Merge branch 'ps/commit-list-functions-renamed' 2026-02-13 13:39:25 -08:00
merge.c run-command: wean auto_maintenance() functions off the_repository 2026-03-12 08:30:57 -07:00
mktag.c fsck: store repository in fsck options 2026-03-23 08:33:10 -07:00
mktree.c builtin/mktree: remove USE_THE_REPOSITORY_VARIABLE 2026-03-12 10:03:23 -07:00
multi-pack-index.c Merge branch 'ps/object-counting' 2026-03-25 12:58:05 -07:00
mv.c environment: stop using core.sparseCheckout globally 2026-02-26 07:22:51 -08:00
name-rev.c use commit_stack instead of prio_queue in LIFO mode 2026-03-18 10:39:56 -07:00
notes.c Merge branch 'jc/strbuf-split' 2025-08-21 13:47:00 -07:00
pack-objects.c path-walk: support blobless filter 2026-05-24 18:41:06 +09:00
pack-redundant.c pack-redundant: fix memory leak when open_pack_index() fails 2026-02-21 21:26:53 -08:00
pack-refs.c builtin/pack-refs: factor out core logic into a shared library 2025-09-19 10:02:55 -07:00
patch-id.c patch-id: use “patch ID” throughout 2026-01-09 06:07:21 -08:00
prune-packed.c
prune.c Merge branch 'ps/object-file-wo-the-repository' 2025-08-05 11:53:55 -07:00
pull.c run-command: wean start_command() off the_repository 2026-03-12 08:30:57 -07:00
push.c environment: move "branch.autoSetupMerge" into `struct repo_config_values` 2026-02-26 07:22:53 -08:00
range-diff.c Merge branch 'kh/format-patch-range-diff-notes' 2025-10-14 12:56:09 -07:00
read-tree.c cocci: convert parse_tree functions to repo_ variants 2026-01-09 18:36:18 -08:00
rebase.c use strvec_pushv() to add another strvec 2026-03-24 12:26:58 -07:00
receive-pack.c Merge branch 'jk/c23-const-preserving-fixes-more' 2026-04-09 11:21:59 -07:00
reflog.c Merge branch 'ps/reflog-migrate-fixes' into maint-2.51 2025-10-15 10:29:28 -07:00
refs.c fsck: store repository in fsck options 2026-03-23 08:33:10 -07:00
remote-ext.c
remote-fd.c
remote.c odb: rename `odb_has_object()` flags 2026-03-31 20:43:14 -07:00
repack.c repack: mark non-MIDX packs above the split as excluded-open 2026-03-27 13:40:40 -07:00
replace.c refs: introduce wrapper struct for `each_ref_fn` 2025-11-04 07:32:24 -08:00
replay.c replay: allow to specify a ref with option --ref 2026-04-01 21:34:25 -07:00
repo.c repo: show subcommand-specific help text 2026-03-25 10:35:27 -07:00
rerere.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
reset.c add-patch: allow disabling editing of hunks 2026-03-03 15:09:36 -08:00
rev-list.c rev-list: use reduce_heads() for --maximal-only 2026-04-06 12:02:30 -07:00
rev-parse.c rev-parse: avoid writing to const string for parent marks 2026-03-26 12:47:17 -07:00
revert.c Merge branch 'pw/3.0-commentchar-auto-deprecation' 2025-09-18 10:07:00 -07:00
rm.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
send-pack.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
shortlog.c mailmap: stop using the_repository 2026-02-20 08:13:58 -08:00
show-branch.c commit: rename `free_commit_list()` to conform to coding guidelines 2026-01-15 05:32:31 -08:00
show-index.c show-index: use gettext wrapping in user facing error messages 2026-01-30 08:58:12 -08:00
show-ref.c odb: rename `odb_has_object()` flags 2026-03-31 20:43:14 -07:00
sparse-checkout.c Merge branch 'ob/core-attributesfile-in-repository' 2026-03-05 10:04:49 -08:00
stash.c docs: fix "git stash [push]" documentation 2026-03-30 08:19:40 -07:00
stripspace.c config: drop `git_config()` wrapper 2025-07-23 08:15:18 -07:00
submodule--helper.c Merge branch 'ps/object-counting' 2026-03-25 12:58:05 -07:00
symbolic-ref.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
tag.c Merge branch 'jt/fast-import-sign-again' 2026-03-24 12:31:31 -07:00
unpack-file.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
unpack-objects.c Merge branch 'ps/odb-cleanup' 2026-04-08 10:19:17 -07:00
update-index.c odb: add transaction interface 2025-09-16 11:37:06 -07:00
update-ref.c update-ref: utilize rejected error details if available 2026-01-25 22:27:33 -08:00
update-server-info.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
upload-archive.c path: move `enter_repo()` into "setup.c" 2025-11-19 17:41:03 -08:00
upload-pack.c path: move `enter_repo()` into "setup.c" 2025-11-19 17:41:03 -08:00
var.c Merge branch 'jc/string-list-split' 2025-08-21 13:46:59 -07:00
verify-commit.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
verify-pack.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00
verify-tag.c tag: support arbitrary repositories in gpg_verify_tag() 2025-12-29 22:02:53 +09:00
worktree.c Merge branch 'pw/worktree-reduce-the-repository' 2026-04-03 13:01:09 -07:00
write-tree.c config: move Git config parsing into "environment.c" 2025-07-23 08:15:22 -07:00