kernel/git - git - PowerEL Git System

Commit Graph

Author	SHA1	Message	Date
Patrick Steinhardt	f234df07f6	reftable/stack: handle locked tables during auto-compaction When compacting tables, it may happen that we want to compact a set of tables which are already locked by a concurrent process that compacts them. In the case where we wanted to perform a full compaction of all tables it is sensible to bail out in this case, as we cannot fulfill the requested action. But when performing auto-compaction it isn't necessarily in our best interest of us to abort the whole operation. For example, due to the geometric compacting schema that we use, it may be that process A takes a lot of time to compact the bulk of all tables whereas process B appends a bunch of new tables to the stack. B would in this case also notice that it has to compact the tables that process A is compacting already and thus also try to compact the same range, probably including the new tables it has appended. But because those tables are locked already, it will fail and thus abort the complete auto-compaction. The consequence is that the stack will grow longer and longer while A isn't yet done with compaction, which will lead to a growing performance impact. Instead of aborting auto-compaction altogether, let's gracefully handle this situation by instead compacting tables which aren't locked. To do so, instead of locking from the beginning of the slice-to-be-compacted, we start locking tables from the end of the slice. Once we hit the first table that is locked already, we abort. If we succeeded to lock two or more tables, then we simply reduce the slice of tables that we're about to compact to those which we managed to lock. This ensures that we can at least make some progress for compaction in said scenario. It also helps in other scenarios, like for example when a process died and left a stale lockfile behind. In such a case we can at least ensure some compaction on a best-effort basis. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 10:14:43 -07:00
Patrick Steinhardt	fba95dad6a	t: mark a bunch of tests as leak-free There are a bunch of tests which do not have any leaks: - t0411: Introduced via `5c5a4a1c05` (t0411: add tests for cloning from partial repo, 2024-01-28), passes since its inception. - t0610: Introduced via `57db2a094d` (refs: introduce reftable backend, 2024-02-07), passes since its inception. - t2405: Passes since `6741e917de` (repository: avoid leaking `fsmonitor` data, 2024-04-12). - t7423: Introduced via `b20c10fd9b` (t7423: add tests for symlinked submodule directories, 2024-01-28), passes since `e8d0608944` (submodule: require the submodule path to contain directories only, 2024-03-26). The fix is not obviously related, but probably works because we now die early in many code paths. - t9xxx: All of these are exercising CVS-related tooling and pass since at least Git v2.40. It's likely that these pass for a long time already, but nobody ever noticed because Git developers do not tend to have CVS on their machines. Mark all of these tests as passing. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-27 11:19:57 -07:00
Junio C Hamano	d25ad94df6	Merge branch 'ps/ci-test-with-jgit' Tests to ensure interoperability between reftable written by jgit and our code have been added and enabled in CI. * ps/ci-test-with-jgit: t0612: add tests to exercise Git/JGit reftable compatibility t0610: fix non-portable variable assignment t06xx: always execute backend-specific tests ci: install JGit dependency ci: make Perforce binaries executable for all users ci: merge scripts which install dependencies ci: fix setup of custom path for GitLab CI ci: merge custom PATH directories ci: convert "install-dependencies.sh" to use "/bin/sh" ci: drop duplicate package installation for "linux-gcc-default" ci: skip sudo when we are already root ci: expose distro name in dockerized GitHub jobs ci: rename "runs_on_pool" to "distro"	2024-05-08 10:18:44 -07:00
Junio C Hamano	5aec7231c8	Merge branch 'ps/reftable-write-optim' Code to write out reftable has seen some optimization and simplification. * ps/reftable-write-optim: reftable/block: reuse compressed array reftable/block: reuse zstream when writing log blocks reftable/writer: reset `last_key` instead of releasing it reftable/writer: unify releasing memory reftable/writer: refactorings for `writer_flush_nonempty_block()` reftable/writer: refactorings for `writer_add_record()` refs/reftable: don't recompute committer ident reftable: remove name checks refs/reftable: skip duplicate name checks refs/reftable: perform explicit D/F check when writing symrefs refs/reftable: fix D/F conflict error message on ref copy	2024-05-08 10:18:43 -07:00
Junio C Hamano	82a31ec324	Merge branch 'jt/reftable-geometric-compaction' The strategy to compact multiple tables of reftables after many operations accumulate many entries has been improved to avoid accumulating too many tables uncollected. * jt/reftable-geometric-compaction: reftable/stack: use geometric table compaction reftable/stack: add env to disable autocompaction reftable/stack: expose option to disable auto-compaction	2024-04-16 14:50:30 -07:00
Junio C Hamano	92e8388bd3	Merge branch 'jc/local-extern-shell-rules' Document and apply workaround for a buggy version of dash that mishandles "local var=val" construct. * jc/local-extern-shell-rules: t1016: local VAR="VAL" fix t0610: local VAR="VAL" fix t: teach lint that RHS of 'local VAR=VAL' needs to be quoted t: local VAR="VAL" (quote ${magic-reference}) t: local VAR="VAL" (quote command substitution) t: local VAR="VAL" (quote positional parameters) CodingGuidelines: quote assigned value in 'local var=$val' CodingGuidelines: describe "export VAR=VAL" rule	2024-04-16 14:50:27 -07:00
Junio C Hamano	372aabe912	Merge branch 'ps/t0610-umask-fix' The "shared repository" test in the t0610 reftable test failed under restrictive umask setting (e.g. 007), which has been corrected. * ps/t0610-umask-fix: t0610: execute git-pack-refs(1) with specified umask t0610: make `--shared=` tests reusable	2024-04-15 14:11:43 -07:00
Patrick Steinhardt	db1d63bf57	t0610: fix non-portable variable assignment Older versions of the Dash shell fail to parse `local var=val` assignments in some cases when `val` is unquoted. Such failures can be observed e.g. with Ubuntu 20.04 and older, which has a Dash version that still has this bug. Such an assignment has been introduced in t0610. The issue wasn't detected for a while because this test used to only run when the GIT_TEST_DEFAULT_REF_FORMAT environment variable was set to "reftable". We have dropped that requirement now though, meaning that it runs unconditionally, including on jobs which use such older versions of Ubuntu. We have worked around such issues in the past, e.g. in `ebee5580ca` (parallel-checkout: avoid dash local bug in tests, 2021-06-06), by quoting the `val` side. Apply the same fix to t0610. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-12 08:47:51 -07:00
Patrick Steinhardt	ca13c3e94a	t06xx: always execute backend-specific tests The tests in t06xx exercise specific ref formats. Next to probing some basic functionality, these tests also exercise other low-level details specific to the format. Those tests are only executed though in case `GIT_TEST_DEFAULT_REF_FORMAT` is set to the ref format of the respective backend-under-test. Ideally, we would run the full test matrix for ref formats such that our complete test suite is executed with every supported format on every supported platform. This is quite an expensive undertaking though, and thus we only execute e.g. the "reftable" tests on macOS and Linux. As a result, we basically have no test coverage for the "reftable" format at all on other platforms like Windows. Adapt these tests so that they override `GIT_TEST_DEFAULT_REF_FORMAT`, which means that they'll always execute. This increases test coverage on platforms that don't run the full test matrix, which at least gives us some basic test coverage on those platforms for the "reftable" format. This of course comes at the cost of running those tests multiple times on platforms where we do run the full test matrix. But arguably, this is a good thing because it will also cause us to e.g. run those tests with the address sanitizer and other non-standard parameters. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-12 08:47:51 -07:00
Patrick Steinhardt	69d87802da	t0610: execute git-pack-refs(1) with specified umask The tests for git-pack-refs(1) with the `core.sharedRepository` config execute git-pack-refs(1) outside of the shell that has the expected umask set. This is wrong because we want to test the behaviour of that command with different umasks. The issue went unnoticed because most distributions have a default umask of 0022, and we only ever test with `--shared=true`, which re-adds the group write bit. Fix the issue by moving git-pack-refs(1) into the umask'd shell and add a bunch of test cases that exercise behaviour more thoroughly. Note that we drop the check for whether `core.sharedRepository` was set to the correct value to make the test setup a bit easier. We should be able to rely on git-init(1) doing its thing correctly. Furthermore, to help readability, we convert tests that pass `--shared=true` to instead pass the equivalent `--shared=group`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-09 14:14:00 -07:00
Patrick Steinhardt	2f960dd5fe	t0610: make `--shared=` tests reusable We have two kinds of `--shared=` tests, one for git-init(1) and one for git-pack-refs(1). Merge them into a reusable function such that we can easily add additional testcases with different umasks and flags for the `--shared=` switch. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-09 14:14:00 -07:00
Patrick Steinhardt	455d61b6d2	refs/reftable: perform explicit D/F check when writing symrefs We already perform explicit D/F checks in all reftable callbacks which write refs, except when writing symrefs. For one this leads to an error message which isn't perfectly actionable because we only tell the user that there was a D/F conflict, but not which refs conflicted with each other. But second, once all ref updating callbacks explicitly check for D/F conflicts, we can disable the D/F checks in the reftable library itself and thus avoid some duplicated efforts. Refactor the code that writes symref tables to explicitly call into `refs_verify_refname_available()` when writing symrefs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-08 16:59:01 -07:00
Patrick Steinhardt	f57cc987a9	refs/reftable: fix D/F conflict error message on ref copy The `write_copy_table()` function is shared between the reftable implementations for renaming and copying refs. The only difference between those two cases is that the rename will also delete the old reference, whereas copying won't. This has resulted in a bug though where we don't properly verify refname availability. When calling `refs_verify_refname_available()`, we always add the old ref name to the list of refs to be skipped when computing availability, which indicates that the name would be available even if it already exists at the current point in time. This is only the right thing to do for renames though, not for copies. The consequence of this bug is quite harmless because the reftable backend has its own checks for D/F conflicts further down in the call stack, and thus we refuse the update regardless of the bug. But all the user gets in this case is an uninformative message that copying the ref has failed, without any further details. Fix the bug and only add the old name to the skip-list in case we rename the ref. Consequently, this error case will now be handled by `refs_verify_refname_available()`, which knows to provide a proper error message. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-08 16:59:01 -07:00
Justin Tobler	a949ebd342	reftable/stack: use geometric table compaction To reduce the number of on-disk reftables, compaction is performed. Contiguous tables with the same binary log value of size are grouped into segments. The segment that has both the lowest binary log value and contains more than one table is set as the starting point when identifying the compaction segment. Since segments containing a single table are not initially considered for compaction, if the table appended to the list does not match the previous table log value, no compaction occurs for the new table. It is therefore possible for unbounded growth of the table list. This can be demonstrated by repeating the following sequence: git branch -f foo git branch -d foo Each operation results in a new table being written with no compaction occurring until a separate operation produces a table matching the previous table log value. Instead, to avoid unbounded growth of the table list, the compaction strategy is updated to ensure tables follow a geometric sequence after each operation by individually evaluating each table in reverse index order. This strategy results in a much simpler and more robust algorithm compared to the previous one while also maintaining a minimal ordered set of tables on-disk. When creating 10 thousand references, the new strategy has no performance impact: Benchmark 1: update-ref: create refs sequentially (revision = HEAD~) Time (mean ± σ): 26.516 s ± 0.047 s [User: 17.864 s, System: 8.491 s] Range (min … max): 26.447 s … 26.569 s 10 runs Benchmark 2: update-ref: create refs sequentially (revision = HEAD) Time (mean ± σ): 26.417 s ± 0.028 s [User: 17.738 s, System: 8.500 s] Range (min … max): 26.366 s … 26.444 s 10 runs Summary update-ref: create refs sequentially (revision = HEAD) ran 1.00 ± 0.00 times faster than update-ref: create refs sequentially (revision = HEAD~) Some tests in `t0610-reftable-basics.sh` assert the on-disk state of tables and are therefore updated to specify the correct new table count. Since compaction is more aggressive in ensuring tables maintain a geometric sequence, the expected table count is reduced in these tests. In `reftable/stack_test.c` tests related to `sizes_to_segments()` are removed because the function is no longer needed. Also, the `test_suggest_compaction_segment()` test is updated to better showcase and reflect the new geometric compaction behavior. Signed-off-by: Justin Tobler <jltobler@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-08 12:11:10 -07:00
Justin Tobler	7c8eb5928f	reftable/stack: add env to disable autocompaction In future tests it will be neccesary to create repositories with a set number of tables. To make this easier, introduce the `GIT_TEST_REFTABLE_AUTOCOMPACTION` environment variable that, when set to false, disables autocompaction of reftables. Signed-off-by: Justin Tobler <jltobler@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-08 12:11:10 -07:00
Junio C Hamano	26ba7477d9	t0610: local VAR="VAL" fix The series was based on maint and fixes all the tests that exist there, but we have acquired a few more. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-05 22:50:11 -07:00
Patrick Steinhardt	9f6714ab3e	builtin/gc: pack refs when using `git maintenance run --auto` When running `git maintenance run --auto`, then the various subtasks will only run as needed. Thus, we for example end up only packing loose objects if we hit a certain threshold. Interestingly enough, the "pack-refs" task is actually _never_ executed when the auto-flag is set because it does not have a condition at all. As `41abfe15d9` (maintenance: add pack-refs task, 2021-02-09) mentions: The 'auto_condition' function pointer is left NULL for now. We could extend this in the future to have a condition check if pack-refs should be run during 'git maintenance run --auto'. It is not quite clear from that quote whether it is actually intended that the task doesn't run at all in this mode. Also, no test was added to verify this behaviour. Ultimately though, it feels quite surprising that `git maintenance run --auto --task=pack-refs` would quietly never do anything at all. In any case, now that we do have the logic in place to let ref backends decide whether or not to repack refs, it does make sense to wire it up accordingly. With the "reftable" backend we will thus now perform auto-compaction, which optimizes the refdb as needed. But for the "files" backend we now unconditionally pack refs as it does not yet know to handle the "auto" flag. Arguably, this can be seen as a bug fix given that previously the task never did anything at all. Eventually though we should amend the "files" backend to use some heuristics for auto compaction, as well. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-25 09:54:07 -07:00
Patrick Steinhardt	bfc2f9eb8e	builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs Forward the `--auto` flag to git-pack-refs(1) when it has been invoked with this flag itself. This does not change anything for the "files" backend, which will continue to eagerly pack refs. But it does ensure that the "reftable" backend only compacts refs as required. This change does not impact git-maintenance(1) because this command will in fact never run the pack-refs task when run with `--auto`. This issue will be addressed in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-25 09:54:07 -07:00
Patrick Steinhardt	6dcffc68f4	builtin/pack-refs: introduce new "--auto" flag Calling git-pack-refs(1) will unconditionally cause it to pack all requested refs regardless of the current state of the ref database. For example: - With the "files" backend we will end up rewriting the complete "packed-refs" file even if only a single ref would require compaction. - With the "reftable" backend we will end up always compacting all tables into a single table. This behaviour can be completely unnecessary depending on the backend and is thus wasteful. With the introduction of the `PACK_REFS_AUTO` flag in the preceding commit we can improve this and let the backends decide for themselves whether to pack refs in the first place. Expose this functionality via a new "--auto" flag in git-pack-refs(1), which mirrors the same flag in both git-gc(1) and git-maintenance(1). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-25 09:54:07 -07:00
Patrick Steinhardt	4ccf7060d8	refs/reftable: print errors on compaction failure When git-pack-refs(1) fails in the reftable backend we end up printing no error message at all, leaving the caller puzzled as to why compaction has failed. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-25 09:54:07 -07:00
Patrick Steinhardt	a2f711ade0	reftable/stack: gracefully handle failed auto-compaction due to locks Whenever we commit a new table to the reftable stack we will end up invoking auto-compaction of the stack to keep the total number of tables at bay. This auto-compaction may fail though in case at least one of the tables which we are about to compact is locked. This is indicated by the compaction function returning `REFTABLE_LOCK_ERROR`. We do not handle this case though, and thus bubble that return value up the calling chain, which will ultimately cause a failure. Fix this bug by ignoring `REFTABLE_LOCK_ERROR`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-25 09:54:07 -07:00
Patrick Steinhardt	e0795e2c79	t0610: remove unused variable assignment In `b0f6b6b523` (refs/reftable: don't fail empty transactions in repo without HEAD, 2024-02-27), we have added a new test to t0610. This test contains a useless assignment to a variable that is never actually used. Remove it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-06 08:40:40 -08:00
Patrick Steinhardt	b0f6b6b523	refs/reftable: don't fail empty transactions in repo without HEAD Under normal circumstances, it shouldn't ever happen that a repository has no HEAD reference. In fact, git-update-ref(1) would fail any request to delete the HEAD reference, and a newly initialized repository always pre-creates it, too. We have however changed git-clone(1) to partially initialize the refdb just up to the point where remote helpers can find the repository. With that change, we are going to run into a situation where repositories have no refs at all. Now there is a very particular edge case in this situation: when preparing an empty ref transacton, we end up returning whatever value `read_ref_without_reload()` returned to the caller. Under normal conditions this would be fine: "HEAD" should usually exist, and thus the function would return `0`. But if "HEAD" doesn't exist, the function returns a positive value which we end up returning to the caller. Fix this bug by resetting the return code to `0` and add a test. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-02-27 13:53:39 -08:00
Patrick Steinhardt	57db2a094d	refs: introduce reftable backend Due to scalability issues, Shawn Pearce has originally proposed a new "reftable" format more than six years ago [1]. Initially, this new format was implemented in JGit with promising results. Around two years ago, we have then added the "reftable" library to the Git codebase via `a4bbd13be3` (Merge branch 'hn/reftable', 2021-12-15). With this we have landed all the low-level code to read and write reftables. Notably missing though was the integration of this low-level code into the Git code base in the form of a new ref backend that ties all of this together. This gap is now finally closed by introducing a new "reftable" backend into the Git codebase. This new backend promises to bring some notable improvements to Git repositories: - It becomes possible to do truly atomic writes where either all refs are committed to disk or none are. This was not possible with the "files" backend because ref updates were split across multiple loose files. - The disk space required to store many refs is reduced, both compared to loose refs and packed-refs. This is enabled both by the reftable format being a binary format, which is more compact, and by prefix compression. - We can ignore filesystem-specific behaviour as ref names are not encoded via paths anymore. This means there is no need to handle case sensitivity on Windows systems or Unicode precomposition on macOS. - There is no need to rewrite the complete refdb anymore every time a ref is being deleted like it was the case for packed-refs. This means that ref deletions are now constant time instead of scaling linearly with the number of refs. - We can ignore file/directory conflicts so that it becomes possible to store both "refs/heads/foo" and "refs/heads/foo/bar". - Due to this property we can retain reflogs for deleted refs. We have previously been deleting reflogs together with their refs to avoid file/directory conflicts, which is not necessary anymore. - We can properly enumerate all refs. With the "files" backend it is not easily possible to distinguish between refs and non-refs because they may live side by side in the gitdir. Not all of these improvements are realized with the current "reftable" backend implementation. At this point, the new backend is supposed to be a drop-in replacement for the "files" backend that is used by basically all Git repositories nowadays. It strives for 1:1 compatibility, which means that a user can expect the same behaviour regardless of whether they use the "reftable" backend or the "files" backend for most of the part. Most notably, this means we artificially limit the capabilities of the "reftable" backend to match the limits of the "files" backend. It is not possible to create refs that would end up with file/directory conflicts, we do not retain reflogs, we perform stricter-than-necessary checks. This is done intentionally due to two main reasons: - It makes it significantly easier to land the "reftable" backend as tests behave the same. It would be tough to argue for each and every single test that doesn't pass with the "reftable" backend. - It ensures compatibility between repositories that use the "files" backend and repositories that use the "reftable" backend. Like this, hosters can migrate their repositories to use the "reftable" backend without causing issues for clients that use the "files" backend in their clones. It is expected that these artificial limitations may eventually go away in the long term. Performance-wise things very much depend on the actual workload. The following benchmarks compare the "files" and "reftable" backends in the current version: - Creating N refs in separate transactions shows that the "files" backend is ~50% faster. This is not surprising given that creating a ref only requires us to create a single loose ref. The "reftable" backend will also perform auto compaction on updates. In real-world workloads we would likely also want to perform pack loose refs, which would likely change the picture. Benchmark 1: update-ref: create refs sequentially (refformat = files, refcount = 1) Time (mean ± σ): 2.1 ms ± 0.3 ms [User: 0.6 ms, System: 1.7 ms] Range (min … max): 1.8 ms … 4.3 ms 133 runs Benchmark 2: update-ref: create refs sequentially (refformat = reftable, refcount = 1) Time (mean ± σ): 2.7 ms ± 0.1 ms [User: 0.6 ms, System: 2.2 ms] Range (min … max): 2.4 ms … 2.9 ms 132 runs Benchmark 3: update-ref: create refs sequentially (refformat = files, refcount = 1000) Time (mean ± σ): 1.975 s ± 0.006 s [User: 0.437 s, System: 1.535 s] Range (min … max): 1.969 s … 1.980 s 3 runs Benchmark 4: update-ref: create refs sequentially (refformat = reftable, refcount = 1000) Time (mean ± σ): 2.611 s ± 0.013 s [User: 0.782 s, System: 1.825 s] Range (min … max): 2.597 s … 2.622 s 3 runs Benchmark 5: update-ref: create refs sequentially (refformat = files, refcount = 100000) Time (mean ± σ): 198.442 s ± 0.241 s [User: 43.051 s, System: 155.250 s] Range (min … max): 198.189 s … 198.670 s 3 runs Benchmark 6: update-ref: create refs sequentially (refformat = reftable, refcount = 100000) Time (mean ± σ): 294.509 s ± 4.269 s [User: 104.046 s, System: 190.326 s] Range (min … max): 290.223 s … 298.761 s 3 runs - Creating N refs in a single transaction shows that the "files" backend is significantly slower once we start to write many refs. The "reftable" backend only needs to update two files, whereas the "files" backend needs to write one file per ref. Benchmark 1: update-ref: create many refs (refformat = files, refcount = 1) Time (mean ± σ): 1.9 ms ± 0.1 ms [User: 0.4 ms, System: 1.4 ms] Range (min … max): 1.8 ms … 2.6 ms 151 runs Benchmark 2: update-ref: create many refs (refformat = reftable, refcount = 1) Time (mean ± σ): 2.5 ms ± 0.1 ms [User: 0.7 ms, System: 1.7 ms] Range (min … max): 2.4 ms … 3.4 ms 148 runs Benchmark 3: update-ref: create many refs (refformat = files, refcount = 1000) Time (mean ± σ): 152.5 ms ± 5.2 ms [User: 19.1 ms, System: 133.1 ms] Range (min … max): 148.5 ms … 167.8 ms 15 runs Benchmark 4: update-ref: create many refs (refformat = reftable, refcount = 1000) Time (mean ± σ): 58.0 ms ± 2.5 ms [User: 28.4 ms, System: 29.4 ms] Range (min … max): 56.3 ms … 72.9 ms 40 runs Benchmark 5: update-ref: create many refs (refformat = files, refcount = 1000000) Time (mean ± σ): 152.752 s ± 0.710 s [User: 20.315 s, System: 131.310 s] Range (min … max): 152.165 s … 153.542 s 3 runs Benchmark 6: update-ref: create many refs (refformat = reftable, refcount = 1000000) Time (mean ± σ): 51.912 s ± 0.127 s [User: 26.483 s, System: 25.424 s] Range (min … max): 51.769 s … 52.012 s 3 runs - Deleting a ref in a fully-packed repository shows that the "files" backend scales with the number of refs. The "reftable" backend has constant-time deletions. Benchmark 1: update-ref: delete ref (refformat = files, refcount = 1) Time (mean ± σ): 1.7 ms ± 0.1 ms [User: 0.4 ms, System: 1.2 ms] Range (min … max): 1.6 ms … 2.1 ms 316 runs Benchmark 2: update-ref: delete ref (refformat = reftable, refcount = 1) Time (mean ± σ): 1.8 ms ± 0.1 ms [User: 0.4 ms, System: 1.3 ms] Range (min … max): 1.7 ms … 2.1 ms 294 runs Benchmark 3: update-ref: delete ref (refformat = files, refcount = 1000) Time (mean ± σ): 2.0 ms ± 0.1 ms [User: 0.5 ms, System: 1.4 ms] Range (min … max): 1.9 ms … 2.5 ms 287 runs Benchmark 4: update-ref: delete ref (refformat = reftable, refcount = 1000) Time (mean ± σ): 1.9 ms ± 0.1 ms [User: 0.5 ms, System: 1.3 ms] Range (min … max): 1.8 ms … 2.1 ms 217 runs Benchmark 5: update-ref: delete ref (refformat = files, refcount = 1000000) Time (mean ± σ): 229.8 ms ± 7.9 ms [User: 182.6 ms, System: 46.8 ms] Range (min … max): 224.6 ms … 245.2 ms 6 runs Benchmark 6: update-ref: delete ref (refformat = reftable, refcount = 1000000) Time (mean ± σ): 2.0 ms ± 0.0 ms [User: 0.6 ms, System: 1.3 ms] Range (min … max): 2.0 ms … 2.1 ms 3 runs - Listing all refs shows no significant advantage for either of the backends. The "files" backend is a bit faster, but not by a significant margin. When repositories are not packed the "reftable" backend outperforms the "files" backend because the "reftable" backend performs auto-compaction. Benchmark 1: show-ref: print all refs (refformat = files, refcount = 1, packed = true) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 2.0 ms 1729 runs Benchmark 2: show-ref: print all refs (refformat = reftable, refcount = 1, packed = true) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 1.8 ms 1816 runs Benchmark 3: show-ref: print all refs (refformat = files, refcount = 1000, packed = true) Time (mean ± σ): 4.3 ms ± 0.1 ms [User: 0.9 ms, System: 3.3 ms] Range (min … max): 4.1 ms … 4.6 ms 645 runs Benchmark 4: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = true) Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 1.0 ms, System: 3.3 ms] Range (min … max): 4.2 ms … 5.9 ms 643 runs Benchmark 5: show-ref: print all refs (refformat = files, refcount = 1000000, packed = true) Time (mean ± σ): 2.537 s ± 0.034 s [User: 0.488 s, System: 2.048 s] Range (min … max): 2.511 s … 2.627 s 10 runs Benchmark 6: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = true) Time (mean ± σ): 2.712 s ± 0.017 s [User: 0.653 s, System: 2.059 s] Range (min … max): 2.692 s … 2.752 s 10 runs Benchmark 7: show-ref: print all refs (refformat = files, refcount = 1, packed = false) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 1.9 ms 1834 runs Benchmark 8: show-ref: print all refs (refformat = reftable, refcount = 1, packed = false) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.4 ms … 2.0 ms 1840 runs Benchmark 9: show-ref: print all refs (refformat = files, refcount = 1000, packed = false) Time (mean ± σ): 13.8 ms ± 0.2 ms [User: 2.8 ms, System: 10.8 ms] Range (min … max): 13.3 ms … 14.5 ms 208 runs Benchmark 10: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = false) Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 1.2 ms, System: 3.3 ms] Range (min … max): 4.3 ms … 6.2 ms 624 runs Benchmark 11: show-ref: print all refs (refformat = files, refcount = 1000000, packed = false) Time (mean ± σ): 12.127 s ± 0.129 s [User: 2.675 s, System: 9.451 s] Range (min … max): 11.965 s … 12.370 s 10 runs Benchmark 12: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = false) Time (mean ± σ): 2.799 s ± 0.022 s [User: 0.735 s, System: 2.063 s] Range (min … max): 2.769 s … 2.836 s 10 runs - Printing a single ref shows no real difference between the "files" and "reftable" backends. Benchmark 1: show-ref: print single ref (refformat = files, refcount = 1) Time (mean ± σ): 1.5 ms ± 0.1 ms [User: 0.4 ms, System: 1.0 ms] Range (min … max): 1.4 ms … 1.8 ms 1779 runs Benchmark 2: show-ref: print single ref (refformat = reftable, refcount = 1) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.4 ms … 2.5 ms 1753 runs Benchmark 3: show-ref: print single ref (refformat = files, refcount = 1000) Time (mean ± σ): 1.5 ms ± 0.1 ms [User: 0.3 ms, System: 1.1 ms] Range (min … max): 1.4 ms … 1.9 ms 1840 runs Benchmark 4: show-ref: print single ref (refformat = reftable, refcount = 1000) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 2.0 ms 1831 runs Benchmark 5: show-ref: print single ref (refformat = files, refcount = 1000000) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 2.1 ms 1848 runs Benchmark 6: show-ref: print single ref (refformat = reftable, refcount = 1000000) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 2.1 ms 1762 runs So overall, performance depends on the usecases. Except for many sequential writes the "reftable" backend is roughly on par or significantly faster than the "files" backend though. Given that the "files" backend has received 18 years of optimizations by now this can be seen as a win. Furthermore, we can expect that the "reftable" backend will grow faster over time when attention turns more towards optimizations. The complete test suite passes, except for those tests explicitly marked to require the REFFILES prerequisite. Some tests in t0610 are marked as failing because they depend on still-in-flight bug fixes. Tests can be run with the new backend by setting the GIT_TEST_DEFAULT_REF_FORMAT environment variable to "reftable". There is a single known conceptual incompatibility with the dumb HTTP transport. As "info/refs" SHOULD NOT contain the HEAD reference, and because the "HEAD" file is not valid anymore, it is impossible for the remote client to figure out the default branch without changing the protocol. This shortcoming needs to be handled in a subsequent patch series. As the reftable library has already been introduced a while ago, this commit message will not go into the details of how exactly the on-disk format works. Please refer to our preexisting technical documentation at Documentation/technical/reftable for this. [1]: https://public-inbox.org/git/CAJo=hJtyof=HRy=2sLP0ng0uZ4=S-DpZ5dR1aF+VHVETKG20OQ@mail.gmail.com/ Original-idea-by: Shawn Pearce <spearce@spearce.org> Based-on-patch-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-02-07 08:28:37 -08:00

24 Commits (e44f15ba3ee873b5df5e8e5d8cc018df288472ef)