kernel/git - git - PowerEL Git System

Commit Graph

Author	SHA1	Message	Date
Patrick Steinhardt	71c871b48d	refs: implement releasing ref storages Ref storages are typically only initialized once for `the_repository` and then never released. Until now we got away with that without causing memory leaks because `the_repository` stays reachable, and because the ref backend is reachable via `the_repository` its memory basically never leaks. This is about to change though because of the upcoming migration logic, which will create a secondary ref storage. In that case, we will either have to release the old or new ref storage to avoid leaks. Implement a new `release` callback and expose it via a new `ref_storage_release()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-17 10:33:37 -07:00
Patrick Steinhardt	ed93ea1602	refs: rename `init_db` callback to avoid confusion Reference backends have two callbacks `init` and `init_db`. The similarity of these two callbacks has repeatedly confused me whenever I was looking at them, where I always had to look up which of them does what. Rename the `init_db` callback to `create_on_disk`, which should hopefully be clearer. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-17 10:33:36 -07:00
Junio C Hamano	d037212d97	Merge branch 'kn/for-all-refs' "git for-each-ref" learned "--include-root-refs" option to show even the stuff outside the 'refs/' hierarchy. * kn/for-all-refs: for-each-ref: add new option to include root refs ref-filter: rename 'FILTER_REFS_ALL' to 'FILTER_REFS_REGULAR' refs: introduce `refs_for_each_include_root_refs()` refs: extract out `loose_fill_ref_dir_regular_file()` refs: introduce `is_pseudoref()` and `is_headref()`	2024-03-05 09:44:44 -08:00
Karthik Nayak	d0f00c1ac1	refs: introduce `refs_for_each_include_root_refs()` Introduce a new ref iteration flag `DO_FOR_EACH_INCLUDE_ROOT_REFS`, which will be used to iterate over regular refs plus pseudorefs and HEAD. Refs which fall outside the `refs/` and aren't either pseudorefs or HEAD are more of a grey area. This is because we don't block the users from creating such refs but they are not officially supported. Introduce `refs_for_each_include_root_refs()` which calls `do_for_each_ref()` with this newly introduced flag. In `refs/files-backend.c`, introduce a new function `add_pseudoref_and_head_entries()` to add pseudorefs and HEAD to the `ref_dir`. We then finally call `add_pseudoref_and_head_entries()` whenever the `DO_FOR_EACH_INCLUDE_ROOT_REFS` flag is set. Any new ref backend will also have to implement similar changes on its end. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-02-23 10:36:27 -08:00
Patrick Steinhardt	5e01d83841	refs: always treat iterators as ordered In the preceding commit we have converted the reflog iterator of the "files" backend to be ordered, which was the only remaining ref iterator that wasn't ordered. Refactor the ref iterator infrastructure so that we always assume iterators to be ordered, thus simplifying the code. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-02-21 09:58:06 -08:00
Patrick Steinhardt	6f22780017	refs/files: sort merged worktree and common reflogs When iterating through reflogs in a worktree we create a merged iterator that merges reflogs from both refdbs. The resulting refs are ordered so that instead we first return all worktree reflogs before we return all common refs. This is the only remaining case where a ref iterator returns entries in a non-lexicographic order. The result would look something like the following (listed with a command we introduce in a subsequent commit): ``` $ git reflog list HEAD refs/worktree/per-worktree refs/heads/main refs/heads/wt ``` So we first print the per-worktree reflogs in lexicographic order, then the common reflogs in lexicographic order. This is confusing and not consistent with how we print per-worktree refs, which are exclusively sorted lexicographically. Sort reflogs lexicographically in the same way as we sort normal refs. As this is already implemented properly by the "reftable" backend via a separate selection function, we simply pull out that logic and reuse it for the "files" backend. As logs are properly sorted now, mark the merged reflog iterator as sorted. Tests will be added in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-02-21 09:58:06 -08:00
Patrick Steinhardt	57db2a094d	refs: introduce reftable backend Due to scalability issues, Shawn Pearce has originally proposed a new "reftable" format more than six years ago [1]. Initially, this new format was implemented in JGit with promising results. Around two years ago, we have then added the "reftable" library to the Git codebase via `a4bbd13be3` (Merge branch 'hn/reftable', 2021-12-15). With this we have landed all the low-level code to read and write reftables. Notably missing though was the integration of this low-level code into the Git code base in the form of a new ref backend that ties all of this together. This gap is now finally closed by introducing a new "reftable" backend into the Git codebase. This new backend promises to bring some notable improvements to Git repositories: - It becomes possible to do truly atomic writes where either all refs are committed to disk or none are. This was not possible with the "files" backend because ref updates were split across multiple loose files. - The disk space required to store many refs is reduced, both compared to loose refs and packed-refs. This is enabled both by the reftable format being a binary format, which is more compact, and by prefix compression. - We can ignore filesystem-specific behaviour as ref names are not encoded via paths anymore. This means there is no need to handle case sensitivity on Windows systems or Unicode precomposition on macOS. - There is no need to rewrite the complete refdb anymore every time a ref is being deleted like it was the case for packed-refs. This means that ref deletions are now constant time instead of scaling linearly with the number of refs. - We can ignore file/directory conflicts so that it becomes possible to store both "refs/heads/foo" and "refs/heads/foo/bar". - Due to this property we can retain reflogs for deleted refs. We have previously been deleting reflogs together with their refs to avoid file/directory conflicts, which is not necessary anymore. - We can properly enumerate all refs. With the "files" backend it is not easily possible to distinguish between refs and non-refs because they may live side by side in the gitdir. Not all of these improvements are realized with the current "reftable" backend implementation. At this point, the new backend is supposed to be a drop-in replacement for the "files" backend that is used by basically all Git repositories nowadays. It strives for 1:1 compatibility, which means that a user can expect the same behaviour regardless of whether they use the "reftable" backend or the "files" backend for most of the part. Most notably, this means we artificially limit the capabilities of the "reftable" backend to match the limits of the "files" backend. It is not possible to create refs that would end up with file/directory conflicts, we do not retain reflogs, we perform stricter-than-necessary checks. This is done intentionally due to two main reasons: - It makes it significantly easier to land the "reftable" backend as tests behave the same. It would be tough to argue for each and every single test that doesn't pass with the "reftable" backend. - It ensures compatibility between repositories that use the "files" backend and repositories that use the "reftable" backend. Like this, hosters can migrate their repositories to use the "reftable" backend without causing issues for clients that use the "files" backend in their clones. It is expected that these artificial limitations may eventually go away in the long term. Performance-wise things very much depend on the actual workload. The following benchmarks compare the "files" and "reftable" backends in the current version: - Creating N refs in separate transactions shows that the "files" backend is ~50% faster. This is not surprising given that creating a ref only requires us to create a single loose ref. The "reftable" backend will also perform auto compaction on updates. In real-world workloads we would likely also want to perform pack loose refs, which would likely change the picture. Benchmark 1: update-ref: create refs sequentially (refformat = files, refcount = 1) Time (mean ± σ): 2.1 ms ± 0.3 ms [User: 0.6 ms, System: 1.7 ms] Range (min … max): 1.8 ms … 4.3 ms 133 runs Benchmark 2: update-ref: create refs sequentially (refformat = reftable, refcount = 1) Time (mean ± σ): 2.7 ms ± 0.1 ms [User: 0.6 ms, System: 2.2 ms] Range (min … max): 2.4 ms … 2.9 ms 132 runs Benchmark 3: update-ref: create refs sequentially (refformat = files, refcount = 1000) Time (mean ± σ): 1.975 s ± 0.006 s [User: 0.437 s, System: 1.535 s] Range (min … max): 1.969 s … 1.980 s 3 runs Benchmark 4: update-ref: create refs sequentially (refformat = reftable, refcount = 1000) Time (mean ± σ): 2.611 s ± 0.013 s [User: 0.782 s, System: 1.825 s] Range (min … max): 2.597 s … 2.622 s 3 runs Benchmark 5: update-ref: create refs sequentially (refformat = files, refcount = 100000) Time (mean ± σ): 198.442 s ± 0.241 s [User: 43.051 s, System: 155.250 s] Range (min … max): 198.189 s … 198.670 s 3 runs Benchmark 6: update-ref: create refs sequentially (refformat = reftable, refcount = 100000) Time (mean ± σ): 294.509 s ± 4.269 s [User: 104.046 s, System: 190.326 s] Range (min … max): 290.223 s … 298.761 s 3 runs - Creating N refs in a single transaction shows that the "files" backend is significantly slower once we start to write many refs. The "reftable" backend only needs to update two files, whereas the "files" backend needs to write one file per ref. Benchmark 1: update-ref: create many refs (refformat = files, refcount = 1) Time (mean ± σ): 1.9 ms ± 0.1 ms [User: 0.4 ms, System: 1.4 ms] Range (min … max): 1.8 ms … 2.6 ms 151 runs Benchmark 2: update-ref: create many refs (refformat = reftable, refcount = 1) Time (mean ± σ): 2.5 ms ± 0.1 ms [User: 0.7 ms, System: 1.7 ms] Range (min … max): 2.4 ms … 3.4 ms 148 runs Benchmark 3: update-ref: create many refs (refformat = files, refcount = 1000) Time (mean ± σ): 152.5 ms ± 5.2 ms [User: 19.1 ms, System: 133.1 ms] Range (min … max): 148.5 ms … 167.8 ms 15 runs Benchmark 4: update-ref: create many refs (refformat = reftable, refcount = 1000) Time (mean ± σ): 58.0 ms ± 2.5 ms [User: 28.4 ms, System: 29.4 ms] Range (min … max): 56.3 ms … 72.9 ms 40 runs Benchmark 5: update-ref: create many refs (refformat = files, refcount = 1000000) Time (mean ± σ): 152.752 s ± 0.710 s [User: 20.315 s, System: 131.310 s] Range (min … max): 152.165 s … 153.542 s 3 runs Benchmark 6: update-ref: create many refs (refformat = reftable, refcount = 1000000) Time (mean ± σ): 51.912 s ± 0.127 s [User: 26.483 s, System: 25.424 s] Range (min … max): 51.769 s … 52.012 s 3 runs - Deleting a ref in a fully-packed repository shows that the "files" backend scales with the number of refs. The "reftable" backend has constant-time deletions. Benchmark 1: update-ref: delete ref (refformat = files, refcount = 1) Time (mean ± σ): 1.7 ms ± 0.1 ms [User: 0.4 ms, System: 1.2 ms] Range (min … max): 1.6 ms … 2.1 ms 316 runs Benchmark 2: update-ref: delete ref (refformat = reftable, refcount = 1) Time (mean ± σ): 1.8 ms ± 0.1 ms [User: 0.4 ms, System: 1.3 ms] Range (min … max): 1.7 ms … 2.1 ms 294 runs Benchmark 3: update-ref: delete ref (refformat = files, refcount = 1000) Time (mean ± σ): 2.0 ms ± 0.1 ms [User: 0.5 ms, System: 1.4 ms] Range (min … max): 1.9 ms … 2.5 ms 287 runs Benchmark 4: update-ref: delete ref (refformat = reftable, refcount = 1000) Time (mean ± σ): 1.9 ms ± 0.1 ms [User: 0.5 ms, System: 1.3 ms] Range (min … max): 1.8 ms … 2.1 ms 217 runs Benchmark 5: update-ref: delete ref (refformat = files, refcount = 1000000) Time (mean ± σ): 229.8 ms ± 7.9 ms [User: 182.6 ms, System: 46.8 ms] Range (min … max): 224.6 ms … 245.2 ms 6 runs Benchmark 6: update-ref: delete ref (refformat = reftable, refcount = 1000000) Time (mean ± σ): 2.0 ms ± 0.0 ms [User: 0.6 ms, System: 1.3 ms] Range (min … max): 2.0 ms … 2.1 ms 3 runs - Listing all refs shows no significant advantage for either of the backends. The "files" backend is a bit faster, but not by a significant margin. When repositories are not packed the "reftable" backend outperforms the "files" backend because the "reftable" backend performs auto-compaction. Benchmark 1: show-ref: print all refs (refformat = files, refcount = 1, packed = true) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 2.0 ms 1729 runs Benchmark 2: show-ref: print all refs (refformat = reftable, refcount = 1, packed = true) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 1.8 ms 1816 runs Benchmark 3: show-ref: print all refs (refformat = files, refcount = 1000, packed = true) Time (mean ± σ): 4.3 ms ± 0.1 ms [User: 0.9 ms, System: 3.3 ms] Range (min … max): 4.1 ms … 4.6 ms 645 runs Benchmark 4: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = true) Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 1.0 ms, System: 3.3 ms] Range (min … max): 4.2 ms … 5.9 ms 643 runs Benchmark 5: show-ref: print all refs (refformat = files, refcount = 1000000, packed = true) Time (mean ± σ): 2.537 s ± 0.034 s [User: 0.488 s, System: 2.048 s] Range (min … max): 2.511 s … 2.627 s 10 runs Benchmark 6: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = true) Time (mean ± σ): 2.712 s ± 0.017 s [User: 0.653 s, System: 2.059 s] Range (min … max): 2.692 s … 2.752 s 10 runs Benchmark 7: show-ref: print all refs (refformat = files, refcount = 1, packed = false) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 1.9 ms 1834 runs Benchmark 8: show-ref: print all refs (refformat = reftable, refcount = 1, packed = false) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.4 ms … 2.0 ms 1840 runs Benchmark 9: show-ref: print all refs (refformat = files, refcount = 1000, packed = false) Time (mean ± σ): 13.8 ms ± 0.2 ms [User: 2.8 ms, System: 10.8 ms] Range (min … max): 13.3 ms … 14.5 ms 208 runs Benchmark 10: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = false) Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 1.2 ms, System: 3.3 ms] Range (min … max): 4.3 ms … 6.2 ms 624 runs Benchmark 11: show-ref: print all refs (refformat = files, refcount = 1000000, packed = false) Time (mean ± σ): 12.127 s ± 0.129 s [User: 2.675 s, System: 9.451 s] Range (min … max): 11.965 s … 12.370 s 10 runs Benchmark 12: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = false) Time (mean ± σ): 2.799 s ± 0.022 s [User: 0.735 s, System: 2.063 s] Range (min … max): 2.769 s … 2.836 s 10 runs - Printing a single ref shows no real difference between the "files" and "reftable" backends. Benchmark 1: show-ref: print single ref (refformat = files, refcount = 1) Time (mean ± σ): 1.5 ms ± 0.1 ms [User: 0.4 ms, System: 1.0 ms] Range (min … max): 1.4 ms … 1.8 ms 1779 runs Benchmark 2: show-ref: print single ref (refformat = reftable, refcount = 1) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.4 ms … 2.5 ms 1753 runs Benchmark 3: show-ref: print single ref (refformat = files, refcount = 1000) Time (mean ± σ): 1.5 ms ± 0.1 ms [User: 0.3 ms, System: 1.1 ms] Range (min … max): 1.4 ms … 1.9 ms 1840 runs Benchmark 4: show-ref: print single ref (refformat = reftable, refcount = 1000) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 2.0 ms 1831 runs Benchmark 5: show-ref: print single ref (refformat = files, refcount = 1000000) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 2.1 ms 1848 runs Benchmark 6: show-ref: print single ref (refformat = reftable, refcount = 1000000) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 2.1 ms 1762 runs So overall, performance depends on the usecases. Except for many sequential writes the "reftable" backend is roughly on par or significantly faster than the "files" backend though. Given that the "files" backend has received 18 years of optimizations by now this can be seen as a win. Furthermore, we can expect that the "reftable" backend will grow faster over time when attention turns more towards optimizations. The complete test suite passes, except for those tests explicitly marked to require the REFFILES prerequisite. Some tests in t0610 are marked as failing because they depend on still-in-flight bug fixes. Tests can be run with the new backend by setting the GIT_TEST_DEFAULT_REF_FORMAT environment variable to "reftable". There is a single known conceptual incompatibility with the dumb HTTP transport. As "info/refs" SHOULD NOT contain the HEAD reference, and because the "HEAD" file is not valid anymore, it is impossible for the remote client to figure out the default branch without changing the protocol. This shortcoming needs to be handled in a subsequent patch series. As the reftable library has already been introduced a while ago, this commit message will not go into the details of how exactly the on-disk format works. Please refer to our preexisting technical documentation at Documentation/technical/reftable for this. [1]: https://public-inbox.org/git/CAJo=hJtyof=HRy=2sLP0ng0uZ4=S-DpZ5dR1aF+VHVETKG20OQ@mail.gmail.com/ Original-idea-by: Shawn Pearce <spearce@spearce.org> Based-on-patch-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-02-07 08:28:37 -08:00
Patrick Steinhardt	2e573d61ff	refs: prepare `refs_init_db()` for initializing worktree refs The purpose of `refs_init_db()` is to initialize the on-disk files of a new ref database. The function is quite inflexible right now though, as callers can neither specify the `struct ref_store` nor can they pass any flags. Refactor the interface to accept both of these. This will be required so that we can start initializing per-worktree ref databases via the ref backend instead of open-coding the initialization in "worktree.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-01-08 13:17:30 -08:00
Patrick Steinhardt	0fcc285c5e	refs: refactor logic to look up storage backends In order to look up ref storage backends, we're currently using a linked list of backends, where each backend is expected to set up its `next` pointer to the next ref storage backend. This is kind of a weird setup as backends need to be aware of other backends without much of a reason. Refactor the code so that the array of backends is centrally defined in "refs.c", where each backend is now identified by an integer constant. Expose functions to translate from those integer constants to the name and vice versa, which will be required by subsequent patches. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-01-02 09:24:47 -08:00
Patrick Steinhardt	29a186917b	refs: remove `delete_refs` callback from backends Now that `refs_delete_refs` is implemented in a generic way via the ref transaction interfaces there are no callers left that invoke the `delete_refs` callback anymore. Remove it from all of our backends. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-11-17 10:12:12 +09:00
Junio C Hamano	39fe402d67	Merge branch 'tb/refs-exclusion-and-packed-refs' Enumerating refs in the packed-refs file, while excluding refs that match certain patterns, has been optimized. * tb/refs-exclusion-and-packed-refs: ls-refs.c: avoid enumerating hidden refs where possible upload-pack.c: avoid enumerating hidden refs where possible builtin/receive-pack.c: avoid enumerating hidden references refs.h: implement `hidden_refs_to_excludes()` refs.h: let `for_each_namespaced_ref()` take excluded patterns revision.h: store hidden refs in a `strvec` refs/packed-backend.c: add trace2 counters for jump list refs/packed-backend.c: implement jump lists to avoid excluded pattern(s) refs/packed-backend.c: refactor `find_reference_location()` refs: plumb `exclude_patterns` argument throughout builtin/for-each-ref.c: add `--exclude` option ref-filter.c: parameterize match functions over patterns ref-filter: add `ref_filter_clear()` ref-filter: clear reachable list pointers after freeing ref-filter.h: provide `REF_FILTER_INIT` refs.c: rename `ref_filter`	2023-07-21 13:47:26 -07:00
Taylor Blau	b269ac53c0	refs: plumb `exclude_patterns` argument throughout The subsequent patch will want to access an optional `excluded_patterns` array within `refs/packed-backend.c` that will cull out certain references matching any of the given patterns on a best-effort basis. To do so, the refs subsystem needs to be updated to pass this value across a number of different locations. Prepare for a future patch by introducing this plumbing now, passing NULLs at top-level APIs in order to make that patch less noisy and more easily readable. Signed-off-by: Taylor Blau <me@ttaylorr.co> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-07-10 14:48:55 -07:00
John Cai	826ae79fca	pack-refs: teach --exclude option to exclude refs from being packed At GitLab, we have a system that creates ephemeral internal refs that don't live long before getting deleted. Having an option to exclude certain refs from a packed-refs file allows these internal references to be deleted much more efficiently. Add an --exclude option to the pack-refs builtin, and use the ref exclusions API to exclude certain refs from being packed into the final packed-refs file Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-12 14:54:14 -07:00
Elijah Newren	ba3d1c73da	treewide: remove unnecessary cache.h includes We had several header files include cache.h unnecessarily. Remove those. These have all been verified via both ensuring that gcc -E $HEADER \| grep '"cache.h"' found no hits and that cat >temp.c <<EOF && #include "git-compat-util.h" #include "$HEADER" int main() {} EOF gcc -c temp.c successfully compiles without warnings. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-23 17:25:28 -08:00
Junio C Hamano	c6da34a610	Revert "Merge branch 'ps/avoid-unnecessary-hook-invocation-with-packed-refs'" This reverts commit `991b4d47f0`, reversing changes made to `bcd020f88e`.	2022-04-13 15:51:33 -07:00
Patrick Steinhardt	cd475b3b03	refs: add ability for backends to special-case reading of symbolic refs Reading of symbolic and non-symbolic references is currently treated the same in reference backends: we always call `refs_read_raw_ref()` and then decide based on the returned flags what type it is. This has one downside though: symbolic references may be treated different from normal references in a backend from normal references. The packed-refs backend for example doesn't even know about symbolic references, and as a result it is pointless to even ask it for one. There are cases where we really only care about whether a reference is symbolic or not, but don't care about whether it exists at all or may be a non-symbolic reference. But it is not possible to optimize for this case right now, and as a consequence we will always first check for a loose reference to exist, and if it doesn't, we'll query the packed-refs backend for a known-to-not-be-symbolic reference. This is inefficient and requires us to search all packed references even though we know to not care for the result at all. Introduce a new function `refs_read_symbolic_ref()` which allows us to fix this case. This function will only ever return symbolic references and can thus optimize for the scenario layed out above. By default, if the backend doesn't provide an implementation for it, we just use the old code path and fall back to `read_raw_ref()`. But in case the backend provides its own, more efficient implementation, we will use that one instead. Note that this function is explicitly designed to not distinguish between missing references and non-symbolic references. If it did, we'd be forced to always search the packed-refs backend to see whether the symbolic reference the user asked for really doesn't exist, or if it exists as a non-symbolic reference. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 10:13:46 -08:00
Junio C Hamano	991b4d47f0	Merge branch 'ps/avoid-unnecessary-hook-invocation-with-packed-refs' Because a deletion of ref would need to remove it from both the loose ref store and the packed ref store, a delete-ref operation that logically removes one ref may end up invoking ref-transaction hook twice, which has been corrected. * ps/avoid-unnecessary-hook-invocation-with-packed-refs: refs: skip hooks when deleting uncovered packed refs refs: do not execute reference-transaction hook on packing refs refs: demonstrate excessive execution of the reference-transaction hook refs: allow skipping the reference-transaction hook refs: allow passing flags when beginning transactions refs: extract packed_refs_delete_refs() to allow control of transaction	2022-02-18 13:53:27 -08:00
Patrick Steinhardt	fbe73f61cb	refs: allow passing flags when beginning transactions We do not currently have any flags when creating reference transactions, but we'll add one to disable execution of the reference transaction hook in some cases. Allow passing flags to `ref_store_transaction_begin()` to prepare for this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-01-17 11:01:44 -08:00
Han-Wen Nienhuys	f9f7fd3b23	refs: centralize initialization of the base ref_store. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-22 13:51:38 -08:00
Junio C Hamano	b8148376a2	Merge branch 'hn/create-reflog-simplify' A small simplification of API. * hn/create-reflog-simplify: refs: drop force_create argument of create_reflog API	2021-12-10 14:35:13 -08:00
Junio C Hamano	96f6623ada	Merge branch 'ab/refs-errno-cleanup' The "remainder" of hn/refs-errno-cleanup topic. * ab/refs-errno-cleanup: (21 commits) refs API: post-migration API renaming [2/2] refs API: post-migration API renaming [1/2] refs API: don't expose "errno" in run_transaction_hook() refs API: make expand_ref() & repo_dwim_log() not set errno refs API: make resolve_ref_unsafe() not set errno refs API: make refs_ref_exists() not set errno refs API: make refs_resolve_refdup() not set errno refs tests: ignore ignore errno in test-ref-store helper refs API: ignore errno in worktree.c's find_shared_symref() refs API: ignore errno in worktree.c's add_head_info() refs API: make files_copy_or_rename_ref() et al not set errno refs API: make loose_fill_ref_dir() not set errno refs API: make resolve_gitlink_ref() not set errno refs API: remove refs_read_ref_full() wrapper refs/files: remove "name exist?" check in lock_ref_oid_basic() reflog tests: add --updateref tests refs API: make refs_rename_ref_available() static refs API: make parse_loose_ref_contents() not set errno refs API: make refs_read_raw_ref() not set errno refs API: add a version of refs_resolve_ref_unsafe() with "errno" ...	2021-11-29 15:41:45 -08:00
Han-Wen Nienhuys	7b089120d9	refs: drop force_create argument of create_reflog API There is only one caller, builtin/checkout.c, and it hardcodes force_create=1. This argument was introduced in `abd0cd3a30` (refs: new public ref function: safe_create_reflog, 2015-07-21), which promised to immediately use it in a follow-on commit, but that never happened. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-11-22 11:01:25 -08:00
Junio C Hamano	162a13b855	Merge branch 'jt/no-abuse-alternate-odb-for-submodules' Follow through the work to use the repo interface to access submodule objects in-process, instead of abusing the alternate object database interface. * jt/no-abuse-alternate-odb-for-submodules: submodule: trace adding submodule ODB as alternate submodule: pass repo to check_has_commit() object-file: only register submodule ODB if needed merge-{ort,recursive}: remove add_submodule_odb() refs: peeling non-the_repository iterators is BUG refs: teach arbitrary repo support to iterators refs: plumb repo into ref stores	2021-10-25 16:06:56 -07:00
Ævar Arnfjörð Bjarmason	c339ff690f	refs API: make refs_rename_ref_available() static Move the refs_rename_ref_available() function into "refs/files-backend.c". It is file-backend specific. This function was added in `5fe7d825da` (refs.c: pass a list of names to skip to is_refname_available, 2014-05-01) as rename_ref_available() and was only ever used in this one file-backend specific codepath. So let's move it there. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-16 11:17:02 -07:00
Han-Wen Nienhuys	df3458e957	refs API: make parse_loose_ref_contents() not set errno Change the parse_loose_ref_contents() function to stop setting "errno" and failure, and to instead pass up a "failure_errno" via a parameter. This requires changing its callers to do the same. The EINVAL error from parse_loose_ref_contents is used in files-backend to create a custom error message. In untangling this we discovered a tricky edge case. The refs_read_special_head() function was relying on parse_loose_ref_contents() setting EINVAL. By converting it to use "saved_errno" we can migrate away from "errno" in this part of the code entirely, and do away with an existing "save_errno" pattern, its only purpose was to not clobber the "errno" we previously needed at the end of files_read_raw_ref(). Let's assert that we can do that by not having files_read_raw_ref() itself operate on *failure_errno in addition to passing it on. Instead we'll assert that if we return non-zero we actually do set errno, thus assuring ourselves and callers that they can trust the resulting "failure_errno". Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-16 11:17:02 -07:00
Han-Wen Nienhuys	8b72fea7e9	refs API: make refs_read_raw_ref() not set errno Add a "failure_errno" to refs_read_raw_ref(), his allows refs_werrres_ref_unsafe() to pass along its "failure_errno", as a first step before its own callers are migrated to pass it further up the chain. We are leaving out out the refs_read_special_head() in refs_read_raw_ref() for now, as noted in a subsequent commit moving it to "failure_errno" will require some special consideration. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-16 11:17:02 -07:00
Junio C Hamano	f6c075ad71	Merge branch 'jk/ref-paranoia' The ref iteration code used to optionally allow dangling refs to be shown, which has been tightened up. * jk/ref-paranoia: refs: drop "broken" flag from for_each_fullref_in() ref-filter: drop broken-ref code entirely ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN repack, prune: drop GIT_REF_PARANOIA settings refs: turn on GIT_REF_PARANOIA by default refs: omit dangling symrefs when using GIT_REF_PARANOIA refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag refs-internal.h: reorganize DO_FOR_EACH_* flag documentation refs-internal.h: move DO_FOR_EACH_* flags next to each other t5312: be more assertive about command failure t5312: test non-destructive repack t5312: create bogus ref as necessary t5312: drop "verbose" helper t5600: provide detached HEAD for corruption failures t5516: don't use HEAD ref for invalid ref-deletion tests t7900: clean up some more broken refs	2021-10-11 10:21:47 -07:00
Jonathan Tan	9bc45a2802	refs: teach arbitrary repo support to iterators Note that should_pack_ref() is called when writing refs, which is only supported for the_repository, hence the_repository is hardcoded there. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-08 15:06:05 -07:00
Jonathan Tan	34224e14d6	refs: plumb repo into ref stores In preparation for the next 2 patches that adds (partial) support for arbitrary repositories to ref iterators, plumb a repository into all ref stores. There are no changes to program logic. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-08 15:06:05 -07:00
Jeff King	8dccb2244c	refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag When the DO_FOR_EACH_INCLUDE_BROKEN flag is used, we include both actual corrupt refs (illegal names, missing objects), but also symrefs that point to nothing. This latter is not really a corruption, but just something that may happen normally. For example, the symref at refs/remotes/origin/HEAD may point to a tracking branch which is later deleted. (The local HEAD may also be unborn, of course, but we do not access it through ref iteration). Most callers of for_each_ref() etc, do not care. They don't pass INCLUDE_BROKEN, so don't see it at all. But for those which do pass it, this somewhat-normal state causes extra warnings (e.g., from for-each-ref) or even aborts operations (destructive repacks with GIT_REF_PARANOIA set). This patch just introduces the flag and the mechanism; there are no callers yet (and hence no tests). Two things to note on the implementation: - we actually skip any symref that does not resolve to a ref. This includes ones which point to an invalidly-named ref. You could argue this is a more serious breakage than simple dangling. But the overall effect is the same (we could not follow the symref), as well as the impact on things like REF_PARANOIA (either way, a symref we can't follow won't impact reachability, because we'll see the ref itself during iteration). The underlying resolution function doesn't distinguish these two cases (they both get REF_ISBROKEN). - we change the iterator in refs/files-backend.c where we check INCLUDE_BROKEN. There's a matching spot in refs/packed-backend.c, but we don't know need to do anything there. The packed backend does not support symrefs at all. The resulting set of flags might be a bit easier to follow if we broke this down into "INCLUDE_CORRUPT_REFS" and "INCLUDE_DANGLING_SYMREFS". But there are a few reasons not do so: - adding a new OMIT_DANGLING_SYMREFS flag lets us leave existing callers intact, without changing their behavior (and some of them really do want to see the dangling symrefs; e.g., t5505 has a test which expects us to report when a symref becomes dangling) - they're not actually independent. You cannot say "include dangling symrefs" without also including refs whose objects are not reachable, because dangling symrefs by definition do not have an object. We could tweak the implementation to distinguish this, but in practice nobody wants to ask for that. Adding the OMIT flag keeps the implementation simple and makes sure we don't regress the current behavior. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	9aab952e85	refs-internal.h: reorganize DO_FOR_EACH_* flag documentation The documentation for the DO_FOR_EACH_* flags is sprinkled over the refs-internal.h file. We define the two flags in one spot, and then describe them in more detail far away from there, in the definitions of refs_ref_iterator_begin() and ref_iterator_advance_fn(). Let's try to organize this a bit better: - convert the #defines to an enum. This makes it clear that they are related, and that the enum shows the complete set of flags. - combine all descriptions for each flag in a single spot, next to the flag's definition - use the enum rather than a bare int for functions which take the flags. This helps readers realize which flags can be used. - clarify the mention of flags for ref_iterator_advance_fn(). It does not take flags itself, but is meant to depend on ones set up earlier. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	bf708add2e	refs-internal.h: move DO_FOR_EACH_* flags next to each other There are currently two DO_FOR_EACH_* flags, which must not have their bits overlap. Yet they're defined hundreds of lines apart. Let's move them next to each other to make it clear that they are related and are a complete set (which matters if you are adding a new flag and would like to know what the next available bit is). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Han-Wen Nienhuys	5b12e16bb1	refs: make errno output explicit for read_raw_ref_fn This makes it explicit how alternative ref backends should report errors in read_raw_ref_fn. read_raw_ref_fn needs to supply a credible errno for a number of cases. These are primarily: 1) The files backend calls read_raw_ref from lock_raw_ref, and uses the resulting error codes to create/remove directories as needed. 2) ENOENT should be translated in a zero OID, optionally with REF_ISBROKEN set, returning the last successfully resolved symref. This is necessary so read_raw_ref("HEAD") on an empty repo returns refs/heads/main (or the default branch du-jour), and we know on which branch to create the first commit. Make this information flow explicit by adding a failure_errno to the signature of read_raw_ref. All errnos from the files backend are still propagated unchanged, even though inspection suggests only ENOTDIR, EISDIR and ENOENT are relevant. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-25 13:30:26 -07:00
Han-Wen Nienhuys	20d422cfd7	refs: remove EINVAL errno output from specification of read_raw_ref_fn This commit does not change code; it documents the fact that an alternate ref backend does not need to return EINVAL from read_raw_ref_fn to function properly. This is correct, because refs_read_raw_ref is only called from; * resolve_ref_unsafe(), which does not care for the EINVAL errno result. * refs_verify_refname_available(), which does not inspect errno. * files-backend.c, where errno is overwritten on failure. * packed-backend.c (is_packed_transaction_needed), which calls it for the packed ref backend, which never emits EINVAL. A grep for EINVAL /c reveals that no code checks errno against EINVAL after reading references. In addition, the refs.h file does not mention errno at all. A grep over resolve_ref_unsafe() turned up the following callers that inspect errno: * sequencer.c::print_commit_summary, which uses it for die_errno * lock_ref_oid_basic(), which only treats EISDIR and ENOTDIR specially. The files ref backend does use EINVAL. The files backend does not call into the generic API (refs_read_raw), but into the files-specific function (files_read_raw_ref), which we are not changing in this commit. As the errno sideband is unintuitive and error-prone, remove EINVAL value, as a step towards getting rid of the errno sideband altogether. Spotted by Ævar Arnfjörð Bjarmason <avarab@gmail.com>. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-25 13:30:26 -07:00
Ævar Arnfjörð Bjarmason	cc40b5ce13	refs API: remove OID argument to reflog_expire() Since the the preceding commit the "oid" parameter to reflog_expire() is always NULL, but it was not cleaned up to reduce the size of the diff. Let's do that subsequent API and documentation cleanup now. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-25 13:27:37 -07:00
Han-Wen Nienhuys	617480d75b	refs: make explicit that ref_iterator_peel returns boolean Use -1 as error return value throughout. This removes spurious differences in the GIT_TRACE_REFS output, depending on the ref storage backend active. Before, the cached ref_iterator (but only that iterator!) would return peel_object() output directly. No callers relied on the peel_status values beyond success/failure. All calls to these functions go through peel_iterated_oid(), which returns peel_object() as a fallback, but also squashing the error values. The iteration interface already passes REF_ISSYMREF and REF_ISBROKEN through the flags argument, so the additional error values in enum peel_status provide no value. The ref iteration interface provides a separate peel() function because certain formats (eg. packed-refs and reftable) can store the peeled object next to the tag SHA1. Passing the peeled SHA1 as an optional argument to each_ref_fn maps more naturally to the implementation of ref databases. Changing the code in this way is left for a future refactoring. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-20 07:54:12 +09:00
Junio C Hamano	c9a04f036f	Merge branch 'hn/refs-trace-backend' Developer support. * hn/refs-trace-backend: refs: add GIT_TRACE_REFS debugging mechanism	2020-09-22 12:36:28 -07:00
Han-Wen Nienhuys	4441f42707	refs: add GIT_TRACE_REFS debugging mechanism When set in the environment, GIT_TRACE_REFS makes git print operations and results as they flow through the ref storage backend. This helps debug discrepancies between different ref backends. Example: $ GIT_TRACE_REFS="1" ./git branch 15:42:09.769631 refs/debug.c:26 ref_store for .git 15:42:09.769681 refs/debug.c:249 read_raw_ref: HEAD: 0000000000000000000000000000000000000000 (=> refs/heads/ref-debug) type 1: 0 15:42:09.769695 refs/debug.c:249 read_raw_ref: refs/heads/ref-debug: `3a238e539b` (=> refs/heads/ref-debug) type 0: 0 15:42:09.770282 refs/debug.c:233 ref_iterator_begin: refs/heads/ (0x1) 15:42:09.770290 refs/debug.c:189 iterator_advance: refs/heads/b4 (0) 15:42:09.770295 refs/debug.c:189 iterator_advance: refs/heads/branch3 (0) Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-09 12:58:37 -07:00
Han-Wen Nienhuys	63c0567365	refs: move REF_LOG_ONLY to refs-internal.h REF_LOG_ONLY is used in the transaction preparation: if a symref is involved in a transaction, the referent of the symref should be updated, and the symref itself should only be updated in the reflog. Other ref backends will need to duplicate this logic too, so move it to a central place. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-08 15:51:07 -07:00
Han-Wen Nienhuys	5085aef4c8	refs: move gitdir into base ref_store Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-19 14:08:04 -07:00
Han-Wen Nienhuys	4877c6c738	refs: fix comment about submodule ref_stores Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-19 14:08:03 -07:00
Han-Wen Nienhuys	e39620f07e	refs: split off reading loose ref data in separate function This prepares for handling FETCH_HEAD (which is not a regular ref) separately from the ref backend. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-19 14:08:03 -07:00
Junio C Hamano	523fa69c36	reflog: cleanse messages in the refs.c layer Regarding reflog messages: - We expect that a reflog message consists of a single line. The file format used by the files backend may add a LF after the message as a delimiter, and output by commands like "git log -g" may complete such an incomplete line by adding a LF at the end, but philosophically, the terminating LF is not a part of the message. - We however allow callers of refs API to supply a random sequence of NUL terminated bytes. We cleanse caller-supplied message by squashing a run of whitespaces into a SP, and by trimming trailing whitespace, before storing the message. This is how we tolerate, instead of erring out, a message with LF in it (be it at the end, in the middle, or both). Currently, the cleansing of the reflog message is done by the files backend, before the log is written out. This is sufficient with the current code, as that is the only backend that writes reflogs. But new backends can be added that write reflogs, and we'd want the resulting log message we would read out of "log -g" the same no matter what backend is used, and moving the code to do so to the generic layer is a way to do so. An added benefit is that the "cleansing" function could be updated later, independent from individual backends, to e.g. allow multi-line log messages if we wanted to, and when that happens, it would help a lot to ensure we covered all bases if the cleansing function (which would be updated) is called from the generic layer. Side note: I am not interested in supporting multi-line reflog messages right at the moment (nobody is asking for it), but I envision that instead of the "squash a run of whitespaces into a SP and rtrim" cleansing, we can %urlencode problematic bytes in the message AND append a SP at the end, when a new version of Git that supports multi-line and/or verbatim reflog messages writes a reflog record. The reading side can detect the presense of SP at the end (which should have been rtrimmed out if it were written by existing versions of Git) as a signal that decoding %urlencode recovers the original reflog message. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-10 13:53:37 -07:00
Han-Wen Nienhuys	84ee4ca10d	refs: improve documentation for ref iterator Document some of the flag options in refs_ref_iterator_begin, and explain how ref_iterator_advance_fn should handle them. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-20 10:39:02 -07:00
Elijah Newren	15beaaa3d1	Fix spelling errors in code comments Reported-by: Jens Schleusener <Jens.Schleusener@fossies.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-10 16:00:54 +09:00
Junio C Hamano	ff6bbce6e3	Merge branch 'rj/header-check' Header files clean-up. * rj/header-check: delta-islands.h: add missing forward declarations (hdr-check) midx.h: add missing forward declarations (hdr-check) refs/refs-internal.h: add missing declarations (hdr-check) refs/packed-backend.h: add missing declaration (hdr-check) refs/ref-cache.h: add missing declarations (hdr-check) ewah/ewok_rlw.h: add missing include (hdr-check) json-writer.h: add missing include (hdr-check) Makefile: add a hdr-check target	2018-10-16 16:16:00 +09:00
Junio C Hamano	eea5e03a5a	Merge branch 'tq/refs-internal-comment-fix' Fix for typo in a sample code in comment. * tq/refs-internal-comment-fix: refs: docstring typo	2018-10-16 16:15:59 +09:00
Ramsay Jones	4eb4416d37	refs/refs-internal.h: add missing declarations (hdr-check) Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-20 11:50:00 -07:00
Tao Qingyun	7b6057c852	refs: docstring typo Signed-off-by: Tao Qingyun <taoqy@ls-a.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 10:17:22 -07:00
Stefan Beller	4a6067cda5	refs.c: migrate internal ref iteration to pass thru repository argument In `60ce76d358` (refs: add repository argument to for_each_replace_ref, 2018-04-11) and `0d296c57ae` (refs: allow for_each_replace_ref to handle arbitrary repositories, 2018-04-11), for_each_replace_ref learned how to iterate over refs by a given arbitrary repository. New attempts in the object store conversion have shown that it is useful to have the repository handle available that the refs iteration is currently iterating over. To achieve this goal we will need to add a repository argument to each_ref_fn in refs.h. However as many callers rely on the signature such a patch would be too large. So convert the internals of the ref subsystem first to pass through a repository argument without exposing the change to the user. Assume the_repository for the passed through repository, although it is not used anywhere yet. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-21 10:22:50 -07:00

1 2 3

136 Commits (71c871b48dfaf300ca20e205917db72ab9c6d7b3)