kernel/git - git - PowerEL Git System

Commit Graph

Author	SHA1	Message	Date
Patrick Steinhardt	8e9a1d0dc2	t/helper: fix segfault in "oid-array" command without repository The "oid-array" test helper can supposedly work without a Git repository, but will in fact crash because `the_repository->hash_algo` is not initialized. This is because `oid_pos()`, which is used by `oid_array_lookup()`, depends on `the_hash_algo->rawsz`. Ideally, we'd adapt `oid_pos()` to not depend on `the_hash_algo` anymore. That is a bigger undertaking though, so instead we fall back to SHA1 when there is no repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:34 -07:00
Patrick Steinhardt	fa9e009aa7	t/helper: use correct object hash in partial-clone helper The `object_info()` function of the partial-clone helper is responsible for checking the object ID of a repository other than `the_repository`. We use `parse_oid_hex()` in this function though, which means that we still depend on `the_repository->hash_algo`. Fix this by using the object hash of the function-local repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:34 -07:00
Patrick Steinhardt	2a0e11479f	compat/fsmonitor: fix socket path in networked SHA256 repos The IPC socket used by the fsmonitor on Darwin is usually contained in the Git repository itself. When the repository is hosted on a networked filesystem though, we instead create the socket path in the user's home directory or the socket directory. In that case, we derive the path by hashing the repository path. But while we always use SHA1 to hash the repository path, we then end up using `hash_to_hex()` to append the computed hash to the socket path. This is wrong because `hash_to_hex()` uses the hash algorithm configured in `the_repository`, which may not be SHA1. The consequence is that we may append uninitialized bytes to the path when operating in a SHA256 repository. Fix this bug by using `hash_to_hex_algop()` with SHA1. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:34 -07:00
Patrick Steinhardt	99cf4d6d35	replace-object: use hash algorithm from passed-in repository In `register_replace_ref()`, we pass in a repository but then use `get_oid_hex()` to parse passed-in object IDs, which implicitly uses `the_repository`. Fix this by using the hash algorithm from the passed-in repository instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:34 -07:00
Patrick Steinhardt	58650befd9	protocol-caps: use hash algorithm from passed-in repository In `send_info()`, we pass in a repository but then use `get_oid_hex()` to parse passed-in object IDs, which implicitly uses `the_repository`. Fix this by using the hash algorithm from the passed-in repository instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:34 -07:00
Patrick Steinhardt	f2c32a66f5	oidset: pass hash algorithm when parsing file The `oidset_parse_file_carefully()` function implicitly depends on `the_repository` when parsing object IDs. Fix this by having callers pass in the hash algorithm to use. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:34 -07:00
Patrick Steinhardt	afa2c6ddc8	http-fetch: don't crash when parsing packfile without a repo The git-http-fetch(1) command accepts a `--packfile=` option, which allows the user to specify that it shall fetch a specific packfile, only. The parameter here is the hash of the packfile, which is specific to the object hash used by the repository. This requirement is implicit though via our use of `parse_oid_hex()`, which internally uses `the_repository`. The git-http-fetch(1) command allows for there to be no repository though, which only exists such that we can show usage via the "-h" option. In that case though, starting with `c8aed5e8da` (repository: stop setting SHA1 as the default object hash, 2024-05-07), `the_repository` does not have its object hash initialized anymore and thus we would crash when trying to parse the object ID outside of a repository. Fix this issue by dying immediately when we see a "--packfile=" parameter when outside a Git repository. This is not a functional regression as we would die later on with the same error anyway. Add a test to detect the segfault. We use the "nongit" function to do so, which we need to allow-list in `test_must_fail ()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:34 -07:00
Patrick Steinhardt	8a676bdc5c	hash-ll: merge with "hash.h" The "hash-ll.h" header was introduced via `d1cbe1e6d8` (hash-ll.h: split out of hash.h to remove dependency on repository.h, 2023-04-22) to make explicit the split between hash-related functions that rely on the global `the_repository`, and those that don't. This split is no longer necessary now that we have removed the reliance on `the_repository`. Merge "hash-ll.h" back into "hash.h". This causes some code units to not include "repository.h" anymore, which requires us to add some forward declarations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:33 -07:00
Patrick Steinhardt	36026a0f30	refs: avoid include cycle with "repository.h" There is an include cycle between "refs.h" and "repository.h" via "commit.h", "object.h" and "hash.h". This has the effect that several definitions of structs and enums will not be visible once we merge "hash-ll.h" back into "hash.h" in the next commit. The only reason that "repository.h" includes "refs.h" is the definition of `enum ref_storage_format`. Move it into "repository.h" and have "refs.h" include "repository.h" instead to fix the cycle. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:33 -07:00
Patrick Steinhardt	e7da938570	global: introduce `USE_THE_REPOSITORY_VARIABLE` macro Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from gaining new dependencies on it in future changes, be it explicit or implicit. For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "builtin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:33 -07:00
Patrick Steinhardt	7abbca0e74	hash: require hash algorithm in `empty_tree_oid_hex()` The `empty_tree_oid_hex()` function uses `the_repository` to derive the hash function that shall be used. Require callers to pass in the hash algorithm to get rid of this implicit dependency. While at it, remove the unused `empty_blob_oid_hex()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:33 -07:00
Patrick Steinhardt	9c34eb93fb	hash: require hash algorithm in `is_empty_{blob,tree}_oid()` Both functions `is_empty_{blob,tree}_oid()` use `the_repository` to derive the hash function that shall be used. Require callers to pass in the hash algorithm to get rid of this implicit dependency. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:33 -07:00
Patrick Steinhardt	861e8c76f6	hash: make `is_null_oid()` independent of `the_repository` The function `is_null_oid()` uses `oideq(oid, null_oid())` to check whether a given object ID is the all-zero object ID. `null_oid()` implicitly relies on `the_repository` though to return the correct null object ID. Get rid of this dependency by always comparing the complete hash array for being all-zeroes. This is possible due to the refactoring of object IDs so that their hash arrays are always fully initialized. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:33 -07:00
Patrick Steinhardt	d4d364b2c7	hash: convert `oidcmp()` and `oideq()` to compare whole hash With the preceding commit, the hash array of object IDs is now fully zero-padded even when the hash algorithm's output is smaller than the array length. With that, we can now adapt both `oidcmp()` and `oideq()` to unconditionally memcmp(3P) the whole array instead of depending on the hash size. While it may feel inefficient to compare unused bytes for e.g. SHA-1, in practice the compiler should now be able to produce code that is better optimized both because we have no branch anymore, but also because the size to compare is now known at compile time. Godbolt spits out the following assembly on an x86_64 platform with GCC 14.1 for the old and new implementations of `oidcmp()`: oidcmp_old: movsx rax, DWORD PTR [rdi+32] test eax, eax jne .L2 mov rax, QWORD PTR the_repository[rip] cmp QWORD PTR [rax+16], 32 je .L6 .L4: mov edx, 20 jmp memcmp .L2: lea rdx, [rax+rax*2] lea rax, [rax+rdx*4] lea rax, hash_algos[0+rax*8] cmp QWORD PTR [rax+16], 32 jne .L4 .L6: mov edx, 32 jmp memcmp oidcmp_new: mov edx, 32 jmp memcmp The new implementation gets rid of all the branches and effectively only ends up setting up `edx` for `memcmp()` and then calling it. 
And for `oideq()`: oideq_old: movsx rcx, DWORD PTR [rdi+32] mov rax, rdi mov rdx, rsi test ecx, ecx jne .L2 mov rcx, QWORD PTR the_repository[rip] cmp QWORD PTR [rcx+16], 32 mov rcx, QWORD PTR [rax] je .L12 .L4: mov rsi, QWORD PTR [rax+8] xor rcx, QWORD PTR [rdx] xor rsi, QWORD PTR [rdx+8] or rcx, rsi je .L13 .L8: mov eax, 1 test eax, eax sete al movzx eax, al ret .L2: lea rsi, [rcx+rcx*2] lea rcx, [rcx+rsi*4] lea rcx, hash_algos[0+rcx*8] cmp QWORD PTR [rcx+16], 32 mov rcx, QWORD PTR [rax] jne .L4 .L12: mov rsi, QWORD PTR [rax+8] xor rcx, QWORD PTR [rdx] xor rsi, QWORD PTR [rdx+8] or rcx, rsi jne .L8 mov rcx, QWORD PTR [rax+16] mov rax, QWORD PTR [rax+24] xor rcx, QWORD PTR [rdx+16] xor rax, QWORD PTR [rdx+24] or rcx, rax jne .L8 xor eax, eax .L14: test eax, eax sete al movzx eax, al ret .L13: mov edi, DWORD PTR [rdx+16] cmp DWORD PTR [rax+16], edi jne .L8 xor eax, eax jmp .L14 oideq_new: mov rax, QWORD PTR [rdi] mov rdx, QWORD PTR [rdi+8] xor rax, QWORD PTR [rsi] xor rdx, QWORD PTR [rsi+8] or rax, rdx je .L5 .L2: mov eax, 1 xor eax, 1 ret .L5: mov rax, QWORD PTR [rdi+16] mov rdx, QWORD PTR [rdi+24] xor rax, QWORD PTR [rsi+16] xor rdx, QWORD PTR [rsi+24] or rax, rdx jne .L2 xor eax, eax xor eax, 1 ret Interestingly, the compiler decides to split the comparisons into two so that it first compares the lower half of the object ID for equality and then the upper half. If the first check shows a difference, then we wouldn't even end up comparing the second half. In both cases, the new generated code is significantly shorter and has far fewer branches. While I didn't benchmark the change, I'd be surprised if the new code was slower. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:32 -07:00
Patrick Steinhardt	c98d762ed9	global: ensure that object IDs are always padded The `oidcmp()` and `oideq()` functions only compare the prefix length as specified by the given hash algorithm. This mandates that the object IDs have a valid hash algorithm set, or otherwise we wouldn't be able to figure out that prefix. As we do not have a hash algorithm in many cases, for example when handling null object IDs, this assumption cannot always be fulfilled. We thus have a fallback in place that instead uses `the_repository` to derive the hash function. This implicit dependency is hidden away from callers and can be quite surprising, especially in contexts where there may be no repository. In theory, we can adapt those functions to always memcmp(3P) the whole length of their hash arrays. But there exist a couple of sites where we populate `struct object_id`s such that only the prefix of its hash that is actually used by the hash algorithm is populated. The remaining bytes are left uninitialized. The fact that those bytes are uninitialized also leads to warnings under Valgrind in some places where we copy those bytes. Refactor callsites where we populate object IDs to always initialize all bytes. This also allows us to get rid of `oidcpy_with_padding()`, for one because the input is now fully initialized, and because `oidcpy()` will now always copy the whole hash array. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:32 -07:00
Patrick Steinhardt	9da95bda74	hash: require hash algorithm in `oidread()` and `oidclr()` Both `oidread()` and `oidclr()` use `the_repository` to derive the hash function that shall be used. Require callers to pass in the hash algorithm to get rid of this implicit dependency. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:32 -07:00
Patrick Steinhardt	f4836570a7	hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()` Many of our hash functions have two variants, one receiving a `struct git_hash_algo` and one that derives it via `the_repository`. Adapt all of those functions to always require the hash algorithm as input and drop the variants that do not accept one. As those functions are now independent of `the_repository`, we can move them from "hash.h" to "hash-ll.h". Note that both in this and subsequent commits in this series we always just pass `the_repository->hash_algo` as input even if it is obvious that there is a repository in the context that we should be using the hash from instead. This is done to be on the safe side and not introduce any regressions. All callsites should eventually be amended to use a repo passed via parameters, but this is outside the scope of this patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:32 -07:00
Patrick Steinhardt	129cb1b99d	hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions The functions `is_empty_{blob,tree}_sha1()` are mostly unused, except for a single callsite in "read-cache.c". Most callsites have long since been converted to use the equivalents that accept a `struct object_id` instead of a string. Adapt the remaining callsite and drop those functions. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:32 -07:00
Junio C Hamano	10aa7c74a2	Merge branch 'gt/unit-test-oidtree' into ps/use-the-repository * gt/unit-test-oidtree: t/: migrate helper/test-oidtree.c to unit-tests/t-oidtree.c	2024-06-13 09:39:46 -07:00
Junio C Hamano	092b33da2b	Merge branch 'ps/ref-storage-migration' into ps/use-the-repository * ps/ref-storage-migration: builtin/refs: new command to migrate ref storage formats refs: implement logic to migrate between ref storage formats refs: implement removal of ref storages worktree: don't store main worktree twice reftable: inline `merged_table_release()` refs/files: fix NULL pointer deref when releasing ref store refs/files: extract function to iterate through root refs refs/files: refactor `add_pseudoref_and_head_entries()` refs: allow to skip creation of reflog entries refs: pass storage format to `ref_store_init()` explicitly refs: convert ref storage format to an enum setup: unset ref storage when reinitializing repository version	2024-06-13 09:39:08 -07:00
Junio C Hamano	d63586cb31	The thirteenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-12 13:37:18 -07:00
Junio C Hamano	2a061a62e2	Merge branch 'gt/decorate-unit-test' A test helper that essentially is unit tests on the "decorate" logic has been rewritten using the unit-tests framework. * gt/decorate-unit-test: t/: migrate helper/test-example-decorate to the unit testing framework	2024-06-12 13:37:18 -07:00
Junio C Hamano	51ea70c18a	Merge branch 'jk/sparse-leakfix' Many memory leaks in the sparse-checkout code paths have been plugged. * jk/sparse-leakfix: sparse-checkout: free duplicate hashmap entries sparse-checkout: free string list after displaying sparse-checkout: free pattern list in sparse_checkout_list() sparse-checkout: free sparse_filename after use sparse-checkout: refactor temporary sparse_checkout_patterns sparse-checkout: always free "line" strbuf after reading input sparse-checkout: reuse --stdin buffer when reading patterns dir.c: always copy input to add_pattern() dir.c: free removed sparse-pattern hashmap entries sparse-checkout: clear patterns when init() sees existing sparse file dir.c: free strings in sparse cone pattern hashmaps sparse-checkout: pass string literals directly to add_pattern() sparse-checkout: free string list in write_cone_to_file()	2024-06-12 13:37:17 -07:00
Junio C Hamano	c2f79440ac	Merge branch 'jk/cap-exclude-file-size' Overly large ".gitignore" files are now rejected silently. * jk/cap-exclude-file-size: dir.c: reduce max pattern file size to 100MB dir.c: skip .gitignore, etc larger than INT_MAX	2024-06-12 13:37:17 -07:00
Junio C Hamano	b8bdb2f283	Merge branch 'jc/safe-directory-leading-path' The safe.directory configuration knob has been updated to optionally allow leading path matches. * jc/safe-directory-leading-path: safe.directory: allow "lead/ing/path/*" match	2024-06-12 13:37:16 -07:00
Junio C Hamano	22cf18fd9e	Merge branch 'gt/t-hash-unit-test' A pair of test helpers that essentially are unit tests on hash algorithms have been rewritten using the unit-tests framework. * gt/t-hash-unit-test: t/: migrate helper/test-{sha1, sha256} to unit-tests/t-hash strbuf: introduce strbuf_addstrings() to repeatedly add a string	2024-06-12 13:37:15 -07:00
Junio C Hamano	56346ba24e	Merge branch 'cp/reftable-unit-test' Basic unit tests for reftable have been reimplemented under the unit test framework. * cp/reftable-unit-test: t: improve the test-case for parse_names() t: add test for put_be16() t: move tests from reftable/record_test.c to the new unit test t: move tests from reftable/stack_test.c to the new unit test t: move reftable/basics_test.c to the unit testing framework	2024-06-12 13:37:14 -07:00
Junio C Hamano	a39e28ace7	Merge branch 'jc/t1517-more' A new test was added to ensure git commands that are designed to run outside repositories do work. * jc/t1517-more: imap-send: minimum leakfix t1517: more coverage for commands that work without repository	2024-06-12 13:37:14 -07:00
Ghanshyam Thakkar	ed54840872	t/: migrate helper/test-oidtree.c to unit-tests/t-oidtree.c helper/test-oidtree.c along with t0069-oidtree.sh test the oidtree.h library, which is a wrapper around crit-bit tree. Migrate them to the unit testing framework for better debugging and runtime performance. Along with the migration, add an extra check for oidtree_each() test, which showcases how multiple expected matches can be given to check_each() helper. To achieve this, introduce a new library called 'lib-oid.h' exclusively for the unit tests to use. It currently mainly includes utility to generate object_id from an arbitrary hex string (i.e. '12a' -> '12a0000000000000000000000000000000000000'). This also handles the hash algo selection based on GIT_TEST_DEFAULT_HASH. This library will also be helpful when we port other unit tests such as oid-array, oidset etc. Helped-by: Junio C Hamano <gitster@pobox.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> [jc: small fixlets squashed in] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-12 13:33:20 -07:00
Junio C Hamano	8d94cfb545	The twelfth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-10 10:30:39 -07:00
Junio C Hamano	5235e56ea5	Merge branch 'jk/leakfixes' Memory leaks in "git mv" have been plugged. * jk/leakfixes: mv: replace src_dir with a strvec mv: factor out empty src_dir removal mv: move src_dir cleanup to end of cmd_mv() t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL t-strvec: use va_end() to match va_start()	2024-06-10 10:30:39 -07:00
Junio C Hamano	718b50e3bf	Merge branch 'iw/trace-argv-on-alias' The alias-expanded command lines are logged to the trace output. * iw/trace-argv-on-alias: run-command: show prepared command Documentation: alias: add notes on shell expansion Documentation: alias: rework notes into points	2024-06-10 10:30:38 -07:00
Junio C Hamano	1b76f06508	Merge branch 'tb/midx-write-cleanup' Code clean-up around writing the .midx files. * tb/midx-write-cleanup: pack-bitmap.c: reimplement `midx_bitmap_filename()` with helper midx: replace `get_midx_rev_filename()` with a generic helper midx-write.c: support reading an existing MIDX with `packs_to_include` midx-write.c: extract `fill_packs_from_midx()` midx-write.c: extract `should_include_pack()` midx-write.c: pass `start_pack` to `compute_sorted_entries()` midx-write.c: reduce argument count for `get_sorted_entries()` midx-write.c: tolerate `--preferred-pack` without bitmaps	2024-06-07 10:57:23 -07:00
Junio C Hamano	cd77e87115	The eleventh batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 12:49:25 -07:00
Junio C Hamano	9d8e7d2ef7	Merge branch 'mt/openindiana-scalar' Avoid removing the $(cwd) for portability. * mt/openindiana-scalar: scalar: make enlistment delete to work on all POSIX platforms	2024-06-06 12:49:25 -07:00
Junio C Hamano	df5c2c4962	Merge branch 'rs/difftool-env-simplify' Code simplification. * rs/difftool-env-simplify: difftool: add env vars directly in run_file_diff()	2024-06-06 12:49:24 -07:00
Junio C Hamano	d11b0c75ec	Merge branch 'th/quiet-lazy-fetch-from-promisor' The promisor.quiet configuration knob can be set to true to make lazy fetching from promisor remotes silent. * th/quiet-lazy-fetch-from-promisor: promisor-remote: add promisor.quiet configuration option	2024-06-06 12:49:24 -07:00
Junio C Hamano	cf792653ad	Merge branch 'ps/leakfixes' Leakfixes. * ps/leakfixes: builtin/mv: fix leaks for submodule gitfile paths builtin/mv: refactor to use `struct strvec` builtin/mv duplicate string list memory builtin/mv: refactor `add_slash()` to always return allocated strings strvec: add functions to replace and remove strings submodule: fix leaking memory for submodule entries commit-reach: fix memory leak in `ahead_behind()` builtin/credential: clear credential before exit config: plug various memory leaks config: clarify memory ownership in `git_config_string()` builtin/log: stop using globals for format config builtin/log: stop using globals for log config convert: refactor code to clarify ownership of check_roundtrip_encoding diff: refactor code to clarify memory ownership of prefixes config: clarify memory ownership in `git_config_pathname()` http: refactor code to clarify memory ownership checkout: clarify memory ownership in `unique_tracking_name()` strbuf: fix leak when `appendwholeline()` fails with EOF transport-helper: fix leaking helper name	2024-06-06 12:49:23 -07:00
Patrick Steinhardt	25a0023f28	builtin/refs: new command to migrate ref storage formats Introduce a new command that allows the user to migrate a repository between ref storage formats. This new command is implemented as part of a new git-refs(1) executable. This is due to two reasons: - There is no good place to put the migration logic in existing commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is not the correct place to put it, either. - I had it in my mind to create a new low-level command for accessing refs for quite a while already. git-refs(1) is that command and can over time grow more functionality relating to refs. This should help discoverability by consolidating low-level access to refs into a single executable. As mentioned in the preceding commit that introduces the ref storage format migration logic, the new `git refs migrate` command still has a bunch of restrictions. These restrictions are documented accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:34 -07:00
Patrick Steinhardt	6d6a3a99c7	refs: implement logic to migrate between ref storage formats With the introduction of the new "reftable" backend, users may want to migrate repositories between the backends without having to recreate the whole repository. Add the logic to do so. The implementation is generic and works with arbitrary ref storage formats so that a backend does not need to implement any migration logic. It does have a few limitations though: - We do not migrate repositories with worktrees, because worktrees have separate ref storages. It makes the overall affair more complex if we have to migrate multiple storages at once. - We do not migrate reflogs, because we have no interfaces to write many reflog entries. - We do not lock the repository for concurrent access, and thus concurrent writes may end up with weird in-between states. There is no way to fully lock the "files" backend for writes due to its format, and thus we punt on this topic altogether and defer to the user to avoid those from happening. In other words, this version is a minimum viable product for migrating a repository's ref storage format. It works alright for bare repos, which often have neither worktrees nor reflogs. But it will not work for many other repositories without some preparations. These limitations are not set in stone though, and ideally we will eventually address them over time. The logic is not yet used by anything, and thus there are no tests for it. Those will be added in the next commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:33 -07:00
Patrick Steinhardt	64a6dd8ffc	refs: implement removal of ref storages We're about to introduce logic to migrate ref storages. One part of the migration will be to delete the files that are part of the old ref storage format. We don't yet have a way to delete such data generically across ref backends though. Implement a new `delete` callback and expose it via a new `ref_storage_delete()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:33 -07:00
Patrick Steinhardt	1339cb3c47	worktree: don't store main worktree twice In `get_worktree_ref_store()` we either return the repository's main ref store, or we look up the ref store via the map of worktree ref stores. Which of these worktrees gets picked depends on the `is_current` bit of the worktree, which indicates whether the worktree is the one that corresponds to `the_repository`. The bit is getting set in `get_worktrees()`, but only after we have computed the list of all worktrees. This is too late though, because at that time we have already called `get_worktree_ref_store()` on each of the worktrees via `add_head_info()`. The consequence is that the current worktree will not have been marked accordingly, which means that we did not use the main ref store, but instead created a new ref store. We thus have two separate ref stores now that map to the same ref database. Fix this by setting `is_current` before we call `add_head_info()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:33 -07:00
Patrick Steinhardt	b5d7db9e83	reftable: inline `merged_table_release()` The function `merged_table_release()` releases a merged table, whereas `reftable_merged_table_free()` releases a merged table and then also frees its pointer. But all callsites of `merged_table_release()` are in fact followed by `reftable_merged_table_free()`, which is redundant. Inline `merged_table_release()` into `reftable_merged_table_free()` to get rid of this redundancy. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:32 -07:00
Patrick Steinhardt	b3e098d6e7	refs/files: fix NULL pointer deref when releasing ref store The `free_ref_cache()` function is not `NULL` safe and will thus segfault when being passed such a pointer. This can easily happen when trying to release a partially initialized "files" ref store. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:32 -07:00
Patrick Steinhardt	120b67172f	refs/files: extract function to iterate through root refs Extract a new function that can be used to iterate through all root refs known to the "files" backend. This will be used in the next commit, where we start to teach ref backends to remove themselves. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:32 -07:00
Patrick Steinhardt	66275a6311	refs/files: refactor `add_pseudoref_and_head_entries()` The `add_pseudoref_and_head_entries()` function accepts both the ref store as well as a directory name as input. This is unnecessary though as the ref store already uniquely identifies the root directory of the ref store anyway. Furthermore, the function is misnamed now that we have clarified the meaning of pseudorefs as it doesn't add pseudorefs, but root refs. Rename it accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:32 -07:00
Patrick Steinhardt	fbd1a693c7	refs: allow to skip creation of reflog entries The ref backends do not have any way to disable the creation of reflog entries. This will be required for upcoming ref format migration logic so that we do not create any entries that didn't exist in the original ref database. Provide a new `REF_SKIP_CREATE_REFLOG` flag that allows the caller to disable reflog entry creation. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:31 -07:00
Patrick Steinhardt	6e1683ace9	refs: pass storage format to `ref_store_init()` explicitly We're about to introduce logic to migrate refs from one storage format to another one. This will require us to initialize a ref store with a different format than the one used by the passed-in repository. Prepare for this by accepting the desired ref storage format as parameter. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:31 -07:00
Patrick Steinhardt	318efb966b	refs: convert ref storage format to an enum The ref storage format is tracked as a simple unsigned integer, which makes it harder than necessary to discover what that integer actually is or where its values are defined. Convert the ref storage format to instead be an enum. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:31 -07:00
Patrick Steinhardt	a83f7f51e1	setup: unset ref storage when reinitializing repository version When reinitializing a repository's version we may end up unsetting the hash algorithm when it matches the default hash algorithm. If we didn't do that then the previously configured value might remain intact. While the same issue exists for the ref storage extension, we don't do this here. This has been fine for the most part because it is not supported to re-initialize a repository with a different ref storage format anyway. We're about to introduce a new command to migrate ref storages though, so this is about to become an issue there. Prepare for this and unset the ref storage format when reinitializing a repository with the "files" format. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:31 -07:00
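
The padding-related commits above (c98d762ed9, d4d364b2c7 and 861e8c76f6) describe comparing the whole hash array once every object ID is zero-padded. The following is a minimal sketch of that idea, not Git's actual code: `struct demo_oid`, `DEMO_MAX_RAWSZ` and all `demo_*` functions are hypothetical names invented for illustration, and the real `struct object_id` carries more than the raw hash bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the array is sized for the largest digest (32 bytes for
 * SHA-256); a SHA-1 digest only occupies the first 20 bytes. */
#define DEMO_MAX_RAWSZ 32

/* Hypothetical stand-in for an object ID; not Git's `struct object_id`. */
struct demo_oid {
	unsigned char hash[DEMO_MAX_RAWSZ];
};

/* Copy a digest of `len` bytes and zero-pad the remainder, so that no
 * byte of the array is ever left uninitialized. */
static void demo_oid_set(struct demo_oid *oid, const unsigned char *raw,
			 size_t len)
{
	assert(len <= DEMO_MAX_RAWSZ);
	memcpy(oid->hash, raw, len);
	memset(oid->hash + len, 0, DEMO_MAX_RAWSZ - len);
}

/* With padding guaranteed, ordering needs no hash-algorithm lookup:
 * the size passed to memcmp() is a compile-time constant. */
static int demo_oidcmp(const struct demo_oid *a, const struct demo_oid *b)
{
	return memcmp(a->hash, b->hash, DEMO_MAX_RAWSZ);
}

/* Equality is the same whole-array comparison. */
static int demo_oideq(const struct demo_oid *a, const struct demo_oid *b)
{
	return !memcmp(a->hash, b->hash, DEMO_MAX_RAWSZ);
}

/* The null check compares against an all-zero array instead of asking
 * a repository for its null object ID. */
static int demo_is_null_oid(const struct demo_oid *oid)
{
	static const unsigned char zeros[DEMO_MAX_RAWSZ];
	return !memcmp(oid->hash, zeros, DEMO_MAX_RAWSZ);
}
```

Two IDs populated from the same 20-byte digest via `demo_oid_set()` compare equal regardless of what the destination structs held before, because the trailing 12 bytes are always overwritten with zeros. Without that padding step, a whole-array `memcmp()` would read uninitialized bytes, which is the situation the commit messages describe.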


73650 Commits (8e9a1d0dc2d543c05cb0c11a598fb7675d5deea8) All Branches Search
