Since we may ask for a pack_id that is in an earlier MIDX layer relative
to the one corresponding to our bitmap, use nth_midxed_pack() instead of
accessing the ->packs array directly.
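For illustration, the lookup becomes something like this (a sketch):

    /*
     * A pack_id may refer to a pack in an earlier (base) MIDX layer,
     * which this layer's ->packs array does not cover. nth_midxed_pack()
     * knows how to walk down the layer chain to the pack's owner.
     */
    struct packed_git *pack = nth_midxed_pack(bitmap_git->midx, pack_id);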
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pack-bitmap machinery uses `bitmap_for_commit()` to locate the
EWAH-compressed bitmap corresponding to some given commit object.
Teach this function about incremental MIDX bitmaps by teaching it to
recur on earlier bitmap layers when it fails to find a given commit in
the current layer.
The changes to do so are as follows:
- Avoid initializing hash_pos at its declaration, since
bitmap_for_commit() is now a recursive function and may receive a
NULL bitmap_index pointer as its first argument.
- In cases where we would previously return NULL (to indicate that a
lookup failed and the given bitmap_index does not contain an entry
corresponding to the given commit), recursively call the function on
the previous bitmap layer (see the sketch below).
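A minimal sketch of the resulting shape (abridged; not the verbatim
pack-bitmap.c code):

    struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
                                          struct commit *commit)
    {
            khiter_t hash_pos;

            if (!bitmap_git)
                    return NULL; /* no more layers to search */

            hash_pos = kh_get_oid_map(bitmap_git->bitmaps,
                                      commit->object.oid);
            if (hash_pos >= kh_end(bitmap_git->bitmaps))
                    /* not in this layer; recur on the previous one */
                    return bitmap_for_commit(bitmap_git->base, commit);

            return lookup_stored_bitmap(kh_value(bitmap_git->bitmaps,
                                                 hash_pos));
    }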
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare the pack-bitmap machinery to work with incremental MIDXs by
adding a new "base" field to keep track of the bitmap index associated
with the previous MIDX layer.
The changes in this commit are mostly boilerplate to open the correct
bitmap(s), add them to the chain of bitmap layers along the "base"
pointer, ensure that the correct packs and their reverse indexes are
loaded across MIDX layers, etc.
While we're at it, keep track of a base_nr field to indicate how many
bitmap layers (including the current bitmap) exist. This will be used in
a future commit to allocate an array of 'struct ewah_bitmap' pointers to
collect all of the respective type bitmaps among all layers to
initialize a multi-EWAH iterator.
Subsequent commits will teach the functions within the pack-bitmap
machinery how to interact with these new fields.
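Concretely, the new fields look something like the following (other
fields elided):

    struct bitmap_index {
            /* ... */

            /*
             * The bitmap index for the previous MIDX layer, or NULL if
             * this is the earliest layer (or a single-pack bitmap).
             */
            struct bitmap_index *base;

            /* Number of bitmap layers, including this one. */
            uint32_t base_nr;
    };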
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare the reverse index machinery to handle object lookups in an
incremental MIDX bitmap. These changes are broken out across a few
functions:
- load_midx_revindex() learns to use the appropriate MIDX filename
depending on whether the given 'struct multi_pack_index *' is
incremental or not.
- pack_pos_to_midx() and midx_to_pack_pos() now both take in a global
object position in the MIDX pseudo-pack order, and find the
earliest containing MIDX (similar to midx.c::midx_for_object()).
- midx_pack_order_cmp() adjusts its call to pack_pos_to_midx() by the
number of objects in the base (since 'vb - midx->revindex_data' is
relative to the containing MIDX, and pack_pos_to_midx() expects a
global position).
Likewise, this function adjusts its output by adding
m->num_objects_in_base to return a global position out through the
`*pos` pointer.
Together, these changes are sufficient to use the multi-pack index's
reverse index format for incremental multi-pack reachability bitmaps.
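At its core, the layer lookup mirrors midx.c::midx_for_object() and can
be sketched as:

    /* find the earliest layer whose range contains global position 'pos' */
    while (m && pos < m->num_objects_in_base)
            m = m->base_midx;
    /*
     * 'pos - m->num_objects_in_base' is then the position local to that
     * layer, and m->num_objects_in_base is what gets added back when
     * returning a global position through '*pos'.
     */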
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare to implement support for reachability bitmaps for the new
incremental multi-pack index (MIDX) feature over the following commits.
This commit begins by describing the relevant format and usage
details for incremental MIDX bitmaps.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the items listed as "future work" in the MIDX's technical
documentation is to extend the format to allow MIDXs to be written
incrementally across multiple layers.
This was suggested all the way back in ceab693d1f (multi-pack-index: add
design document, 2018-07-12), and implemented in b9497848df (Merge
branch 'tb/incremental-midx-part-1', 2024-08-19). Let's remove it
accordingly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In our CI systems we can observe that t0610 fails rather frequently.
This testcase races a bunch of git-update-ref(1) processes with one
another which are all trying to update a unique reference, where we
expect that all processes succeed and end up updating the reftable
stack. The error message in this case looks like the following:
fatal: update_ref failed for ref 'refs/heads/branch-88': reftable: transaction prepare: I/O error
Instrumenting the code with a couple of calls to `BUG()` in relevant
sites where we return `REFTABLE_IO_ERROR` quickly leads one to discover
that this error is caused when calling `flock_acquire()`, which is a
thin wrapper around our lockfile API. Curiously, the error code we get
in such cases is `EACCES`, indicating that we are not allowed to access
the file.
The root cause of this is an oddity of `CreateFileW()`, which is what
`_wopen()` uses internally. Quoting its documentation [1]:
If you call CreateFile on a file that is pending deletion as a
result of a previous call to DeleteFile, the function fails. The
operating system delays file deletion until all handles to the file
are closed. GetLastError returns ERROR_ACCESS_DENIED.
This behaviour is triggered quite often in the above testcase because
all the processes race with one another trying to acquire the lock for
the "tables.list" file. This is due to how locking works in the reftable
library when compacting a stack:
1. Lock the "tables.list" file and read its contents.
2. Decide which tables to compact.
3. Lock each of the individual tables that we are about to compact.
4. Unlock the "tables.list" file.
5. Compact the individual tables into one large table.
6. Re-lock the "tables.list" file.
7. Write the new list of tables into it.
8. Commit the "tables.list" file.
The important step is (4): we don't commit the file directly by renaming
it into place, but instead we delete the lockfile so that concurrent
processes can continue to append to the reftable stack while we compact
the tables. And because we use `DeleteFileW()` to do so, we may now race
with another process that wants to acquire that lockfile. So if we are
unlucky, we would now see `ERROR_ACCESS_DENIED` instead of the expected
`ERROR_FILE_EXISTS`, which the lockfile subsystem isn't prepared to
handle and thus it will bail out without retrying to acquire the lock.
In theory, the issue is not limited to the reftable library and can be
triggered by any other user of the lockfile subsystem, as well. My gut
feeling tells me it's rather unlikely to surface elsewhere though.
Fix the issue by translating the error to `EEXIST`. This makes the
lockfile subsystem handle the error correctly: in case a timeout is set
it will now retry acquiring the lockfile until the timeout has expired.
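A minimal sketch of the idea (simplified; the oflags-based condition is
an assumption about the shape of the real check):

    handle = CreateFileW(wfilename, access, sharemode, NULL, create, flags, NULL);
    if (handle == INVALID_HANDLE_VALUE) {
            DWORD err = GetLastError();
            /*
             * Opening O_CREAT|O_EXCL a path whose previous incarnation
             * is pending deletion fails with ERROR_ACCESS_DENIED; report
             * EEXIST so that the lockfile subsystem retries instead of
             * bailing out.
             */
            if (err == ERROR_ACCESS_DENIED &&
                (oflags & O_CREAT) && (oflags & O_EXCL))
                    errno = EEXIST;
            else
                    errno = err_win_to_posix(err);
            return -1;
    }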
With this, t0610 is now always passing on my machine whereas it was
previously failing in around 20-30% of all test runs.
[1]: https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In our compat library we have both "msvc.c" and "mingw.c". The former is
mostly a thin wrapper around the latter as it directly includes it, but
it has a couple of extra headers that aren't included in "mingw.c" and
is expected to be used with the Visual Studio compiler toolchain.
While our Makefile knows to pick up the correct file depending on
whether or not the Visual Studio toolchain is used, we don't do the same
with Meson. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As part of the reference transaction commit phase, the transaction is
set to a closed state regardless of whether it was successful or not.
Attempting to abort a closed transaction via `ref_transaction_abort()`
results in a `BUG()`.
In c92abe71df (builtin/fetch: fix leaking transaction with `--atomic`,
2024-08-22), logic to free a transaction after the commit phase was moved
to the centralized exit path. In cases where the transaction commit
failed, this results in a closed transaction being aborted and signaling
a bug.
Free the transaction and set it to NULL when the commit fails. This
allows the exit path to correctly handle the error without attempting to
abort the transaction.
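In code, the fix amounts to something like the following (simplified;
`retcode` and the `cleanup` label stand in for the function's actual
exit path):

    if (ref_transaction_commit(transaction, &err)) {
            /*
             * A failed commit leaves the transaction closed; free it
             * here so the exit path does not try to abort it.
             */
            ref_transaction_free(transaction);
            transaction = NULL;
            retcode = 1;
            goto cleanup;
    }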
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit changed the behavior of repack's '--max-cruft-size'
to specify a cruft pack-specific override for '--max-pack-size'.
Introduce a new flag, '--combine-cruft-below-size' which is a
replacement for the old behavior of '--max-cruft-size'. This new flag
does explicitly what it says: it combines together cruft packs which are
smaller than a given threshold, and leaves alone ones which are
larger.
This accomplishes the original intent of '--max-cruft-size', which was
to avoid repacking cruft packs larger than the given threshold.
The new behavior is slightly different. Instead of building up small
packs together until the threshold is met, '--combine-cruft-below-size'
packs up *all* cruft packs smaller than the threshold. This means that
we may make a pack much larger than the given threshold (e.g., if you
aggregate 5 packs which are each 99 MiB in size with a threshold of 100
MiB).
But that's OK: the point isn't to restrict the size of the cruft packs
we generate, it's to avoid working with ones that have already grown too
large. If repositories still want to limit the size of the generated
cruft pack(s), they may use '--max-cruft-size'.
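For example, the following invocation combines all cruft packs smaller
than 100 MiB into one, leaving larger cruft packs alone:

    $ git repack --cruft --combine-cruft-below-size=100M -d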
There's some minor test fallout as a result of the slight differences in
behavior between the old meaning of '--max-cruft-size' and the behavior
of '--combine-cruft-below-size'. In the test which is now called
"--combine-cruft-below-size combines packs", we need to use the new flag
over the old one to exercise that test's intended behavior. The
remainder of the changes there are to improve the clarity of the
comments.
Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 37dc6d8104 (builtin/repack.c: implement support for
`--max-cruft-size`, 2023-10-02), we exposed new functionality that
allowed repositories to control when we should combine multiple cruft
packs together.
This feature was designed to ensure that we never repacked cruft packs
which were larger than the given threshold in order to provide tighter
I/O bounds for repositories that have many unreachable objects. In
essence, specifying '--max-cruft-size=N' instructed 'repack' to
aggregate cruft packs together (in order of ascending size) until the
combined size grows past 'N', and then make a new cruft pack whose
contents include the packs we rolled up.
But this isn't quite how it works in practice. Suppose for example that
we have two cruft packs which are each 100MiB in size. One might expect
specifying "--max-cruft-size=200M" would combine these two packs
together, and then avoid repacking them until a pruning GC takes place.
In reality, 'repack' would try to aggregate these together, but end up
writing a pack that is strictly smaller than 200 MiB (since pack-objects'
"--max-pack-size" provides a strict bound for packs containing more than
one object).
So instead we'll write out a pack that is, say, 199 MiB in size, and
then another 1 MiB pack containing the balance. If we later repack the
repository without adding any new unreachable objects, we'll repeat the
same exercise again, making the same 199 MiB and 1 MiB packs each time.
This happens because of a poor choice to bolt the '--max-cruft-size'
functionality onto pack-objects' '--max-pack-size', forcing us to
generate packs which are always smaller than the provided threshold and
thus subject to repacking.
The following commit will introduce a new flag that implements something
similar to the behavior above. Let's prepare for that by making repack's
'--max-cruft-size' flag behave as a cruft pack-specific override for
'--max-pack-size'.
Do so by temporarily repurposing the 'collapse_small_cruft_packs()'
function to instead generate a cruft pack using the same instructions as
if we didn't specify any maximum pack size. The calling code looks
something like:
    if (args->max_pack_size && !cruft_expiration) {
            collapse_small_cruft_packs(in, args->max_pack_size, existing);
    } else {
            for_each_string_list_item(item, &existing->non_kept_packs)
                    fprintf(in, "-%s.pack\n", item->string);
            for_each_string_list_item(item, &existing->cruft_packs)
                    fprintf(in, "-%s.pack\n", item->string);
    }
This patch makes collapse_small_cruft_packs() behave identically to the
'else' arm of the conditional above. This repurposing of
'collapse_small_cruft_packs()' is intentional, since it will set us up
nicely to introduce the new behavior in the following commit.
Naturally, there is some test fallout in the test which exercises the
old meaning of '--max-cruft-size'. Mark that test as failing for now to
be dealt with in the following commit. Likewise, add a new test which
explicitly tests the behavior of '--max-cruft-size' to place a hard
limit on the size of any generated cruft pack(s).
Note that this is a breaking change, as it alters the user-visible
behavior of '--max-cruft-size'. But I'm OK changing this behavior in
this instance, since the behavior wasn't accurate to begin with.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous commit moved a handful of tests from a different script into
t7704, including one that relies on generating random blobs.
Incidentally, the original home of this test defined its own helper
"write_blob" for doing so, which is identical in function to our
"generate_random_blob" (and is slightly inferior to the latter, which
cleans up after itself).
Rewrite the test that uses "write_blob" to no longer do so and then
remove the function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that a number of new tests have landed in t7704, make sure that they
all make sense and are testing the things they say they are.
Things are mostly OK, but a handful of tests needed tweaks. Those tweaks
are as follows:
- Use the terms "too large" or "too small" in tests that exercise the
'--max-cruft-size' behavior. This has historically been treated as a
threshold beneath which to combine cruft packs, but that will change
in a subsequent commit. Prepare for that by using a more generic
term.
- Remove references to "--max-cruft-size" in the freshening tests.
These tests provide coverage of our ability to record updated mtimes
for objects already in cruft packs whose mtimes are upserted from
various sources (loose objects, finding that object in a new pack,
another cruft pack, etc.).
These have nothing to do with the '--max-cruft-size' feature, and in
fact none of the tests even *use* '--max-cruft-size'. Name them
appropriately to make it clear that these tests exercise freshening
behavior, not '--max-cruft-size' behavior.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cruft pack feature has two primary test scripts which exercise
various parts of it, which are:
- t5329-pack-objects-cruft.sh
- t7704-repack-cruft.sh
The former is designed to test low-level pack generation mechanics at
the 'git pack-objects --cruft'-level, which is plumbing. The latter, on
the other hand, is designed to test the user-facing behavior through
'git repack --cruft', which is porcelain (under the "ancillary
manipulators" sub-section).
At some point a handful of tests which should have been added to the
latter script were instead written to the former. This isn't a huge
deal, but rectifying it is straightforward. Move a handful of
'repack'-related tests out of t5329 and into their rightful home in
t7704.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--missing={print,print-info}` option for git-rev-list(1) prints
missing objects found while performing the object walk in the form:
$ git rev-list --missing=print-info <rev>
?<oid> [SP <token>=<value>]... LF
Add support for printing missing objects in a NUL-delimited format when
the `-z` option is enabled.
$ git rev-list -z --missing=print-info <rev>
<oid> NUL missing=yes NUL [<token>=<value> NUL]...
In this mode, values containing special characters or spaces are printed
as-is without being escaped or quoted. Instead of prefixing the missing
OID with '?', a separate `missing=yes` token/value pair is appended.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--boundary` option for git-rev-list(1) prints boundary objects
found while performing the object walk in the form:
$ git rev-list --boundary <rev>
-<oid> LF
Add support for printing boundary objects in a NUL-delimited format when
the `-z` option is enabled.
$ git rev-list -z --boundary <rev>
<oid> NUL boundary=yes NUL
In this mode, instead of prefixing the boundary OID with '-', a separate
`boundary=yes` token/value pair is appended.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When walking objects, git-rev-list(1) prints each object entry on a
separate line. Some options, such as `--objects`, may print additional
information about tree and blob objects on the same line in the form:
$ git rev-list --objects <rev>
<tree/blob oid> SP [<path>] LF
Note that in this form the SP is appended regardless of whether the tree
or blob object has path information available. Paths containing a
newline are also truncated at the newline.
Introduce the `-z` option for git-rev-list(1) which reformats the output
to use NUL delimiters between objects and associated info in the
following form:
$ git rev-list -z --objects <rev>
<oid> NUL [path=<path> NUL]
In this form, the start of each record is signaled by an OID entry that
is all hexadecimal and does not contain any '='. Additional path info
from `--objects` is appended to the record as a token/value pair
`path=<path>` as-is without any truncation.
For now, the `--objects` flag is the only option that can be used in
combination with `-z`. In a subsequent commit, NUL-delimited support for
other options is added. Other options that do not make sense when used
in combination with `-z` are rejected.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before invoking `setup_revisions()`, the `--missing` and
`--exclude-promisor-objects` options are parsed early. In a subsequent
commit, another option is added that must be parsed early.
Refactor the code to parse both options in a single early pass.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `show_object_with_name()` function only has a single call site.
Inline the call to `show_object_with_name()` in `show_object()` so the
explicit function can be cleaned up and live closer to where it is used.
While at it, factor out the code that prints the OID and newline for
both objects with and without a name. In a subsequent commit,
`show_object()` is modified to support printing object information in a
NUL-delimited format.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the compiler/linker cannot verify that an assert() invocation is
free of side effects for us (e.g. because the assertion includes some
kind of function call), replace the use of assert() with ASSERT().
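For example, at one of the affected call sites:

    /* before: the call is compiled out entirely under NDEBUG */
    assert(have_fsmonitor_support());

    /* after: the expression is always evaluated */
    ASSERT(have_fsmonitor_support());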
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is a big no-no to have side-effects in an assertion, because if the
assert() is compiled out, you don't get that side-effect, leading to the
code behaving differently. That can be a large headache to debug.
We have roughly 566 assert() calls in our codebase (my grep might have
picked up things that aren't actually assert() calls, but most appeared
to be). All but 9 of them can be determined by gcc to be free of side
effects with a clever redefine of assert() provided by Bruno De Fraine
(from
https://stackoverflow.com/questions/10593492/catching-assert-with-side-effects),
who upon request has graciously placed his two-liner into the public
domain without warranty of any kind. The current 9 assert() calls
flagged by this clever redefinition of assert() appear to me to be free
of side effects as well, but are too complicated for a compiler/linker
to figure that out, since each assertion involves some kind of function
call.
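The redefinition works along these lines (a reconstructed sketch; the
exact two-liner may differ):

    extern int not_supposed_to_survive;
    #define assert(expr) ((void)(not_supposed_to_survive || (expr)))

    /*
     * If 'expr' is free of side effects, the optimizer may discard the
     * entire unused expression, including the load of the undefined
     * symbol. If 'expr' has side effects, the short-circuiting load of
     * 'not_supposed_to_survive' must remain, and linking fails with an
     * undefined reference.
     */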
Add a CI job which will find and report these possibly problematic
assertions, and have the job suggest to the user that they replace these
with ASSERT() calls.
Example output from running:
```
ERROR: The compiler could not verify the following assert()
calls are free of side-effects. Please replace with
ASSERT() calls.
/home/newren/floss/git/diffcore-rename.c:1409
assert(!dir_rename_count || strmap_empty(dir_rename_count));
/home/newren/floss/git/merge-ort.c:1645
assert(renames->deferred[side].trivial_merges_okay &&
!strset_contains(&renames->deferred[side].target_dirs,
path));
/home/newren/floss/git/merge-ort.c:794
assert(omittable_hint ==
(!starts_with(type_short_descriptions[type], "CONFLICT") &&
!starts_with(type_short_descriptions[type], "ERROR")) ||
type == CONFLICT_DIR_RENAME_SUGGESTED);
/home/newren/floss/git/merge-recursive.c:1200
assert(!merge_remote_util(commit));
/home/newren/floss/git/object-file.c:2709
assert(would_convert_to_git_filter_fd(istate, path));
/home/newren/floss/git/parallel-checkout.c:280
assert(is_eligible_for_parallel_checkout(pc_item->ce, &pc_item->ca));
/home/newren/floss/git/scalar.c:244
assert(have_fsmonitor_support());
/home/newren/floss/git/scalar.c:254
assert(have_fsmonitor_support());
/home/newren/floss/git/sequencer.c:4968
assert(!(opts->signoff || opts->no_commit ||
opts->record_origin || should_edit(opts) ||
opts->committer_date_is_author_date ||
opts->ignore_date));
```
Note that if there are possibly problematic assertions, not necessarily
all of them will be shown in a single run, because the compiler errors
may include something like "ld: ... more undefined references to
`not_supposed_to_survive' follow" instead of listing each individually.
But in such cases, once you clean up a few that are shown in your first
run, subsequent runs will show (some of) the ones that remain, allowing
you to iteratively remove them all.
Helped-by: Bruno De Fraine <defraine@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create an ASSERT() macro which is similar to assert(), but will not be
compiled out when NDEBUG is defined, and is thus safe to use even if its
argument has side-effects.
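A minimal sketch of such a macro (assumed form, not necessarily the
exact definition):

    /* evaluate 'expr' exactly once, even when NDEBUG is defined */
    #define ASSERT(expr)                                    \
            do {                                            \
                    if (!(expr))                            \
                            BUG("ASSERT: %s", #expr);       \
            } while (0)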
We will use this new macro in a subsequent commit to convert a few
existing assert() invocations to ASSERT(). In particular, we'll
convert the handful of invocations which cannot be proven to be free of
side effects with a simple compiler/linker hack.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, write_object_record() would flush the current block and retry
appending the record whenever block_writer_add() returned any nonzero
error. This forced an assumption that every failure meant the block was
full, even when errors such as memory allocation or I/O failures occurred.
Update write_object_record() to inspect the error code returned by
block_writer_add(), and only flush and reinitialize the writer when the
error is REFTABLE_ENTRY_TOO_BIG_ERROR. For any other error, immediately
propagate it.
If the flush and reinitialization still fail with
REFTABLE_ENTRY_TOO_BIG_ERROR, reset the record's offset length to zero
before a final attempt.
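The resulting control flow is roughly the following (a sketch; the
helper names and the offset field spelling are assumptions loosely based
on the reftable library):

    err = block_writer_add(w->block_writer, rec);
    if (err != REFTABLE_ENTRY_TOO_BIG_ERROR)
            return err; /* 0 on success; other errors propagate as-is */

    /* the block was full: flush it, reinitialize, and retry */
    err = writer_flush_block(w);
    if (err < 0)
            return err;

    err = block_writer_add(w->block_writer, rec);
    if (err == REFTABLE_ENTRY_TOO_BIG_ERROR) {
            /* final attempt: drop the offsets and try once more */
            rec->u.obj.offset_len = 0;
            err = block_writer_add(w->block_writer, rec);
    }
    return err;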
All call sites now handle various error codes returned by
block_writer_add().
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, writer_add_record() would flush the current block and retry
appending the record whenever block_writer_add() returned any nonzero
error. This forced an assumption that every failure meant the block was
full, even when errors such as memory allocation or I/O failures occurred.
Update writer_add_record() to inspect the error code returned by
block_writer_add() and only flush and reinitialize the writer when the
error is REFTABLE_ENTRY_TOO_BIG_ERROR. For any other error, immediately
propagate it.
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, block_writer_add() and related functions returned
-1 when the record did not fit, forcing the caller to assume that any
failure meant the entry was too big. Replace these generic -1 returns
with defined error codes.
This prepares the codebase for finer-grained error handling so that
callers can distinguish between a block-full condition and other errors.
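For instance (with `entry_fits` standing in for the real capacity
check):

    if (!entry_fits)
            return REFTABLE_ENTRY_TOO_BIG_ERROR; /* was: return -1 */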
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comment added in 7252d9a036 (pseudo-merge: implement support for
finding existing merges, 2024-05-23) misspells 'bitmap' as 'bitamp'.
Correct that so that we no longer have any stray "bitamps" lurking
throughout the tree:
$ git grep -ci bitamp | wc -l
0
Noticed-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For similar reasons as in the previous refactoring of `refspec_init()`
into `refspec_init_fetch()` and `refspec_init_push()`, apply the same
refactoring to `refspec_item_init()`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two callers of refspec_item_init_or_die(), a thin wrapper
which ensures that a dispatched call to refspec_item_init() does not
fail.
In the following commit, we're going to add fetch/push-specific variants
of refspec_item_init(), which will turn one function into two. To avoid
introducing yet another pair of new functions (such as
refspec_item_init_push_or_die() and refspec_item_init_fetch_or_die()),
let's remove the thin wrapper entirely.
This duplicates a single line of code among two callers, but thins the
refspec.h API by one function, and prevents introducing two more in the
following commit.
Note that we still have a trailing Boolean argument in the function
`refspec_item_init()`. The following commit will address this.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To avoid having a Boolean argument in the refspec_init() function,
replace it with two variants:
- `refspec_init_fetch()`
- `refspec_init_push()`
to codify the meaning of that Boolean into the function's name itself.
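The split might look like this (a sketch; the shared static helper is
presumed):

    static void refspec_init(struct refspec *rs, int fetch)
    {
            memset(rs, 0, sizeof(*rs));
            rs->fetch = fetch;
    }

    void refspec_init_fetch(struct refspec *rs)
    {
            refspec_init(rs, 1);
    }

    void refspec_init_push(struct refspec *rs)
    {
            refspec_init(rs, 0);
    }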
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 6d4c057859 (refspec: introduce struct refspec, 2018-05-16), we
have macros called REFSPEC_FETCH and REFSPEC_PUSH. This confusingly
suggests that we might introduce other modes in the future, which, while
possible, is highly unlikely.
But these values are treated as a Boolean, and stored in a struct field
called 'fetch'. So the following:
if (refspec->fetch == REFSPEC_FETCH) { ... }
, and
if (refspec->fetch) { ... }
are equivalent. Let's avoid giving special names to the Boolean values
"true" and "false" here, and remove the two REFSPEC_ macros mentioned
above.
Since this value is truly a Boolean and will only ever take on a value
of 0 or 1, we can declare it as a single-bit unsigned field. In
practice this won't shrink the size of 'struct refspec', but it more
clearly indicates the intent.
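In other words (struct abridged):

    struct refspec {
            struct refspec_item *items;
            int alloc;
            int nr;
            /* ... */

            unsigned fetch : 1; /* 1 for fetch refspecs, 0 for push */
    };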
Note that this introduces some awkwardness like:
refspec_item_init_or_die(&spec, refspec, 1);
, where it's unclear what the final "1" does. This will be addressed in
the following commits.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
curl supports a few options to control when and how often it should
instruct the OS to send TCP keepalives, like KEEPIDLE, KEEPINTVL, and
KEEPCNT. Until this point, there hasn't been a way for users to change
what values are used for these options, forcing them to rely on curl's
defaults.
But we do unconditionally enable TCP keepalives without giving users an
ability to tweak any fine-grained parameters. Ordinarily this isn't a
problem, particularly for users that have fast-enough connections,
and/or are talking to a server that has generous or nonexistent
thresholds for killing a connection it hasn't heard from in a while.
But it can present a problem when one or both of those assumptions fail.
For instance, I can reliably get an in-progress clone to be killed from
the remote end when cloning from some forges while using trickle to
limit my clone's bandwidth.
For those users and others who wish to more finely tune the OS's
keepalive behavior, expose configuration and environment variables which
allow setting curl's KEEPIDLE, KEEPINTVL, and KEEPCNT options.
Note that while KEEPIDLE and KEEPINTVL were added in curl 7.25.0,
KEEPCNT was added much more recently in curl 8.9.0. Per f7c094060c
(git-curl-compat: remove check for curl 7.25.0, 2024-10-23), both
KEEPIDLE and KEEPINTVL are set unconditionally. But since we may be
compiled with a curl that isn't as new as 8.9.0, only set KEEPCNT when
we have CURLOPT_TCP_KEEPCNT to begin with.
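A sketch of the resulting setup in `get_curl_handle()` (variable names
hypothetical; the version guard is one way to spell the KEEPCNT check):

    curl_easy_setopt(result, CURLOPT_TCP_KEEPALIVE, 1L);
    curl_easy_setopt(result, CURLOPT_TCP_KEEPIDLE, (long)tcp_keepidle);
    curl_easy_setopt(result, CURLOPT_TCP_KEEPINTVL, (long)tcp_keepintvl);
    #if LIBCURL_VERSION_NUM >= 0x080900 /* CURLOPT_TCP_KEEPCNT: curl 8.9.0 */
    curl_easy_setopt(result, CURLOPT_TCP_KEEPCNT, (long)tcp_keepcnt);
    #endif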
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the end of `get_curl_handle()` we call `set_curl_keepalive()` to
enable TCP keepalive probes on our CURL handle. `set_curl_keepalive()`
dates back to 47ce115370 (http: use curl's tcp keepalive if available,
2013-10-14), which conditionally compiled different variants of
`set_curl_keepalive()` depending on what version of curl we were
compiled with [1].
As of f7c094060c (git-curl-compat: remove check for curl 7.25.0,
2024-10-23), we no longer conditionally compile `set_curl_keepalive()`
since we no longer support pre-7.25.0 versions of curl. But the version
of that function that we kept is really just a thin wrapper around
setting the TCP_KEEPALIVE option, so there's no reason to keep it in its
own function.
Inline the definition of `set_curl_keepalive()` to within
`get_curl_handle()` so that the setup of our CURL handle is
self-contained.
[1]: The details are spelled out in 47ce115370, but the gist is curl
7.25.0 and newer use CURLOPT_TCP_KEEPALIVE, older versions use
CURLOPT_SOCKOPTFUNCTION with a custom callback, and older versions
that predate even that option do nothing.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 7059cd99fc (http_init(): Fix config file parsing, 2009-03-09), http.c
gained a new "set_from_env()" function as a convenience function around
conditionally assigning an environment variable to some variable if and
only if the environment variable was set to begin with.
But prior to 7059cd99fc, there were two spots which needed to first
strtol() whatever is set in the environment before assigning it to a
long pointer. Both instances stored the result of getenv() in a
temporary variable, and conditionally strtol() it depending on whether
or not getenv() returned NULL.
Replace those two instances with a new cousin of 'set_from_env()' called
'set_long_from_env()', which does what its name suggests. This allows us
to remove the temporary variables and clean up some minor code
duplication while also adding more robust error handling.
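Such a helper might look like the following (a sketch with simplified
error handling):

    static void set_long_from_env(long *var, const char *envname)
    {
            const char *str = getenv(envname);
            if (str) {
                    char *end;
                    long val = strtol(str, &end, 10);
                    if (end != str && !*end)
                            *var = val;
                    else
                            warning(_("failed to parse %s=%s as integer"),
                                    envname, str);
            }
    }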
More importantly, however, it prepares us for a future commit which will
introduce more instances of assigning an environment variable to a long.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing 'http.lowSpeedLimit' and 'http.lowSpeedTime', we explicitly
cast the result of 'git_config_int()' to a long before assignment. This
cast has been in place since all the way back in 58e60dd203 (Add support
for pushing to a remote repository using HTTP/DAV, 2005-11-02).
But that cast has always been unnecessary, since long is guaranteed to
be at least as wide as int. Let's drop the cast accordingly.
Noticed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The CI setups of GitLab and GitHub use a common dependency management
script 'ci/install-dependencies.sh'. The script installs the necessary
packages based on a combination of the "$distro" and "$jobname" env
variables.
The "$distro" variable is derived from the "CI_JOB_IMAGE" env variable
set by the CI configs. In the GitHub CI config, some of the jobs are
missing this variable. For the 'Documentation' job, which depends on
'meson' being installed, this raises an error since the 'meson'
dependency is never installed.
Fix this by adding the 'CI_JOB_IMAGE' variable to all missing jobs. We
don't add it to the Windows jobs, since they manage their dependencies
as part of the CI config and no further dependency management is needed.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Switch the synopsis to a synopsis block which automatically
formats placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine applies synopsis rules to
these spans.
Possible values for some variables, which were previously mentioned in
the description prose, are now presented as enumerated lists.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the modern formatting of the manpages, the options and commands are now
backticked in their definition lists. This patch updates the generation of
the completion list to take into account this new format.
The script `generate-configlist.sh` is updated to get rid of extraneous
commands and fit everything in a single sed script.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 0578f1e66a ("global: adapt callers to use generic hash context helpers")
accidentally removed `->init_fn`, which is required for OpenSSL 3+ SHA1.
This fixes the following error on fetch:
fatal: fetch-pack: invalid index-pack output
Signed-off-by: Jensen Huang <hmz007@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because the "[remote "nick"] fetch = ..." configuration variables
have the nickname in the second part, the nicknames are case
sensitive, unlike the first and the third component (i.e.
"remote.origin.fetch" and "Remote.origin.FETCH" are the same thing,
but "remote.Origin.fetch" and "remote.origin.fetch" are different).
Let's follow the way Git works in general and compare the remote
names case sensitively when processing advertised remotes.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the 'KnownUrl' case, in should_accept_remote(), let's check that
`remote_url` is not NULL before we use strcmp() to compare it with
the local URL. This could avoid crashes if a server stops advertising
any URL in the future.
If `remote_url` is NULL, we should reject the URL. Let's also warn in
this case because we warn otherwise when a remote is rejected to try
to help diagnose things at the end of the function.
And while we are checking that remote_url is not NULL and warning if
it is, it makes sense to also help diagnose the case where remote_url
is empty.
Also while at it, let's spell "URL" with uppercase letters in all the
warnings.
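Sketched out (variable and message text illustrative), the guard in
should_accept_remote() becomes:

    if (!remote_url || !*remote_url) {
            warning(_("no or empty URL advertised for remote '%s'"),
                    remote_name);
            return 0; /* reject this remote */
    }
    return !strcmp(remote_url, local_url);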
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using strvec_push() to push `NULL` into a 'strvec' results in a
segfault, because `xstrdup(NULL)` crashes.
So when a URL is missing from the config, let's not push the remote
name and URL into the 'strvec's.
While at it, let's also not push them in case the URL is empty. It's
just not worth the trouble, and it matches how Git otherwise treats
missing and empty URLs the same way.
Note that in case of missing or empty URL, Git uses the remote name to
fetch, which can work if the remote is on the same filesystem. So
configurations where the client, server and remote are all on the same
filesystem may need URLs to be configured even if they are the same as
the remote names. But this is a rare case, and the workaround is easy
enough.
We leave improving the strvec API and/or xstrdup() for a future
separate effort.
While at it, let's also use git_config_get_string_tmp() instead of
git_config_get_string() to simplify memory management.
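The configuration is then read along these lines (identifiers
illustrative):

    const char *url;

    if (git_config_get_string_tmp(key.buf, &url) || !*url)
            continue; /* missing or empty URL: skip this remote */

    strvec_push(&names, remote_name);
    strvec_push(&urls, url);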
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If `test_when_finished "rm -rf client"` is run after we clone, it
will not run if the clone failed, so the "client" directory might
not be removed at the end of the test.
`git clone` does try to remove the directory when it fails, but
let's be safe and try to protect against possibly weird clone
failures by moving `test_when_finished "rm -rf client"` before
the clone. It just makes more sense this way around.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are going to consider updating the refs/remotes/*/HEAD symref,
we have to ask the remote side where its HEAD points. But if we know
that the feature is disabled by config, we don't need to bother!
This saves a little bit of work and network communication for the
server. And even a little bit of effort on the client, as our local
set_head() function did a bit of work matching the remote HEAD before
realizing that we're not going to do anything with it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new followRemoteHEAD feature is triggered for almost every fetch,
causing us to ask the server about the remote "HEAD" and to consider
updating our local tracking HEAD symref. This patch limits the feature
only to the case when we are fetching a remote using its configured
refspecs (typically into its refs/remotes/ hierarchy). There are two
reasons for this.
One is efficiency. E.g., the fixes in 6c915c3f85 (fetch: do not ask for
HEAD unnecessarily, 2024-12-06) and 20010b8c20 (fetch: avoid ls-refs
only to ask for HEAD symref update, 2025-03-08) were aimed at reducing
the work we do when we would not be able to update HEAD anyway. But they
do not quite cover all cases. The remaining one is:
git fetch origin refs/heads/foo:refs/remotes/origin/foo
which _sometimes_ can update HEAD, but usually not. And that leads us to
the second point, which is being simple and explainable.
The code for updating the tracking HEAD symref requires both that we
learned which ref the remote HEAD points at, and that the server
advertised that ref to us. But because the v2 protocol narrows the
server's advertisement, the command above would not typically update
HEAD at all, unless it happened to point to the "foo" branch. Or even
weirder, it probably _would_ update if the server is very old and
supports only the v0 protocol, which always gives a full advertisement.
This creates confusing behavior for the user: sometimes we may try to
update HEAD and sometimes not, depending on vague rules.
One option here would be to loosen the update code to accept the remote
HEAD even if the server did not advertise that ref. I think that could
work, but it may also lead to interesting corner cases (e.g., creating a
dangling symref locally, even though the branch is not unborn on the
server, if we happen not to have fetched it).
So let's instead simplify the rules: we'll only consider updating the
tracking HEAD symref when we're doing a full fetch of the remote's
configured refs. This is easy to implement; we can just set a flag at
the moment we realize we're using the configured refspecs. And we can
drop the special case code added by 6c915c3f85 and 20010b8c20, since
this covers those cases. The existing tests from those commits still
pass.
In t5505, an incidental call to "git fetch <remote> <refspec>" updated
HEAD, which caused us to adjust the test in 3f763ddf28 (fetch: set
remote/HEAD if it does not exist, 2024-11-22). We can now adjust that
back to how it was before the feature was added.
Even though t5505 is incidentally testing our new desired behavior,
we'll add an explicit test in t5510 to make sure it is covered.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/fetch-ref-prefix-cleanup:
fetch: use ref prefix list to skip ls-refs
fetch: avoid ls-refs only to ask for HEAD symref update
fetch: stop protecting additions to ref-prefix list
fetch: ask server to advertise HEAD for config-less fetch
refspec_ref_prefixes(): clean up refspec_item logic
t5516: beef up exact-oid ref prefixes test
t5516: drop NEEDSWORK about v2 reachability behavior
t5516: prefer "oid" to "sha1" in some test titles
t5702: fix typo in test name
When BreakingChanges.txt was added in 57ec9254eb (docs: introduce
document to announce breaking changes, 2024-06-14) there was no
corresponding change to the Makefile to build it. Fix that by adding it
to the TECH_DOCS target.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Separate the paragraphs in the description of `--exclude` with a `+`
rather than an empty line to indent the whole description rather than
just the first paragraph.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch from merge-recursive to merge-ort. Adjust the following
testcases due to the switch:
* t4151: This test left an untracked file in the way of the merge.
merge-recursive could only sometimes tell when untracked files were
in the way, and by the time it discovers others, it has already made
too many changes to back out of the merge. So, instead of writing the
results to e.g. 'file1' it would instead write them to
'file1~branch1'. This is confusing for users, because they might not
notice 'file1~branch1' and accidentally add and commit 'file1'.
In contrast, merge-ort correctly notices the file in the way before
making any changes and aborts. Since this test didn't care about the
file in the way, just remove it before calling git-am.
* t4255: Usage of merge-ort allows us to change two known failures into
successes.
* t6427: As noted a few commits ago, the choice of conflict label for
diff3 markers for the ancestor commit was previously handled by
merge-recursive.c rather than by callers. Since that has now changed,
`git am` needs to specify that label. Although the previous conflict
label ("constructed merge base") was already fairly somewhat slanted
towards `git am`, let's use wording more along the lines of the
related command-line flag from `git apply` and the function involved to
tie it more closely to `git am`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two issues here.
First, when merge.directoryRenames is set to false, there are a few code
paths that should be turned off. I missed one; collect_renames() was
still doing some directory rename detection logic unconditionally. It
ended up not having much effect because
get_provisional_directory_renames() was skipped earlier and did not set
up renames->dir_renames, but the code should still be skipped.
Second, the larger issue is that sometimes we get a cached_pair rename
from a previous commit being replayed mapping A->B, but in a subsequent
commit collect_merge_info() doesn't even recurse into the directory
containing B because there are no source pairings for that
rename that are relevant; we can merge that commit fine without knowing
the rename. But since the cached renames are added to the normal
renames, when we go to process it and find that B is not part of
opt->priv->paths, we hit the assertion error
process_renames: Assertion `newinfo && ~newinfo->merged.clean` failed.
I think we could fix this at the beginning of detect_regular_renames() by
pruning from cached_pairs any entry whose destination isn't in
opt->priv->paths, but that is suboptimal: we'd like the cached_pair to be
restored afterwards so that it can help the subsequent commit. More
importantly, that fix sits at the intersection of the caching renames
optimization, the relevant renames optimization, and the trivial
directory resolution optimization, and since I don't currently have
Documentation/technical/remembering-renames.txt fully paged in, I'm not
sure whether it would be a full solution or a band-aid for the current
testcase. However, since the remembering renames optimization was the
weakest of the set, and the optimization is far less important when
directory rename detection is off (as that implies far fewer potential
renames), let's just use a bigger hammer to ensure this special case is
fixed: turn off the rename caching. We do the same thing already when
we encounter rename/rename(1to1) cases (as per `git grep -3
disabling.the.optimization`, though it uses a slightly different
triggering mechanism since it's trying to affect the next time that
merge_check_renames_reusable() is called), and I think it makes sense
to do the same here.
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>