Commit Graph

946 Commits (03eae9afb447ad4add2f18a1adb3589f050d596f)

Author SHA1 Message Date
Junio C Hamano 839b808325 Merge branch 'cp/unit-test-reftable-block'
Another test for reftable library ported to the unit test framework.

* cp/unit-test-reftable-block:
  t-reftable-block: mark unused argv/argc
  t-reftable-block: add tests for index blocks
  t-reftable-block: add tests for obj blocks
  t-reftable-block: add tests for log blocks
  t-reftable-block: remove unnecessary variable 'j'
  t-reftable-block: use xstrfmt() instead of xstrdup()
  t-reftable-block: use block_iter_reset() instead of block_iter_close()
  t-reftable-block: use reftable_record_key() instead of strbuf_addstr()
  t-reftable-block: use reftable_record_equal() instead of check_str()
  t-reftable-block: release used block reader
  t: harmonize t-reftable-block.c with coding guidelines
  t: move reftable/block_test.c to the unit testing framework
2024-08-29 11:08:16 -07:00
Junio C Hamano d4d677704d Merge branch 'ps/reftable-drop-generic'
The code in the reftable library has been cleaned up by discarding
unused "generic" interface.

* ps/reftable-drop-generic:
  reftable: mark unused parameters in empty iterator functions
  reftable/generic: drop interface
  t/helper: refactor to not use `struct reftable_table`
  t/helper: use `hash_to_hex_algop()` to print hashes
  t/helper: inline printing of reftable records
  t/helper: inline `reftable_table_print()`
  t/helper: inline `reftable_stack_print_directory()`
  t/helper: inline `reftable_reader_print_file()`
  t/helper: inline `reftable_dump_main()`
  reftable/dump: drop unused `compact_stack()`
  reftable/generic: move generic iterator code into iterator interface
  reftable/iter: drop double-checking logic
  reftable/stack: open-code reading refs
  reftable/merged: stop using generic tables in the merged table
  reftable/merged: rename `reftable_new_merged_table()`
  reftable/merged: expose functions to initialize iterators
2024-08-29 11:08:16 -07:00
Junio C Hamano a9bc27fb18 Merge branch 'gt/unit-test-urlmatch-normalization'
Another rewrite of test.

* gt/unit-test-urlmatch-normalization:
  t: migrate t0110-urlmatch-normalization to the new framework
2024-08-28 10:31:27 -07:00
Junio C Hamano 3222718ad7 Merge branch 'ds/for-each-ref-is-base'
'git for-each-ref' learned a new "--format" atom to find the branch
that the history leading to a given commit "%(is-base:<commit>)" is
likely based on.

* ds/for-each-ref-is-base:
  p1500: add is-base performance tests
  for-each-ref: add 'is-base' token
  commit: add gentle reference lookup method
  commit-reach: add get_branch_base_for_tip
2024-08-26 11:32:24 -07:00
Junio C Hamano 2b30d66c43 Merge branch 'jk/mark-unused-parameters'
Mark unused parameters as UNUSED to squelch -Wunused warnings.

* jk/mark-unused-parameters:
  t-hashmap: stop calling setup() for t_intern() test
  scalar: mark unused parameters in dummy function
  daemon: mark unused parameters in non-posix fallbacks
  setup: mark unused parameter in config callback
  test-mergesort: mark unused parameters in trivial callback
  t-hashmap: mark unused parameters in callback function
  reftable: mark unused parameters in virtual functions
  reftable: drop obsolete test function declarations
  reftable: ignore unused argc/argv in test functions
  unit-tests: ignore unused argc/argv
  t/helper: mark more unused argv/argc arguments
  oss-fuzz: mark unused argv/argc argument
  refs: mark unused parameters in do_for_each_reflog_helper()
  refs: mark unused parameters in ref_store fsck callbacks
  update-ref: mark more unused parameters in parser callbacks
  imap-send: mark unused parameter in ssl_socket_connect() fallback
2024-08-26 11:32:23 -07:00
Junio C Hamano 668843e6d8 Merge branch 'cp/unit-test-reftable-readwrite'
* cp/unit-test-reftable-readwrite:
  t-reftable-readwrite: add test for known error
  t-reftable-readwrite: use 'for' in place of infinite 'while' loops
  t-reftable-readwrite: use free_names() instead of a for loop
  t: move reftable/readwrite_test.c to the unit testing framework
2024-08-23 09:02:35 -07:00
Junio C Hamano 5e56a39e6a Merge branch 'ps/config-wo-the-repository'
Use of API functions that implicitly depend on the_repository
object in the config subsystem has been rewritten to pass a
repository object through the callchain.

* ps/config-wo-the-repository:
  config: hide functions using `the_repository` by default
  global: prepare for hiding away repo-less config functions
  config: don't depend on `the_repository` with branch conditions
  config: don't have setters depend on `the_repository`
  config: pass repo to functions that rename or copy sections
  config: pass repo to `git_die_config()`
  config: pass repo to `git_config_get_expiry_in_days()`
  config: pass repo to `git_config_get_expiry()`
  config: pass repo to `git_config_get_max_percent_split_change()`
  config: pass repo to `git_config_get_split_index()`
  config: pass repo to `git_config_get_index_threads()`
  config: expose `repo_config_clear()`
  config: introduce missing setters that take repo as parameter
  path: hide functions using `the_repository` by default
  path: stop relying on `the_repository` in `worktree_git_path()`
  path: stop relying on `the_repository` when reporting garbage
  hooks: remove implicit dependency on `the_repository`
  editor: do not rely on `the_repository` for interactive edits
  path: expose `do_git_common_path()` as `repo_common_pathv()`
  path: expose `do_git_path()` as `repo_git_pathv()`
2024-08-23 09:02:34 -07:00
Patrick Steinhardt 89191232b8 t/helper: refactor to not use `struct reftable_table`
The `struct reftable_table` interface in our "reftable" test helper gets
used such that we can easily print either a single table, or a merged
stack. This generic interface is about to go away.

Prepare the code for this change by using merged tables instead. When
printing the stack we've already got one. When using a single table, we
can create a merged table from it to adapt.

This removes the last user of the generic interface.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:48 -07:00
Patrick Steinhardt 1f39dd2ae5 t/helper: use `hash_to_hex_algop()` to print hashes
The "reftable" test helper uses a hand-crafted version to convert from a
raw hash to its hex variant. This was done because this code used to be
part of the reftable library, where we do not use most functions from
the Git core.

Now that the code is integrated into the "dump-reftable" helper though,
that limitation went away. Let's thus use `hash_to_hex_algop()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:48 -07:00
Patrick Steinhardt 42c424d69d t/helper: inline printing of reftable records
Move printing of reftable records into the "dump-reftable" helper. This
follows the same reasoning as the preceding commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt 64a5b7a8ca t/helper: inline `reftable_table_print()`
Move `reftable_table_print()` into the "dump-reftable" helper. This
follows the same reasoning as the preceding commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt ca74ef6ffb t/helper: inline `reftable_stack_print_directory()`
Move `reftable_stack_print_directory()` into the "dump-reftable" helper.
This follows the same reasoning as the preceding commit.

Note that this requires us to remove the tests for this functionality in
`reftable/stack_test.c`. The test does not really add much anyway,
because all it verifies is that we do not crash or run into an error,
and it specifically doesn't check the outputted data. Also, as the code
is now part of the test helper, it doesn't make much sense to have a
unit test for it in the first place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt 22f519a9a0 t/helper: inline `reftable_reader_print_file()`
Move `reftable_reader_print_file()` into the "dump-reftable" helper.
This follows the same reasoning as the preceding commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt 2b06b28fd6 t/helper: inline `reftable_dump_main()`
The printing functionality part of `reftable/dump.c` is really only used
by our "dump-reftable" test helper. It is certainly not generic logic
that is useful to anybody outside of Git, and the format it generates is
quite specific. Still, parts of it are used in our test suite and the
output may be useful to take a peek into reftable stacks, tables and
blocks. So while it does not make sense to expose this as part of the
reftable library, it does make sense to keep it around.

Inline the `reftable_dump_main()` function into the "dump-reftable" test
helper. This clarifies that its format is subject to change and not part
of our public interface. Furthermore, this allows us to iterate on the
implementation in subsequent patches.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Chandra Pratap 546cc0d64e t: move reftable/block_test.c to the unit testing framework
reftable/block_test.c exercises the functions defined in
reftable/block.{c, h}. Migrate reftable/block_test.c to the unit
testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework and renaming the tests to follow the unit-tests'
naming conventions.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
Ghanshyam Thakkar 05026637f3 t: migrate t0110-urlmatch-normalization to the new framework
helper/test-urlmatch-normalization along with
t0110-urlmatch-normalization test the `url_normalize()` function from
'urlmatch.h'. Migrate them to the unit testing framework for better
performance. And also add different test_msg()s for better debugging.

In the migration, last two of the checks from `t_url_general_escape()`
were slightly changed compared to the shell script. This involves
changing

'\'' -> '
'\!' -> !

in the urls of those checks. This is because in C strings, we don't
need to escape "'" and "!". Other than these two, all the urls were
pasted verbatim from the shell script.

Another change is the removal of a MINGW prerequisite from one of the
test. It was there because[1] on Windows, the command line is a
Unicode string, it is not possible to pass arbitrary bytes to a
program. But in unit tests we don't have this limitation.

And since we can construct strings with arbitrary bytes in C, let's
also remove the test files which contain URLs with arbitrary bytes in
the 't/t0110' directory and instead embed those URLs in the unit test
code itself.

[1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 10:08:28 -07:00
Junio C Hamano b9497848df Merge branch 'tb/incremental-midx-part-1'
Incremental updates of multi-pack index files.

* tb/incremental-midx-part-1:
  midx: implement support for writing incremental MIDX chains
  t/t5313-pack-bounds-checks.sh: prepare for sub-directories
  t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
  midx: implement verification support for incremental MIDXs
  midx: support reading incremental MIDX chains
  midx: teach `midx_fanout_add_midx_fanout()` about incremental MIDXs
  midx: teach `midx_preferred_pack()` about incremental MIDXs
  midx: teach `midx_contains_pack()` about incremental MIDXs
  midx: remove unused `midx_locate_pack()`
  midx: teach `fill_midx_entry()` about incremental MIDXs
  midx: teach `nth_midxed_offset()` about incremental MIDXs
  midx: teach `bsearch_midx()` about incremental MIDXs
  midx: introduce `bsearch_one_midx()`
  midx: teach `nth_bitmapped_pack()` about incremental MIDXs
  midx: teach `nth_midxed_object_oid()` about incremental MIDXs
  midx: teach `prepare_midx_pack()` about incremental MIDXs
  midx: teach `nth_midxed_pack_int_id()` about incremental MIDXs
  midx: add new fields for incremental MIDX chains
  Documentation: describe incremental MIDX format
2024-08-19 11:07:37 -07:00
Junio C Hamano 4dbca805e0 Merge branch 'rs/unit-tests-test-run'
Unit-test framework has learned a simple control structure to allow
embedding test statements in-line instead of having to create a new
function to contain them.

* rs/unit-tests-test-run:
  t-strvec: use if_test
  t-reftable-basics: use if_test
  t-ctype: use if_test
  unit-tests: add if_test
  unit-tests: show location of checks outside of tests
  t0080: use here-doc test body
2024-08-19 11:07:36 -07:00
Jeff King f288a57789 test-mergesort: mark unused parameters in trivial callback
The mode_copy() function does nothing, but since it's used as a function
pointer within "struct mode", it has to conform to the interface. Mark
it to quiet -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:23 -07:00
Jeff King 7046c85cce t/helper: mark more unused argv/argc arguments
This is a continuation of 126e3b3d2a (t/helper: mark unused argv/argc
arguments, 2023-03-28) to cover a few new cases:

 - test-example-tap was added since that commit

 - test-hashmap used to accept the "ignorecase" argument on the command
   line. But since most of its logic was moved to a unit-test in
   3469a23659 (t: port helper/test-hashmap.c to unit-tests/t-hashmap.c,
   2024-08-03), it now ignores its argv entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:11 -07:00
Junio C Hamano e7f86cb69d Merge branch 'jc/refs-symref-referent'
The refs API has been taught to give symref target information to
the users of ref iterators, allowing for-each-ref and friends to
avoid an extra ref_resolve_* API call per a symbolic ref.

* jc/refs-symref-referent:
  ref-filter: populate symref from iterator
  refs: add referent to each_ref_fn
  refs: keep track of unresolved reference value in iterators
2024-08-15 13:22:15 -07:00
Junio C Hamano a3d71f2076 Merge branch 'gt/unit-test-hashmap'
An existing test of hashmap API has been rewritten with the
unit-test framework.

* gt/unit-test-hashmap:
  t: port helper/test-hashmap.c to unit-tests/t-hashmap.c
2024-08-15 13:22:12 -07:00
Junio C Hamano 7b11e20bff Merge branch 'cp/unit-test-reftable-tree'
A test in reftable library has been rewritten using the unit test
framework.

* cp/unit-test-reftable-tree:
  t-reftable-tree: improve the test for infix_walk()
  t-reftable-tree: add test for non-existent key
  t-reftable-tree: split test_tree() into two sub-test functions
  t: move reftable/tree_test.c to the unit testing framework
  reftable: remove unnecessary curly braces in reftable/tree.c
2024-08-14 14:54:56 -07:00
Junio C Hamano 61fd5de05f Merge branch 'kl/test-fixes'
A flakey test and incorrect calls to strtoX() functions have been
fixed.

* kl/test-fixes:
  t6421: fix test to work when repo dir contains d0
  set errno=0 before strtoX calls
2024-08-14 14:54:55 -07:00
Junio C Hamano d65332f241 Merge branch 'cp/unit-test-reftable-pq'
The tests for "pq" part of reftable library got rewritten to use
the unit test framework.

* cp/unit-test-reftable-pq:
  t-reftable-pq: add tests for merged_iter_pqueue_top()
  t-reftable-pq: add test for index based comparison
  t-reftable-pq: make merged_iter_pqueue_check() callable by reference
  t-reftable-pq: make merged_iter_pqueue_check() static
  t: move reftable/pq_test.c to the unit testing framework
  reftable: change the type of array indices to 'size_t' in reftable/pq.c
  reftable: remove unnecessary curly braces in reftable/pq.c
2024-08-14 14:54:48 -07:00
Junio C Hamano 4385f8a52d Merge branch 'ps/leakfixes-part-3'
More leakfixes.

* ps/leakfixes-part-3: (24 commits)
  commit-reach: fix trivial memory leak when computing reachability
  convert: fix leaking config strings
  entry: fix leaking pathnames during delayed checkout
  object-name: fix leaking commit list items
  t/test-repository: fix leaking repository
  builtin/credential-cache: fix trivial leaks
  builtin/worktree: fix leaking derived branch names
  builtin/shortlog: fix various trivial memory leaks
  builtin/rerere: fix various trivial memory leaks
  builtin/credential-store: fix leaking credential
  builtin/show-branch: fix several memory leaks
  builtin/rev-parse: fix memory leak with `--parseopt`
  builtin/stash: fix various trivial memory leaks
  builtin/remote: fix various trivial memory leaks
  builtin/remote: fix leaking strings in `branch_list`
  builtin/ls-remote: fix leaking `pattern` strings
  builtin/submodule--helper: fix leaking buffer in `is_tip_reachable`
  builtin/submodule--helper: fix leaking clone depth parameter
  builtin/name-rev: fix various trivial memory leaks
  builtin/describe: fix trivial memory leak when describing blob
  ...
2024-08-14 14:54:47 -07:00
Derrick Stolee e32eaf73b0 commit-reach: add get_branch_base_for_tip
Add a new reachability algorithm that intends to discover (from a heuristic)
which branch was used as the starting point for a given commit. Add focused
tests using the 'test-tool reach' command.

In repositories that use pull requests (or merge requests) to advance one or
more "protected" branches, the history of that reference can be recovered by
following the first-parent history in most cases. Most are completed using
no-fast-forward merges, though squash merges are quite common. Less common
is rebase-and-merge, which still validates this assumption. Finally, the
case that breaks this assumption is the fast-forward update (with potential
rebasing).  Even in this case, the previous commit commonly appears in the
first-parent history of the branch.

Similar assumptions can be made for a topic branch created by a single user
with the intention to merge back into another branch. Using 'git commit',
'git merge', and 'git cherry-pick' from HEAD will default to having the
first-parent commit be the previous commit at HEAD. This history changes
only with commands such as 'git reset' or 'git rebase', where the command
names also imply that the branch is starting from a new location.

With this movement of branches in mind, the following heuristic is proposed
as a way to determine the base branch for a given source branch:

  Among a list of candidate base branches, select the candidate that
  minimizes the number of commits in the first-parent history of the source
  that are not in the first-parent history of the candidate.

Prior third-party solutions to this problem have used this optimization
criteria, but have relied upon extracting the first-parent history and
comparing those lists as tables instead of using commit-graph walks.

Given current command-line interface options, this optimization criteria is
not easy to detect directly. Even using the command

  git rev-list --count --first-parent <base>..<source>

does not measure this count, as it uses full reachability from <base> to
determine which commits to remove from the range '<base>..<source>'. This
may lead to one asking if we should instead be using the full reachability
of the candidate and only the first-parent history of the source. This,
unfortunately, does not work for repositories that use long-lived branches
and automation to merge across those branches.

In extremely large repositories, merging into a single trunk may not be
feasible.  This is usually due to the desired frequency of updates
(thousands of engineers doing daily work) combined with the time required to
perform a validation build.  These factors combine to create significant
risk of semantic merge conflicts, leading to build breaks on the trunk. In
response, repository maintainers can create a single Level Zero (L0) trunk
and multiple Level One (L1) branches. By partitioning the engineers by
organization, these engineers may see lower risk of semantic merge conflicts
as well as be protected against build breaks in other L1 branches. The key
to making this system work is a semi-automated process of merging L1
branches into the L0 trunk and vice-versa.  In a large enough organization,
these L1 branches may further split into L2 or L3 branches, but the same
principles apply for merging across deeper levels.

If these automated merges use a typical merge with the second parent
bringing in the "new" content, then each L0 and L1 branch can track its
previous positions by following first-parent history, which appear as
parallel paths (until reaching the first place where the branches diverged).
If we also walk to second parents, then the histories overlap significantly
and cannot be distinguished except for very-recent changes.

For this reason, the first-parent condition should be symmetrical across the
base and source branches.

Another common case for desiring the result of this optimization method is
the use of release branches. When releasing a version of a repository, a
branch can be used to track that release. Any updates that are worth fixing
in that release can be merged to the release branch and shipped with only
the necessary fixes without any new features introduced in the trunk branch.
The 'maint-2.<X>' branches represent this pattern in the Git project. The
microsoft/git fork uses 'vfs-2.<X>.<Y>' branches to track the changes that
are custom to that fork on top of each upstream Git release 2.<X>.<Y>. This
application doesn't need the symmetrical first-parent condition, but the use
of first-parent histories does not change the results for these branches.

To determine the base branch from a list of candidates, create a new method
in commit-reach.c that performs a single* commit-graph walk. The core
concept is to walk first-parents starting at the candidate bases and the
source, tracking the "best" base to reach a given commit. Use generation
numbers to ensure that a commit is walked at most once and all children have
been explored before visiting it.  When reaching a commit that is reachable
from both a base and the source, we will then have a guarantee that this is
the closest intersection of first-parent histories. Track the best base to
reach that commit and return it as a result. In rare cases involving
multiple root commits, the first-parent history of the source may never
intersect any of the candidates and thus a null result is returned.

* There are up to two walks, since we require all commits to have a computed
  generation number in order to avoid incorrect results. This is similar to
  the need for computed generation numbers in ahead_behind() as implemented
  in fd67d149bd (commit-reach: implement ahead_behind() logic, 2023-03-20).

In order to track the "best" base, use a new commit slab that stores an
integer.  This value defaults to zero upon initialization, so use -1 to
track that the source commit can reach this commit and use 'i + 1' to track
that the ith base can reach this commit. When multiple bases can reach a
commit, minimize the index to break ties. This allows the caller to specify
an order to the bases that determines some amount of preference when the
heuristic does not result in a unique result.

The trickiest part of the integer slab is what happens when reaching a
collision among the histories of the bases and the history of the source.
This is noticed when viewing the first parent and seeing that it has a slab
value that differs in sign (negative or positive). In this case, the
collision commit is stored in the method variable 'branch_point' and its
slab value is set to -1. The index of the best base (so far) is stored in
the method variable 'best_index'. It is possible that there are multiple
commits that have the branch_point as its first parent, leading to multiple
updates of best_index.  The result is determined when 'branch_point' is
visited in the commit walk, giving the guarantee that all commits that could
reach 'branch_point' were visited.

Several interesting cases of collisions and different results are tested in
the t6600-test-reach.sh script. Recall that this script also tests the
algorithm in three possible states involving the commit-graph file and how
many commits are written in the file. This provides some coverage of the
need (and lack of need) for the ensure_generations_valid() method.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:10:05 -07:00
Chandra Pratap 5b539a5361 t: move reftable/readwrite_test.c to the unit testing framework
reftable/readwrite_test.c exercises the functions defined in
reftable/reader.{c,h} and reftable/writer.{c,h}. Migrate
reftable/readwrite_test.c to the unit testing framework. Migration
involves refactoring the tests to use the unit testing framework
instead of reftable's test framework and renaming the tests to
align with unit-tests' naming conventions.

Since some tests in reftable/readwrite_test.c use the functions
set_test_hash(), noop_flush() and strbuf_add_void() defined in
reftable/test_framework.{c,h} but these files are not #included
in the ported unit test, copy these functions in the new test file.

While at it, ensure structs are 0-initialized with '= { 0 }'
instead of '= { NULL }'.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:08:02 -07:00
Patrick Steinhardt 219de841d9 global: prepare for hiding away repo-less config functions
We're about to hide config functions that implicitly depend on
`the_repository` behind the `USE_THE_REPOSITORY_VARIABLE` macro. This
will uncover a bunch of dependents that transitively relied on the
global variable, but didn't define the macro yet.

Adapt them such that we define the macro to prepare for this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:05 -07:00
John Cai e8207717f1 refs: add referent to each_ref_fn
Add a parameter to each_ref_fn so that callers to the ref APIs
that use this function as a callback can have acess to the
unresolved value of a symbolic ref.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 08:47:34 -07:00
Taylor Blau b80236d0e3 midx: support reading incremental MIDX chains
Now that the MIDX machinery's internals have been taught to understand
incremental MIDXs over the previous handful of commits, the MIDX
machinery itself can begin reading incremental MIDXs.

(Note that while the on-disk format for incremental MIDXs has been
defined, the writing end has not been implemented. This will take place
in the commit after next.)

The core of this change involves following the order specified in the
MIDX chain in reverse and opening up MIDXs in the chain one-by-one,
adding them to the previous layer's `->base_midx` pointer at each step.

In order to implement this, the `load_multi_pack_index()` function is
taught to call a new `load_multi_pack_index_chain()` function if loading
a non-incremental MIDX failed via `load_multi_pack_index_one()`.

When loading a MIDX chain, `load_midx_chain_fd_st()` reads each line in
the file one-by-one and dispatches calls to
`load_multi_pack_index_one()` to read each layer of the MIDX chain. When
a layer was successfully read, it is added to the MIDX chain by calling
`add_midx_to_chain()` which validates the contents of the `BASE` chunk,
performs some bounds checks on the number of combined packs and objects,
and attaches the new MIDX by assigning its `base_midx` pointer to the
existing part of the chain.

As a supplement to this, introduce a new mode in the test-read-midx
test-tool which allows us to read the information for a specific MIDX in
the chain by specifying its trailing checksum via the command-line
arguments like so:

    $ test-tool read-midx .git/objects [checksum]

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00
Ghanshyam Thakkar 3469a23659 t: port helper/test-hashmap.c to unit-tests/t-hashmap.c
helper/test-hashmap.c along with t0011-hashmap.sh test the hashmap.h
library. Migrate them to the unit testing framework for better
debugging, runtime performance and concise code.

Along with the migration, make 'add' tests from the shell script order
agnostic in unit tests, since they iterate over entries with the same
keys and we do not guarantee the order. This was already done for the
'iterate' tests[1].

The helper/test-hashmap.c is still not removed because it contains a
performance test meant to be run by the user directly (not used in
t/perf). And it makes sense for such a utility to be a helper.

[1]: e1e7a77141 (t: sort output of hashmap iteration, 2019-07-30)

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Josh Steadmon <steadmon@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 09:25:54 -07:00
Kyle Lippincott b928d57ca9 set errno=0 before strtoX calls
To detect conversion failure after calls to functions like `strtod`, one
can check `errno == ERANGE`. These functions are not guaranteed to set
`errno` to `0` on successful conversion, however. Manual manipulation of
`errno` can likely be avoided by checking that the output pointer
differs from the input pointer, but that's not how other locations, such
as parse.c:139, handle this issue; they set errno to 0 prior to
executing the function.

For every place I could find a strtoX function with an ERANGE check
following it, set `errno = 0;` prior to executing the conversion
function.

Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-05 10:59:20 -07:00
Chandra Pratap ec9c0704fc t: move reftable/tree_test.c to the unit testing framework
reftable/tree_test.c exercises the functions defined in
reftable/tree.{c, h}. Migrate reftable/tree_test.c to the unit
testing framework. Migration involves refactoring the tests to use
the unit testing framework instead of reftable's test framework and
renaming the tests to align with unit-tests' standards.

Also add a comment to help understand the test routine.

Note that this commit mostly moves the test from reftable/ to
t/unit-tests/ and most of the refactoring is performed by the
trailing commits.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-04 09:50:26 -07:00
Chandra Pratap a08ea27cd0 t: move reftable/pq_test.c to the unit testing framework
reftable/pq_test.c exercises a priority queue defined by
reftable/pq.{c, h}. Migrate reftable/pq_test.c to the unit testing
framework. Migration involves refactoring the tests to use the unit
testing framework instead of reftable's test framework, and
renaming the tests to align with unit-tests' standards.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 09:07:29 -07:00
Patrick Steinhardt 11f841c1cc t/test-repository: fix leaking repository
The test-repository test helper zeroes out `the_repository` such that it
can be sure that our codebase only ends up using the supplied repository
that we initialize in the respective helper functions. This does cause
memory leaks though as the data that `the_repository` has been holding
onto is not referenced anymore.

Fix this by calling `repo_clear()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
Junio C Hamano 6c70d65712 Merge branch 'cp/unit-test-reftable-merged'
Another reftable test has been ported to use the unit test framework.

* cp/unit-test-reftable-merged:
  t-reftable-merged: add test for REFTABLE_FORMAT_ERROR
  t-reftable-merged: use reftable_ref_record_equal to compare ref records
  t-reftable-merged: add tests for reftable_merged_table_max_update_index
  t-reftable-merged: improve the const-correctness of helper functions
  t-reftable-merged: improve the test t_merged_single_record()
  t: harmonize t-reftable-merged.c with coding guidelines
  t: move reftable/merged_test.c to the unit testing framework
2024-07-31 13:34:19 -07:00
René Scharfe 96c6304c18 unit-tests: add if_test
The macro TEST only allows defining a test that consists of a single
expression.  Add a new macro, if_test, which provides a way to define
unit tests that are made up of one or more statements.

if_test allows defining self-contained tests en bloc, a bit like
test_expect_success does for regular tests.  It acts like a conditional;
the test body is executed if test_skip_all() had not been called before.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:32:43 -07:00
René Scharfe 1f452d6c68 unit-tests: show location of checks outside of tests
Checks outside of tests are caught at runtime and reported like this:

 Assertion failed: (ctx.running), function test_assert, file test-lib.c, line 267.

The assert() call aborts the unit test and doesn't reveal the location
or even the type of the offending check, as test_assert() is called by
all of them.

Handle it like the opposite case, a test without any checks: Don't
abort, but report the location of the actual check, along with a message
explaining the situation.  The output for example above becomes:

 # BUG: check outside of test at t/helper/test-example-tap.c:75

... and the unit test program continues and indicates the error in its
exit code at the end.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30 13:32:43 -07:00
Junio C Hamano 9118e46e81 Merge branch 'cp/unit-test-reftable-record'
A test in reftable library has been rewritten using the unit test
framework.

* cp/unit-test-reftable-record:
  t-reftable-record: add tests for reftable_log_record_compare_key()
  t-reftable-record: add tests for reftable_ref_record_compare_name()
  t-reftable-record: add index tests for reftable_record_is_deletion()
  t-reftable-record: add obj tests for reftable_record_is_deletion()
  t-reftable-record: add log tests for reftable_record_is_deletion()
  t-reftable-record: add ref tests for reftable_record_is_deletion()
  t-reftable-record: add comparison tests for obj records
  t-reftable-record: add comparison tests for index records
  t-reftable-record: add comparison tests for ref records
  t-reftable-record: add reftable_record_cmp() tests for log records
  t: move reftable/record_test.c to the unit testing framework
2024-07-15 10:11:44 -07:00
Junio C Hamano cda729581b Merge branch 'gt/unit-test-oidmap'
An existing test of oidmap API has been rewritten with the
unit-test framework.

* gt/unit-test-oidmap:
  t: migrate helper/test-oidmap.c to unit-tests/t-oidmap.c
2024-07-15 10:11:40 -07:00
Chandra Pratap 9cdfd1d7df t: move reftable/merged_test.c to the unit testing framework
reftable/merged_test.c exercises the functions defined in
reftable/merged.{c, h}. Migrate reftable/merged_test.c to the unit
testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework and renaming the tests according to unit-tests' naming
conventions.

Also, move strbuf_add_void() and noop_flush() from
reftable/test_framework.c to the ported test. This is because
both these functions are used in the merged tests and
reftable/test_framework.{c, h} is not #included in the ported test.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 09:55:39 -07:00
Junio C Hamano 3997614c24 Merge branch 'ps/leakfixes-more'
More memory leaks have been plugged.

* ps/leakfixes-more: (29 commits)
  builtin/blame: fix leaking ignore revs files
  builtin/blame: fix leaking prefixed paths
  blame: fix leaking data for blame scoreboards
  line-range: plug leaking find functions
  merge: fix leaking merge bases
  builtin/merge: fix leaking `struct cmdnames` in `get_strategy()`
  sequencer: fix memory leaks in `make_script_with_merges()`
  builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()`
  apply: fix leaking string in `match_fragment()`
  sequencer: fix leaking string buffer in `commit_staged_changes()`
  commit: fix leaking parents when calling `commit_tree_extended()`
  config: fix leaking "core.notesref" variable
  rerere: fix various trivial leaks
  builtin/stash: fix leak in `show_stash()`
  revision: free diff options
  builtin/log: fix leaking commit list in git-cherry(1)
  merge-recursive: fix memory leak when finalizing merge
  builtin/merge-recursive: fix leaking object ID bases
  builtin/difftool: plug memory leaks in `run_dir_diff()`
  object-name: free leaking object contexts
  ...
2024-07-08 14:53:10 -07:00
Junio C Hamano ecf7fc600a Merge branch 'tb/path-filter-fix'
The Bloom filter used for path limited history traversal was broken
on systems whose "char" is unsigned; update the implementation and
bump the format version to 2.

* tb/path-filter-fix:
  bloom: introduce `deinit_bloom_filters()`
  commit-graph: reuse existing Bloom filters where possible
  object.h: fix mis-aligned flag bits table
  commit-graph: new Bloom filter version that fixes murmur3
  commit-graph: unconditionally load Bloom filters
  bloom: prepare to discard incompatible Bloom filters
  bloom: annotate filters with hash version
  repo-settings: introduce commitgraph.changedPathsVersion
  t4216: test changed path filters with high bit paths
  t/helper/test-read-graph: implement `bloom-filters` mode
  bloom.h: make `load_bloom_filter_from_graph()` public
  t/helper/test-read-graph.c: extract `dump_graph_info()`
  gitformat-commit-graph: describe version 2 of BDAT
  commit-graph: ensure Bloom filters are read with consistent settings
  revision.c: consult Bloom filters for root commits
  t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
2024-07-08 14:53:10 -07:00
Ghanshyam Thakkar 28c1c07700 t: migrate helper/test-oidmap.c to unit-tests/t-oidmap.c
helper/test-oidmap.c along with t0016-oidmap.sh test the oidmap.h
library which is built on top of hashmap.h.

Migrate them to the unit testing framework for better performance,
concise code and better debugging. Along with the migration also plug
memory leaks and make the test logic independent for all the tests.
The migration removes 'put' tests from t0016, because it is used as
setup to all the other tests, so testing it separately does not yield
any benefit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-03 09:12:14 -07:00
Junio C Hamano ca463101c8 Merge branch 'jk/remote-wo-url'
Memory ownership rules for the in-core representation of
remote.*.url configuration values have been straightened out, which
resulted in a few leak fixes and code clarification.

* jk/remote-wo-url:
  remote: drop checks for zero-url case
  remote: always require at least one url in a remote
  t5801: test remote.*.vcs config
  t5801: make remote-testgit GIT_DIR setup more robust
  remote: allow resetting url list
  config: document remote.*.url/pushurl interaction
  remote: simplify url/pushurl selection
  remote: use strvecs to store remote url/pushurl
  remote: transfer ownership of memory in add_url(), etc
  remote: refactor alias_url() memory ownership
  archive: fix check for missing url
2024-07-02 09:59:01 -07:00
Junio C Hamano 7b472da915 Merge branch 'ps/use-the-repository'
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help
transition the codebase to rely less on the availability of the
singleton the_repository instance.

* ps/use-the-repository:
  hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE`
  t/helper: remove dependency on `the_repository` in "proc-receive"
  t/helper: fix segfault in "oid-array" command without repository
  t/helper: use correct object hash in partial-clone helper
  compat/fsmonitor: fix socket path in networked SHA256 repos
  replace-object: use hash algorithm from passed-in repository
  protocol-caps: use hash algorithm from passed-in repository
  oidset: pass hash algorithm when parsing file
  http-fetch: don't crash when parsing packfile without a repo
  hash-ll: merge with "hash.h"
  refs: avoid include cycle with "repository.h"
  global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
  hash: require hash algorithm in `empty_tree_oid_hex()`
  hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
  hash: make `is_null_oid()` independent of `the_repository`
  hash: convert `oidcmp()` and `oideq()` to compare whole hash
  global: ensure that object IDs are always padded
  hash: require hash algorithm in `oidread()` and `oidclr()`
  hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
  hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
2024-07-02 09:59:00 -07:00
Chandra Pratap ba9661b457 t: move reftable/record_test.c to the unit testing framework
reftable/record_test.c exercises the functions defined in
reftable/record.{c, h}. Migrate reftable/record_test.c to the
unit testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework, and renaming the tests to fit unit-tests' naming scheme.

While at it, change the type of index variable 'i' to 'size_t'
from 'int'. This is because 'i' is used in comparison against
'ARRAY_SIZE(x)' which is of type 'size_t'.

Also, use set_hash() which is defined locally in the test file
instead of set_test_hash() which is defined by
reftable/test_framework.{c, h}. This is fine to do as both these
functions are similarly implemented, and
reftable/test_framework.{c, h} is not #included in the ported test.

Get rid of reftable_record_print() from the tests as well, because
it clutters the test framework's output and we have no way of
verifying the output.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 08:12:24 -07:00
Taylor Blau ba5a81d52b commit-graph: new Bloom filter version that fixes murmur3
The murmur3 implementation in bloom.c has a bug when converting series
of 4 bytes into network-order integers when char is signed (which is
controllable by a compiler option, and the default signedness of char is
platform-specific). When a string contains characters with the high bit
set, this bug causes results that, although internally consistent within
Git, does not accord with other implementations of murmur3 (thus,
the changed path filters wouldn't be readable by other off-the-shelf
implementatios of murmur3) and even with Git binaries that were compiled
with different signedness of char. This bug affects both how Git writes
changed path filters to disk and how Git interprets changed path filters
on disk.

Therefore, introduce a new version (2) of changed path filters that
corrects this problem. The existing version (1) is still supported and
is still the default, but users should migrate away from it as soon
as possible.

Because this bug only manifests with characters that have the high bit
set, it may be possible that some (or all) commits in a given repo would
have the same changed path filter both before and after this fix is
applied. However, in order to determine whether this is the case, the
changed paths would first have to be computed, at which point it is not
much more expensive to just compute a new changed path filter.

So this patch does not include any mechanism to "salvage" changed path
filters from repositories. There is also no "mixed" mode - for each
invocation of Git, reading and writing changed path filters are done
with the same version number; this version number may be explicitly
stated (typically if the user knows which version they need) or
automatically determined from the version of the existing changed path
filters in the repository.

There is a change in write_commit_graph(). graph_read_bloom_data()
makes it possible for chunk_bloom_data to be non-NULL but
bloom_filter_settings to be NULL, which causes a segfault later on. I
produced such a segfault while developing this patch, but couldn't find
a way to reproduce it neither after this complete patch (or before),
but in any case it seemed like a good thing to include that might help
future patch authors.

The value in t0095 was obtained from another murmur3 implementation
using the following Go source code:

  package main

  import "fmt"
  import "github.com/spaolacci/murmur3"

  func main() {
          fmt.Printf("%x\n", murmur3.Sum32([]byte("Hello world!")))
          fmt.Printf("%x\n", murmur3.Sum32([]byte{0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}))
  }

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:06 -07:00
Taylor Blau 57982b8f2a t/helper/test-read-graph: implement `bloom-filters` mode
Implement a mode of the "read-graph" test helper to dump out the
hexadecimal contents of the Bloom filter(s) contained in a commit-graph.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-25 13:52:06 -07:00