The code in the reftable library has been cleaned up by discarding
unused "generic" interface.
* ps/reftable-drop-generic:
reftable: mark unused parameters in empty iterator functions
reftable/generic: drop interface
t/helper: refactor to not use `struct reftable_table`
t/helper: use `hash_to_hex_algop()` to print hashes
t/helper: inline printing of reftable records
t/helper: inline `reftable_table_print()`
t/helper: inline `reftable_stack_print_directory()`
t/helper: inline `reftable_reader_print_file()`
t/helper: inline `reftable_dump_main()`
reftable/dump: drop unused `compact_stack()`
reftable/generic: move generic iterator code into iterator interface
reftable/iter: drop double-checking logic
reftable/stack: open-code reading refs
reftable/merged: stop using generic tables in the merged table
reftable/merged: rename `reftable_new_merged_table()`
reftable/merged: expose functions to initialize iterators
'git for-each-ref' learned a new "--format" atom to find the branch
that the history leading to a given commit "%(is-base:<commit>)" is
likely based on.
* ds/for-each-ref-is-base:
p1500: add is-base performance tests
for-each-ref: add 'is-base' token
commit: add gentle reference lookup method
commit-reach: add get_branch_base_for_tip
Mark unused parameters as UNUSED to squelch -Wunused warnings.
* jk/mark-unused-parameters:
t-hashmap: stop calling setup() for t_intern() test
scalar: mark unused parameters in dummy function
daemon: mark unused parameters in non-posix fallbacks
setup: mark unused parameter in config callback
test-mergesort: mark unused parameters in trivial callback
t-hashmap: mark unused parameters in callback function
reftable: mark unused parameters in virtual functions
reftable: drop obsolete test function declarations
reftable: ignore unused argc/argv in test functions
unit-tests: ignore unused argc/argv
t/helper: mark more unused argv/argc arguments
oss-fuzz: mark unused argv/argc argument
refs: mark unused parameters in do_for_each_reflog_helper()
refs: mark unused parameters in ref_store fsck callbacks
update-ref: mark more unused parameters in parser callbacks
imap-send: mark unused parameter in ssl_socket_connect() fallback
* cp/unit-test-reftable-readwrite:
t-reftable-readwrite: add test for known error
t-reftable-readwrite: use 'for' in place of infinite 'while' loops
t-reftable-readwrite: use free_names() instead of a for loop
t: move reftable/readwrite_test.c to the unit testing framework
Use of API functions that implicitly depend on the_repository
object in the config subsystem has been rewritten to pass a
repository object through the callchain.
* ps/config-wo-the-repository:
config: hide functions using `the_repository` by default
global: prepare for hiding away repo-less config functions
config: don't depend on `the_repository` with branch conditions
config: don't have setters depend on `the_repository`
config: pass repo to functions that rename or copy sections
config: pass repo to `git_die_config()`
config: pass repo to `git_config_get_expiry_in_days()`
config: pass repo to `git_config_get_expiry()`
config: pass repo to `git_config_get_max_percent_split_change()`
config: pass repo to `git_config_get_split_index()`
config: pass repo to `git_config_get_index_threads()`
config: expose `repo_config_clear()`
config: introduce missing setters that take repo as parameter
path: hide functions using `the_repository` by default
path: stop relying on `the_repository` in `worktree_git_path()`
path: stop relying on `the_repository` when reporting garbage
hooks: remove implicit dependency on `the_repository`
editor: do not rely on `the_repository` for interactive edits
path: expose `do_git_common_path()` as `repo_common_pathv()`
path: expose `do_git_path()` as `repo_git_pathv()`
It was recently reported that concurrent reads and writes may cause the
reftable backend to segfault. The root cause of this is that we do not
properly keep track of reftable readers across reloads.
Suppose that you have a reftable iterator and then decide to reload the
stack while iterating through the iterator. When the stack has been
rewritten since we have created the iterator, then we would end up
discarding a subset of readers that may still be in use by the iterator.
The consequence is that we now try to reference deallocated memory,
which of course segfaults.
One way to trigger this is in t5616, where some background maintenance
jobs have been leaking from one test into another. This leads to stack
traces like the following one:
+ git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
AddressSanitizer:DEADLYSIGNAL
=================================================================
==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp
0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
==657994==The signal is caused by a READ memory access.
#0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
#1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
#2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
#3 0x55f23e54e72e in block_iter_next reftable/block.c:398
#4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
#5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
#6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
#7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
#8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
#9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
#10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
#11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
#12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
#13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
#14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
#15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
#16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
#17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
#18 0x55f23dba7764 in run_builtin git.c:484
#19 0x55f23dba7764 in handle_builtin git.c:741
#20 0x55f23dbab61e in run_argv git.c:805
#21 0x55f23dbab61e in cmd_main git.c:1000
#22 0x55f23dba4781 in main common-main.c:64
#23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
#25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)
While it is somewhat awkward that the maintenance processes survive
tests in the first place, it is totally expected that reftables should
work alright with concurrent writers. Seemingly they don't.
The only underlying resource that we need to care about in this context
is the reftable reader, which is responsible for reading a single table
from disk. These readers get discarded immediately (unless reused) when
calling `reftable_stack_reload()`, which is wrong. We can only close
them once we know that there are no iterators using them anymore.
Prepare for a fix by converting the reftable readers to be refcounted.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the `reftable_new_reader()` function to `reftable_reader_new()`
to match our coding guidelines.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several of the subcommands of `test-helper read-midx` do not close the
MIDX that they have opened, leading to memory leaks. Fix those.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `struct reftable_table` interface in our "reftable" test helper gets
used such that we can easily print either a single table, or a merged
stack. This generic interface is about to go away.
Prepare the code for this change by using merged tables instead. When
printing the stack we've already got one. When using a single table, we
can create a merged table from it to adapt.
This removes the last user of the generic interface.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "reftable" test helper uses a hand-crafted version to convert from a
raw hash to its hex variant. This was done because this code used to be
part of the reftable library, where we do not use most functions from
the Git core.
Now that the code is integrated into the "dump-reftable" helper though,
that limitation went away. Let's thus use `hash_to_hex_algop()` instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move printing of reftable records into the "dump-reftable" helper. This
follows the same reasoning as the preceding commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move `reftable_table_print()` into the "dump-reftable" helper. This
follows the same reasoning as the preceding commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move `reftable_stack_print_directory()` into the "dump-reftable" helper.
This follows the same reasoning as the preceding commit.
Note that this requires us to remove the tests for this functionality in
`reftable/stack_test.c`. The test does not really add much anyway,
because all it verifies is that we do not crash or run into an error,
and it specifically doesn't check the outputted data. Also, as the code
is now part of the test helper, it doesn't make much sense to have a
unit test for it in the first place.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move `reftable_reader_print_file()` into the "dump-reftable" helper.
This follows the same reasoning as the preceding commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The printing functionality part of `reftable/dump.c` is really only used
by our "dump-reftable" test helper. It is certainly not generic logic
that is useful to anybody outside of Git, and the format it generates is
quite specific. Still, parts of it are used in our test suite and the
output may be useful to take a peek into reftable stacks, tables and
blocks. So while it does not make sense to expose this as part of the
reftable library, it does make sense to keep it around.
Inline the `reftable_dump_main()` function into the "dump-reftable" test
helper. This clarifies that its format is subject to change and not part
of our public interface. Furthermore, this allows us to iterate on the
implementation in subsequent patches.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/block_test.c exercises the functions defined in
reftable/block.{c, h}. Migrate reftable/block_test.c to the unit
testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework and renaming the tests to follow the unit-tests'
naming conventions.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
helper/test-urlmatch-normalization along with
t0110-urlmatch-normalization test the `url_normalize()` function from
'urlmatch.h'. Migrate them to the unit testing framework for better
performance. And also add different test_msg()s for better debugging.
In the migration, last two of the checks from `t_url_general_escape()`
were slightly changed compared to the shell script. This involves
changing
'\'' -> '
'\!' -> !
in the urls of those checks. This is because in C strings, we don't
need to escape "'" and "!". Other than these two, all the urls were
pasted verbatim from the shell script.
Another change is the removal of a MINGW prerequisite from one of the
test. It was there because[1] on Windows, the command line is a
Unicode string, it is not possible to pass arbitrary bytes to a
program. But in unit tests we don't have this limitation.
And since we can construct strings with arbitrary bytes in C, let's
also remove the test files which contain URLs with arbitrary bytes in
the 't/t0110' directory and instead embed those URLs in the unit test
code itself.
[1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unit-test framework has learned a simple control structure to allow
embedding test statements in-line instead of having to create a new
function to contain them.
* rs/unit-tests-test-run:
t-strvec: use if_test
t-reftable-basics: use if_test
t-ctype: use if_test
unit-tests: add if_test
unit-tests: show location of checks outside of tests
t0080: use here-doc test body
The mode_copy() function does nothing, but since it's used as a function
pointer within "struct mode", it has to conform to the interface. Mark
it to quiet -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a continuation of 126e3b3d2a (t/helper: mark unused argv/argc
arguments, 2023-03-28) to cover a few new cases:
- test-example-tap was added since that commit
- test-hashmap used to accept the "ignorecase" argument on the command
line. But since most of its logic was moved to a unit-test in
3469a23659 (t: port helper/test-hashmap.c to unit-tests/t-hashmap.c,
2024-08-03), it now ignores its argv entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The refs API has been taught to give symref target information to
the users of ref iterators, allowing for-each-ref and friends to
avoid an extra ref_resolve_* API call per a symbolic ref.
* jc/refs-symref-referent:
ref-filter: populate symref from iterator
refs: add referent to each_ref_fn
refs: keep track of unresolved reference value in iterators
An existing test of hashmap API has been rewritten with the
unit-test framework.
* gt/unit-test-hashmap:
t: port helper/test-hashmap.c to unit-tests/t-hashmap.c
A test in reftable library has been rewritten using the unit test
framework.
* cp/unit-test-reftable-tree:
t-reftable-tree: improve the test for infix_walk()
t-reftable-tree: add test for non-existent key
t-reftable-tree: split test_tree() into two sub-test functions
t: move reftable/tree_test.c to the unit testing framework
reftable: remove unnecessary curly braces in reftable/tree.c
A flakey test and incorrect calls to strtoX() functions have been
fixed.
* kl/test-fixes:
t6421: fix test to work when repo dir contains d0
set errno=0 before strtoX calls
The tests for "pq" part of reftable library got rewritten to use
the unit test framework.
* cp/unit-test-reftable-pq:
t-reftable-pq: add tests for merged_iter_pqueue_top()
t-reftable-pq: add test for index based comparison
t-reftable-pq: make merged_iter_pqueue_check() callable by reference
t-reftable-pq: make merged_iter_pqueue_check() static
t: move reftable/pq_test.c to the unit testing framework
reftable: change the type of array indices to 'size_t' in reftable/pq.c
reftable: remove unnecessary curly braces in reftable/pq.c
Add a new reachability algorithm that intends to discover (from a heuristic)
which branch was used as the starting point for a given commit. Add focused
tests using the 'test-tool reach' command.
In repositories that use pull requests (or merge requests) to advance one or
more "protected" branches, the history of that reference can be recovered by
following the first-parent history in most cases. Most are completed using
no-fast-forward merges, though squash merges are quite common. Less common
is rebase-and-merge, which still validates this assumption. Finally, the
case that breaks this assumption is the fast-forward update (with potential
rebasing). Even in this case, the previous commit commonly appears in the
first-parent history of the branch.
Similar assumptions can be made for a topic branch created by a single user
with the intention to merge back into another branch. Using 'git commit',
'git merge', and 'git cherry-pick' from HEAD will default to having the
first-parent commit be the previous commit at HEAD. This history changes
only with commands such as 'git reset' or 'git rebase', where the command
names also imply that the branch is starting from a new location.
With this movement of branches in mind, the following heuristic is proposed
as a way to determine the base branch for a given source branch:
Among a list of candidate base branches, select the candidate that
minimizes the number of commits in the first-parent history of the source
that are not in the first-parent history of the candidate.
Prior third-party solutions to this problem have used this optimization
criteria, but have relied upon extracting the first-parent history and
comparing those lists as tables instead of using commit-graph walks.
Given current command-line interface options, this optimization criteria is
not easy to detect directly. Even using the command
git rev-list --count --first-parent <base>..<source>
does not measure this count, as it uses full reachability from <base> to
determine which commits to remove from the range '<base>..<source>'. This
may lead to one asking if we should instead be using the full reachability
of the candidate and only the first-parent history of the source. This,
unfortunately, does not work for repositories that use long-lived branches
and automation to merge across those branches.
In extremely large repositories, merging into a single trunk may not be
feasible. This is usually due to the desired frequency of updates
(thousands of engineers doing daily work) combined with the time required to
perform a validation build. These factors combine to create significant
risk of semantic merge conflicts, leading to build breaks on the trunk. In
response, repository maintainers can create a single Level Zero (L0) trunk
and multiple Level One (L1) branches. By partitioning the engineers by
organization, these engineers may see lower risk of semantic merge conflicts
as well as be protected against build breaks in other L1 branches. The key
to making this system work is a semi-automated process of merging L1
branches into the L0 trunk and vice-versa. In a large enough organization,
these L1 branches may further split into L2 or L3 branches, but the same
principles apply for merging across deeper levels.
If these automated merges use a typical merge with the second parent
bringing in the "new" content, then each L0 and L1 branch can track its
previous positions by following first-parent history, which appear as
parallel paths (until reaching the first place where the branches diverged).
If we also walk to second parents, then the histories overlap significantly
and cannot be distinguished except for very-recent changes.
For this reason, the first-parent condition should be symmetrical across the
base and source branches.
Another common case for desiring the result of this optimization method is
the use of release branches. When releasing a version of a repository, a
branch can be used to track that release. Any updates that are worth fixing
in that release can be merged to the release branch and shipped with only
the necessary fixes without any new features introduced in the trunk branch.
The 'maint-2.<X>' branches represent this pattern in the Git project. The
microsoft/git fork uses 'vfs-2.<X>.<Y>' branches to track the changes that
are custom to that fork on top of each upstream Git release 2.<X>.<Y>. This
application doesn't need the symmetrical first-parent condition, but the use
of first-parent histories does not change the results for these branches.
To determine the base branch from a list of candidates, create a new method
in commit-reach.c that performs a single* commit-graph walk. The core
concept is to walk first-parents starting at the candidate bases and the
source, tracking the "best" base to reach a given commit. Use generation
numbers to ensure that a commit is walked at most once and all children have
been explored before visiting it. When reaching a commit that is reachable
from both a base and the source, we will then have a guarantee that this is
the closest intersection of first-parent histories. Track the best base to
reach that commit and return it as a result. In rare cases involving
multiple root commits, the first-parent history of the source may never
intersect any of the candidates and thus a null result is returned.
* There are up to two walks, since we require all commits to have a computed
generation number in order to avoid incorrect results. This is similar to
the need for computed generation numbers in ahead_behind() as implemented
in fd67d149bd (commit-reach: implement ahead_behind() logic, 2023-03-20).
In order to track the "best" base, use a new commit slab that stores an
integer. This value defaults to zero upon initialization, so use -1 to
track that the source commit can reach this commit and use 'i + 1' to track
that the ith base can reach this commit. When multiple bases can reach a
commit, minimize the index to break ties. This allows the caller to specify
an order to the bases that determines some amount of preference when the
heuristic does not result in a unique result.
The trickiest part of the integer slab is what happens when reaching a
collision among the histories of the bases and the history of the source.
This is noticed when viewing the first parent and seeing that it has a slab
value that differs in sign (negative or positive). In this case, the
collision commit is stored in the method variable 'branch_point' and its
slab value is set to -1. The index of the best base (so far) is stored in
the method variable 'best_index'. It is possible that there are multiple
commits that have the branch_point as its first parent, leading to multiple
updates of best_index. The result is determined when 'branch_point' is
visited in the commit walk, giving the guarantee that all commits that could
reach 'branch_point' were visited.
Several interesting cases of collisions and different results are tested in
the t6600-test-reach.sh script. Recall that this script also tests the
algorithm in three possible states involving the commit-graph file and how
many commits are written in the file. This provides some coverage of the
need (and lack of need) for the ensure_generations_valid() method.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/readwrite_test.c exercises the functions defined in
reftable/reader.{c,h} and reftable/writer.{c,h}. Migrate
reftable/readwrite_test.c to the unit testing framework. Migration
involves refactoring the tests to use the unit testing framework
instead of reftable's test framework and renaming the tests to
align with unit-tests' naming conventions.
Since some tests in reftable/readwrite_test.c use the functions
set_test_hash(), noop_flush() and strbuf_add_void() defined in
reftable/test_framework.{c,h} but these files are not #included
in the ported unit test, copy these functions in the new test file.
While at it, ensure structs are 0-initialized with '= { 0 }'
instead of '= { NULL }'.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're about to hide config functions that implicitly depend on
`the_repository` behind the `USE_THE_REPOSITORY_VARIABLE` macro. This
will uncover a bunch of dependents that transitively relied on the
global variable, but didn't define the macro yet.
Adapt them such that we define the macro to prepare for this change.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a parameter to each_ref_fn so that callers to the ref APIs
that use this function as a callback can have acess to the
unresolved value of a symbolic ref.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the MIDX machinery's internals have been taught to understand
incremental MIDXs over the previous handful of commits, the MIDX
machinery itself can begin reading incremental MIDXs.
(Note that while the on-disk format for incremental MIDXs has been
defined, the writing end has not been implemented. This will take place
in the commit after next.)
The core of this change involves following the order specified in the
MIDX chain in reverse and opening up MIDXs in the chain one-by-one,
adding them to the previous layer's `->base_midx` pointer at each step.
In order to implement this, the `load_multi_pack_index()` function is
taught to call a new `load_multi_pack_index_chain()` function if loading
a non-incremental MIDX failed via `load_multi_pack_index_one()`.
When loading a MIDX chain, `load_midx_chain_fd_st()` reads each line in
the file one-by-one and dispatches calls to
`load_multi_pack_index_one()` to read each layer of the MIDX chain. When
a layer was successfully read, it is added to the MIDX chain by calling
`add_midx_to_chain()` which validates the contents of the `BASE` chunk,
performs some bounds checks on the number of combined packs and objects,
and attaches the new MIDX by assigning its `base_midx` pointer to the
existing part of the chain.
As a supplement to this, introduce a new mode in the test-read-midx
test-tool which allows us to read the information for a specific MIDX in
the chain by specifying its trailing checksum via the command-line
arguments like so:
$ test-tool read-midx .git/objects [checksum]
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
helper/test-hashmap.c along with t0011-hashmap.sh test the hashmap.h
library. Migrate them to the unit testing framework for better
debugging, runtime performance and concise code.
Along with the migration, make 'add' tests from the shell script order
agnostic in unit tests, since they iterate over entries with the same
keys and we do not guarantee the order. This was already done for the
'iterate' tests[1].
The helper/test-hashmap.c is still not removed because it contains a
performance test meant to be run by the user directly (not used in
t/perf). And it makes sense for such a utility to be a helper.
[1]: e1e7a77141 (t: sort output of hashmap iteration, 2019-07-30)
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Josh Steadmon <steadmon@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To detect conversion failure after calls to functions like `strtod`, one
can check `errno == ERANGE`. These functions are not guaranteed to set
`errno` to `0` on successful conversion, however. Manual manipulation of
`errno` can likely be avoided by checking that the output pointer
differs from the input pointer, but that's not how other locations, such
as parse.c:139, handle this issue; they set errno to 0 prior to
executing the function.
For every place I could find a strtoX function with an ERANGE check
following it, set `errno = 0;` prior to executing the conversion
function.
Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/tree_test.c exercises the functions defined in
reftable/tree.{c, h}. Migrate reftable/tree_test.c to the unit
testing framework. Migration involves refactoring the tests to use
the unit testing framework instead of reftable's test framework and
renaming the tests to align with unit-tests' standards.
Also add a comment to help understand the test routine.
Note that this commit mostly moves the test from reftable/ to
t/unit-tests/ and most of the refactoring is performed by the
trailing commits.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
reftable/pq_test.c exercises a priority queue defined by
reftable/pq.{c, h}. Migrate reftable/pq_test.c to the unit testing
framework. Migration involves refactoring the tests to use the unit
testing framework instead of reftable's test framework, and
renaming the tests to align with unit-tests' standards.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test-repository test helper zeroes out `the_repository` such that it
can be sure that our codebase only ends up using the supplied repository
that we initialize in the respective helper functions. This does cause
memory leaks though as the data that `the_repository` has been holding
onto is not referenced anymore.
Fix this by calling `repo_clear()` instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Another reftable test has been ported to use the unit test framework.
* cp/unit-test-reftable-merged:
t-reftable-merged: add test for REFTABLE_FORMAT_ERROR
t-reftable-merged: use reftable_ref_record_equal to compare ref records
t-reftable-merged: add tests for reftable_merged_table_max_update_index
t-reftable-merged: improve the const-correctness of helper functions
t-reftable-merged: improve the test t_merged_single_record()
t: harmonize t-reftable-merged.c with coding guidelines
t: move reftable/merged_test.c to the unit testing framework
The macro TEST only allows defining a test that consists of a single
expression. Add a new macro, if_test, which provides a way to define
unit tests that are made up of one or more statements.
if_test allows defining self-contained tests en bloc, a bit like
test_expect_success does for regular tests. It acts like a conditional;
the test body is executed if test_skip_all() had not been called before.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Checks outside of tests are caught at runtime and reported like this:
Assertion failed: (ctx.running), function test_assert, file test-lib.c, line 267.
The assert() call aborts the unit test and doesn't reveal the location
or even the type of the offending check, as test_assert() is called by
all of them.
Handle it like the opposite case, a test without any checks: Don't
abort, but report the location of the actual check, along with a message
explaining the situation. The output for example above becomes:
# BUG: check outside of test at t/helper/test-example-tap.c:75
... and the unit test program continues and indicates the error in its
exit code at the end.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A test in reftable library has been rewritten using the unit test
framework.
* cp/unit-test-reftable-record:
t-reftable-record: add tests for reftable_log_record_compare_key()
t-reftable-record: add tests for reftable_ref_record_compare_name()
t-reftable-record: add index tests for reftable_record_is_deletion()
t-reftable-record: add obj tests for reftable_record_is_deletion()
t-reftable-record: add log tests for reftable_record_is_deletion()
t-reftable-record: add ref tests for reftable_record_is_deletion()
t-reftable-record: add comparison tests for obj records
t-reftable-record: add comparison tests for index records
t-reftable-record: add comparison tests for ref records
t-reftable-record: add reftable_record_cmp() tests for log records
t: move reftable/record_test.c to the unit testing framework
An existing test of oidmap API has been rewritten with the
unit-test framework.
* gt/unit-test-oidmap:
t: migrate helper/test-oidmap.c to unit-tests/t-oidmap.c
reftable/merged_test.c exercises the functions defined in
reftable/merged.{c, h}. Migrate reftable/merged_test.c to the unit
testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework and renaming the tests according to unit-tests' naming
conventions.
Also, move strbuf_add_void() and noop_flush() from
reftable/test_framework.c to the ported test. This is because
both these functions are used in the merged tests and
reftable/test_framework.{c, h} is not #included in the ported test.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Bloom filter used for path limited history traversal was broken
on systems whose "char" is unsigned; update the implementation and
bump the format version to 2.
* tb/path-filter-fix:
bloom: introduce `deinit_bloom_filters()`
commit-graph: reuse existing Bloom filters where possible
object.h: fix mis-aligned flag bits table
commit-graph: new Bloom filter version that fixes murmur3
commit-graph: unconditionally load Bloom filters
bloom: prepare to discard incompatible Bloom filters
bloom: annotate filters with hash version
repo-settings: introduce commitgraph.changedPathsVersion
t4216: test changed path filters with high bit paths
t/helper/test-read-graph: implement `bloom-filters` mode
bloom.h: make `load_bloom_filter_from_graph()` public
t/helper/test-read-graph.c: extract `dump_graph_info()`
gitformat-commit-graph: describe version 2 of BDAT
commit-graph: ensure Bloom filters are read with consistent settings
revision.c: consult Bloom filters for root commits
t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
helper/test-oidmap.c along with t0016-oidmap.sh test the oidmap.h
library which is built on top of hashmap.h.
Migrate them to the unit testing framework for better performance,
concise code and better debugging. Along with the migration also plug
memory leaks and make the test logic independent for all the tests.
The migration removes 'put' tests from t0016, because it is used as
setup to all the other tests, so testing it separately does not yield
any benefit.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Memory ownership rules for the in-core representation of
remote.*.url configuration values have been straightened out, which
resulted in a few leak fixes and code clarification.
* jk/remote-wo-url:
remote: drop checks for zero-url case
remote: always require at least one url in a remote
t5801: test remote.*.vcs config
t5801: make remote-testgit GIT_DIR setup more robust
remote: allow resetting url list
config: document remote.*.url/pushurl interaction
remote: simplify url/pushurl selection
remote: use strvecs to store remote url/pushurl
remote: transfer ownership of memory in add_url(), etc
remote: refactor alias_url() memory ownership
archive: fix check for missing url
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help
transition the codebase to rely less on the availability of the
singleton the_repository instance.
* ps/use-the-repository:
hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE`
t/helper: remove dependency on `the_repository` in "proc-receive"
t/helper: fix segfault in "oid-array" command without repository
t/helper: use correct object hash in partial-clone helper
compat/fsmonitor: fix socket path in networked SHA256 repos
replace-object: use hash algorithm from passed-in repository
protocol-caps: use hash algorithm from passed-in repository
oidset: pass hash algorithm when parsing file
http-fetch: don't crash when parsing packfile without a repo
hash-ll: merge with "hash.h"
refs: avoid include cycle with "repository.h"
global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
hash: require hash algorithm in `empty_tree_oid_hex()`
hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
hash: make `is_null_oid()` independent of `the_repository`
hash: convert `oidcmp()` and `oideq()` to compare whole hash
global: ensure that object IDs are always padded
hash: require hash algorithm in `oidread()` and `oidclr()`
hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
reftable/record_test.c exercises the functions defined in
reftable/record.{c, h}. Migrate reftable/record_test.c to the
unit testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework, and renaming the tests to fit unit-tests' naming scheme.
While at it, change the type of index variable 'i' to 'size_t'
from 'int'. This is because 'i' is used in comparison against
'ARRAY_SIZE(x)' which is of type 'size_t'.
Also, use set_hash() which is defined locally in the test file
instead of set_test_hash() which is defined by
reftable/test_framework.{c, h}. This is fine to do as both these
functions are similarly implemented, and
reftable/test_framework.{c, h} is not #included in the ported test.
Get rid of reftable_record_print() from the tests as well, because
it clutters the test framework's output and we have no way of
verifying the output.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The murmur3 implementation in bloom.c has a bug when converting series
of 4 bytes into network-order integers when char is signed (which is
controllable by a compiler option, and the default signedness of char is
platform-specific). When a string contains characters with the high bit
set, this bug causes results that, although internally consistent within
Git, does not accord with other implementations of murmur3 (thus,
the changed path filters wouldn't be readable by other off-the-shelf
implementatios of murmur3) and even with Git binaries that were compiled
with different signedness of char. This bug affects both how Git writes
changed path filters to disk and how Git interprets changed path filters
on disk.
Therefore, introduce a new version (2) of changed path filters that
corrects this problem. The existing version (1) is still supported and
is still the default, but users should migrate away from it as soon
as possible.
Because this bug only manifests with characters that have the high bit
set, it may be possible that some (or all) commits in a given repo would
have the same changed path filter both before and after this fix is
applied. However, in order to determine whether this is the case, the
changed paths would first have to be computed, at which point it is not
much more expensive to just compute a new changed path filter.
So this patch does not include any mechanism to "salvage" changed path
filters from repositories. There is also no "mixed" mode - for each
invocation of Git, reading and writing changed path filters are done
with the same version number; this version number may be explicitly
stated (typically if the user knows which version they need) or
automatically determined from the version of the existing changed path
filters in the repository.
There is a change in write_commit_graph(). graph_read_bloom_data()
makes it possible for chunk_bloom_data to be non-NULL but
bloom_filter_settings to be NULL, which causes a segfault later on. I
produced such a segfault while developing this patch, but couldn't find
a way to reproduce it neither after this complete patch (or before),
but in any case it seemed like a good thing to include that might help
future patch authors.
The value in t0095 was obtained from another murmur3 implementation
using the following Go source code:
package main
import "fmt"
import "github.com/spaolacci/murmur3"
func main() {
fmt.Printf("%x\n", murmur3.Sum32([]byte("Hello world!")))
fmt.Printf("%x\n", murmur3.Sum32([]byte{0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}))
}
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement a mode of the "read-graph" test helper to dump out the
hexadecimal contents of the Bloom filter(s) contained in a commit-graph.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare for the 'read-graph' test helper to perform other tasks besides
dumping high-level information about the commit-graph by extracting its
main routine into a separate function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pseudo-merge reachability bitmap to help more efficient storage
of the reachability bitmap in a repository with too many refs has
been added.
* tb/pseudo-merge-reachability-bitmap: (26 commits)
pack-bitmap.c: ensure pseudo-merge offset reads are bounded
Documentation/technical/bitmap-format.txt: add missing position table
t/perf: implement performance tests for pseudo-merge bitmaps
pseudo-merge: implement support for finding existing merges
ewah: `bitmap_equals_ewah()`
pack-bitmap: extra trace2 information
pack-bitmap.c: use pseudo-merges during traversal
t/test-lib-functions.sh: support `--notick` in `test_commit_bulk()`
pack-bitmap: implement test helpers for pseudo-merge
ewah: implement `ewah_bitmap_popcount()`
pseudo-merge: implement support for reading pseudo-merge commits
pack-bitmap.c: read pseudo-merge extension
pseudo-merge: scaffolding for reads
pack-bitmap: extract `read_bitmap()` function
pack-bitmap-write.c: write pseudo-merge table
pseudo-merge: implement support for selecting pseudo-merge commits
config: introduce `git_config_double()`
pack-bitmap: make `bitmap_writer_push_bitmapped_commit()` public
pack-bitmap: implement `bitmap_writer_has_bitmapped_object_id()`
pack-bitmap-write: support storing pseudo-merge commits
...
Building with "-Werror -Wwrite-strings" is now supported.
* ps/no-writable-strings: (27 commits)
config.mak.dev: enable `-Wwrite-strings` warning
builtin/merge: always store allocated strings in `pull_twohead`
builtin/rebase: always store allocated string in `options.strategy`
builtin/rebase: do not assign default backend to non-constant field
imap-send: fix leaking memory in `imap_server_conf`
imap-send: drop global `imap_server_conf` variable
mailmap: always store allocated strings in mailmap blob
revision: always store allocated strings in output encoding
remote-curl: avoid assigning string constant to non-const variable
send-pack: always allocate receive status
parse-options: cast long name for OPTION_ALIAS
http: do not assign string constant to non-const field
compat/win32: fix const-correctness with string constants
pretty: add casts for decoration option pointers
object-file: make `buf` parameter of `index_mem()` a constant
object-file: mark cached object buffers as const
ident: add casts for fallback name and GECOS
entry: refactor how we remove items for delayed checkouts
line-log: always allocate the output prefix
line-log: stop assigning string constant to file parent buffer
...
A new command has been added to migrate a repository that uses the
files backend for its ref storage to use the reftable backend, with
limitations.
* ps/ref-storage-migration:
builtin/refs: new command to migrate ref storage formats
refs: implement logic to migrate between ref storage formats
refs: implement removal of ref storages
worktree: don't store main worktree twice
reftable: inline `merged_table_release()`
refs/files: fix NULL pointer deref when releasing ref store
refs/files: extract function to iterate through root refs
refs/files: refactor `add_pseudoref_and_head_entries()`
refs: allow to skip creation of reflog entries
refs: pass storage format to `ref_store_init()` explicitly
refs: convert ref storage format to an enum
setup: unset ref storage when reinitializing repository version
The "proc-receive" test helper implicitly relies on `the_repository` via
`parse_oid_hex()`. This isn't necessary though, and in fact the whole
command does not depend on `the_repository` at all.
Stop setting up `the_repository` and use `parse_oid_hex_any()` to parse
object IDs.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "oid-array" test helper can supposedly work without a Git
repository, but will in fact crash because `the_repository->hash_algo`
is not initialized. This is because `oid_pos()`, which is used by
`oid_array_lookup()`, depends on `the_hash_algo->rawsz`.
Ideally, we'd adapt `oid_pos()` to not depend on `the_hash_algo`
anymore. That is a bigger untertaking though, so instead we fall back to
SHA1 when there is no repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `object_info()` function of the partial-clone helper is responsible
for checking the object ID of a repository other than `the_repository`.
We use `parse_oid_hex()` in this function though, which means that we
still depend on `the_repository->hash_algo`.
Fix this by using the object hash of the function-local repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "hash-ll.h" header was introduced via d1cbe1e6d8 (hash-ll.h: split
out of hash.h to remove dependency on repository.h, 2023-04-22) to make
explicit the split between hash-related functions that rely on the
global `the_repository`, and those that don't. This split is no longer
necessary now that we we have removed the reliance on `the_repository`.
Merge "hash-ll.h" back into "hash.h". This causes some code units to not
include "repository.h" anymore, which requires us to add some forward
declarations.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.
It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.
Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit
For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).
Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both `oidread()` and `oidclr()` use `the_repository` to derive the hash
function that shall be used. Require callers to pass in the hash
algorithm to get rid of this implicit dependency.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the previous commit removed the possibility that a "struct
remote" will ever have zero url fields, we can drop a number of
redundant checks and untriggerable code paths.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the url/pushurl fields of "struct remote" own their strings, we
can switch from bare arrays to strvecs. This has a few advantages:
- push/clear are now one-liners
- likewise the free+assigns in alias_all_urls() can use
strvec_replace()
- we now use size_t for storage, avoiding possible overflow
- this will enable some further cleanups in future patches
There's quite a bit of fallout in the code that reads these fields, as
it tends to access these arrays directly. But it's mostly a mechanical
replacement of "url_nr" with "url.nr", and "url[i]" with "url.v[i]",
with a few variations (e.g. "*url" could become "*url.v", but I used
"url.v[0]" for consistency).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ps/ref-storage-migration:
builtin/refs: new command to migrate ref storage formats
refs: implement logic to migrate between ref storage formats
refs: implement removal of ref storages
worktree: don't store main worktree twice
reftable: inline `merged_table_release()`
refs/files: fix NULL pointer deref when releasing ref store
refs/files: extract function to iterate through root refs
refs/files: refactor `add_pseudoref_and_head_entries()`
refs: allow to skip creation of reflog entries
refs: pass storage format to `ref_store_init()` explicitly
refs: convert ref storage format to an enum
setup: unset ref storage when reinitializing repository version
A test helper that essentially is unit tests on the "decorate"
logic has been rewritten using the unit-tests framework.
* gt/decorate-unit-test:
t/: migrate helper/test-example-decorate to the unit testing framework
Basic unit tests for reftable have been reimplemented under the
unit test framework.
* cp/reftable-unit-test:
t: improve the test-case for parse_names()
t: add test for put_be16()
t: move tests from reftable/record_test.c to the new unit test
t: move tests from reftable/stack_test.c to the new unit test
t: move reftable/basics_test.c to the unit testing framework
helper/test-oidtree.c along with t0069-oidtree.sh test the oidtree.h
library, which is a wrapper around crit-bit tree. Migrate them to
the unit testing framework for better debugging and runtime
performance. Along with the migration, add an extra check for
oidtree_each() test, which showcases how multiple expected matches can
be given to check_each() helper.
To achieve this, introduce a new library called 'lib-oid.h'
exclusively for the unit tests to use. It currently mainly includes
utility to generate object_id from an arbitrary hex string
(i.e. '12a' -> '12a0000000000000000000000000000000000000'). This also
handles the hash algo selection based on GIT_TEST_DEFAULT_HASH.
This library will also be helpful when we port other unit tests such
as oid-array, oidset etc.
Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
[jc: small fixlets squashed in]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `OPT_FILENAME()` option will, if set, put an allocated string into
the user-provided variable. Consequently, that variable thus needs to be
free'd by the caller of `parse_options()`. Some callsites don't though
and thus leak memory. Fix those.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're about to enable `-Wwrite-strings`, which changes the type of
string constants to `const char[]`. Fix various sites where we assign
such constants to non-const variables.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ref backends do not have any way to disable the creation of reflog
entries. This will be required for upcoming ref format migration logic
so that we do not create any entries that didn't exist in the original
ref database.
Provide a new `REF_SKIP_CREATE_REFLOG` flag that allows the caller to
disable reflog entry creation.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust jc/fix-2.45.1-and-friends-for-2.39 for more recent
maintenance track.
* jc/fix-2.45.1-and-friends-for-maint:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
The strcmp-offset tests have been rewritten using the unit test
framework.
* gt/unit-test-strcmp-offset:
t/: port helper/test-strcmp-offset.c to unit-tests/t-strcmp-offset.c
Further clean-up the refs subsystem to stop relying on
the_repository, and instead use the repository associated to the
ref_store object.
* ps/refs-without-the-repository-updates:
refs/packed: remove references to `the_hash_algo`
refs/files: remove references to `the_hash_algo`
refs/files: use correct repository
refs: remove `dwim_log()`
refs: drop `git_default_branch_name()`
refs: pass repo when peeling objects
refs: move object peeling into "object.c"
refs: pass ref store when detecting dangling symrefs
refs: convert iteration over replace refs to accept ref store
refs: retrieve worktree ref stores via associated repository
refs: refactor `resolve_gitlink_ref()` to accept a repository
refs: pass repo when retrieving submodule ref store
refs: track ref stores via strmap
refs: implement releasing ref storages
refs: rename `init_db` callback to avoid confusion
refs: adjust names for `init` and `init_db` callbacks
reftable/basics_test.c exercise the functions defined in
reftable/basics.{c, h}. Migrate reftable/basics_test.c to the
unit testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
helper/test-example-decorate.c along with t9004-example.sh provide
an example of how to use the functions in decorate.h (which provides
a data structure that associates Git objects to void pointers) and
also test their output.
Migrate them to the new unit testing framework for better debugging
and runtime performance.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* fixes/2.45.1/2.44:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
* fixes/2.45.1/2.43:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
* fixes/2.45.1/2.42:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
* fixes/2.45.1/2.41:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
* fixes/2.45.1/2.40:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Revert overly aggressive "layered defence" that went into 2.45.1
and friends, which broke "git-lfs", "git-annex", and other use
cases, so that we can rebuild necessary counterparts in the open.
* jc/fix-2.45.1-and-friends-for-2.39:
Revert "fsck: warn about symlink pointing inside a gitdir"
Revert "Add a helper function to compare file contents"
clone: drop the protections where hooks aren't run
tests: verify that `clone -c core.hooksPath=/dev/null` works again
Revert "core.hooksPath: add some protection while cloning"
init: use the correct path of the templates directory again
hook: plug a new memory leak
ci: stop installing "gcc-13" for osx-gcc
ci: avoid bare "gcc" for osx-gcc job
ci: drop mention of BREW_INSTALL_PACKAGES variable
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Implement three new sub-commands for the "bitmap" test-helper:
- t/helper test-tool bitmap dump-pseudo-merges
- t/helper test-tool bitmap dump-pseudo-merge-commits <n>
- t/helper test-tool bitmap dump-pseudo-merge-objects <n>
These three helpers dump the list of pseudo merges, the "parents" of the
nth pseudo-merges, and the set of objects reachable from those parents,
respectively.
These helpers will be useful in subsequent patches when we add test
coverage for pseudo-merge bitmaps.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that during a `git clone`, the hooks' contents are no longer
compared to the templates' files', the caller for which the
`do_files_match()` function was introduced is gone, and therefore this
function can be retired, too.
This reverts commit 584de0b4c2 (Add a helper function to compare file
contents, 2024-03-30).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the recent codebase update (8bf6fbd (Merge branch
'js/doc-unit-tests', 2023-12-09)), a new unit testing framework was
merged, providing a standardized approach for testing C code. Prior to
this update, some unit tests relied on the test helper mechanism,
lacking a dedicated unit testing framework. It's more natural to perform
these unit tests using the new unit test framework.
Let's migrate the unit tests for strcmp-offset functionality from the
legacy approach using the test-tool command `test-tool strcmp-offset` in
helper/test-strcmp-offset.c to the new unit testing framework
(t/unit-tests/test-lib.h).
The migration involves refactoring the tests to utilize the testing
macros provided by the framework (TEST() and check_*()).
Helped-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Co-authored-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Updates to symbolic refs can now be made as a part of ref
transaction.
* kn/ref-transaction-symref:
refs: remove `create_symref` and associated dead code
refs: rename `refs_create_symref()` to `refs_update_symref()`
refs: use transaction in `refs_create_symref()`
refs: add support for transactional symref updates
refs: move `original_update_refname` to 'refs.c'
refs: support symrefs in 'reference-transaction' hook
files-backend: extract out `create_symref_lock()`
refs: accept symref values in `ref_transaction_update()`
Both `peel_object()` and `peel_iterated_oid()` implicitly rely on
`the_repository` to look up objects. Despite the fact that we want to
get rid of `the_repository`, it also leads to some restrictions in our
ref iterators when trying to retrieve the peeled value for a repository
other than `the_repository`.
Refactor these functions such that both take a repository as argument
and remove the now-unnecessary restrictions.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Looking up submodule ref stores has two deficiencies:
- The initialized subrepo will be attributed to `the_repository`.
- The submodule ref store will be tracked in a global map.
This makes it impossible to have submodule ref stores for a repository
other than `the_repository`.
Modify the function to accept the parent repository as parameter and
move the global map into `struct repository`. Like this it becomes
possible to look up submodule ref stores for arbitrary repositories.
Note that this also adds a new reference to `the_repository` in
`resolve_gitlink_ref()`, which is part of the refs interfaces. This will
get adjusted in the next patch.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "test-tool" has been taught to run testsuite tests in parallel,
bypassing the need to use the "prove" tool.
* js/unit-test-suite-runner:
cmake: let `test-tool` run the unit tests, too
ci: use test-tool as unit test runner on Windows
t/Makefile: run unit tests alongside shell tests
unit tests: add rule for running with test-tool
test-tool run-command testsuite: support unit tests
test-tool run-command testsuite: remove hardcoded filter
test-tool run-command testsuite: get shell from env
t0080: turn t-basic unit test into a helper
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmYxBJ0ACgkQsLXohpav
5suE6A//RTmt/rsMCDvpHEYSvox0ln5oMWyXrqKiHLxesMc0uLWRHAUDrHGCg7JP
OoZkf1cV2yOcD4lhO4YrlcHR3n1xdAyGrhc5vyLI4DFAAxdOLl4VDHRazXm51u+p
8GLxQY/1xu9bvde1PDYL2qtjDMskMgqb2Rfvv6ULpfICJrioy+CO5wud7BYIX4qB
oFZQnFLrQnSW9XT3r2+hKJKP4cHXQX5tYY0mkiy3bjbscNGyjdrkqMjJ2QEIWqhj
SUCujS5Clx6WKr0uLxoKs1IemdV0lkg2IbsxMZ5yYxLH2P9O7jQHvjgOx5NgfRlu
NtYMWsrkYhylWUxLiTFgLbJ8DE6sjN+emYOqCDRlr7XPvsvVX6eucX9YRxS4C/XP
izoOhAHJOFRaI/nMuG7iOOmnobKJKy0PbVFgA4W8MtNKZ+4taKF24aSK3TZpArhX
Z3gMQwSWoO6KVPJ7+Et2x/WV5BmVAbpMMufX2ErwOhMDMO9jlvYy0q2OeCaiMg1c
xZGGxC441IsYPVwSrJFU/U+Pl190PEazgmclkaqdothbjeMPb/gBV4j46Rznjld4
68n3h1rW2S5AQbMKie+/Yygi0O087VAvTMsYPxDKsDmbeUHvCEd148dKgdeU59ct
IXkrf2UW7dUWwZv2lv8NMdLue2M5bB9Yeufg3GJkfOaTy+1S5TM=
=g/43
-----END PGP SIGNATURE-----
Sync with Git 2.45.1
* tag 'v2.45.1': (42 commits)
Git 2.45.1
Git 2.44.1
Git 2.43.4
Git 2.42.2
Git 2.41.1
Git 2.40.2
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
...
The singleton index_state instance "the_index" has been eliminated
by always instantiating "the_repository" and replacing references
to "the_index" with references to its .index member.
* ps/the-index-is-no-more:
repository: drop `initialize_the_repository()`
repository: drop `the_index` variable
builtin/clone: stop using `the_index`
repository: initialize index in `repo_init()`
builtin: stop using `the_index`
t/helper: stop using `the_index`
Code to write out reftable has seen some optimization and
simplification.
* ps/reftable-write-optim:
reftable/block: reuse compressed array
reftable/block: reuse zstream when writing log blocks
reftable/writer: reset `last_key` instead of releasing it
reftable/writer: unify releasing memory
reftable/writer: refactorings for `writer_flush_nonempty_block()`
reftable/writer: refactorings for `writer_add_record()`
refs/reftable: don't recompute committer ident
reftable: remove name checks
refs/reftable: skip duplicate name checks
refs/reftable: perform explicit D/F check when writing symrefs
refs/reftable: fix D/F conflict error message on ref copy
The `refs_create_symref()` function is used to update/create a symref.
But it doesn't check the old target of the symref, if existing. It force
updates the symref. In this regard, the name `refs_create_symref()` is a
bit misleading. So let's rename it to `refs_update_symref()`. This is
akin to how 'git-update-ref(1)' also allows us to create apart from
update.
While we're here, rename the arguments in the function to clarify what
they actually signify and reduce confusion.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach the testsuite runner in `test-tool run-command testsuite` how to
run unit tests: if TEST_SHELL_PATH is not set, run the programs directly
from CWD, rather than defaulting to "sh" as an interpreter.
With this change, you can now use test-tool to run the unit tests:
$ make
$ cd t/unit-tests/bin
$ ../../helper/test-tool run-command testsuite
This should be helpful on Windows to allow running tests without
requiring Perl (for `prove`), as discussed in [1] and [2].
This again breaks backwards compatibility, as it is now required to set
TEST_SHELL_PATH properly for executing shell scripts, but again, as
noted in [2], there are no longer any such invocations in our codebase.
[1] https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet/
[2] https://lore.kernel.org/git/850ea42c-f103-68d5-896b-9120e2628686@gmx.de/
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`test-tool run-command testsuite` currently assumes that it will only be
running the shell test suite, and therefore filters out anything that
does not match a hardcoded pattern of "t[0-9][0-9][0-9][0-9]-*.sh".
Later in this series, we'll adapt `test-tool run-command testsuite` to
also support unit tests, which do not follow the same naming conventions
as the shell tests, so this hardcoded pattern is inconvenient.
Since `testsuite` also allows specifying patterns on the command-line,
let's just remove this pattern. As noted in [1], there are no longer any
uses of `testsuite` in our codebase, it should be OK to break backwards
compatibility in this case. We also add a new filter to avoid trying to
execute "." and "..", so that users who wish to execute every test in a
directory can do so without specifying a pattern.
[1] https://lore.kernel.org/git/850ea42c-f103-68d5-896b-9120e2628686@gmx.de/
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running tests through `test-tool run-command testsuite`, we
currently hardcode `sh` as the command interpreter. As discussed in [1],
this is incorrect, and we should be using the shell set in
TEST_SHELL_PATH instead.
Add a shell_path field in struct testsuite so that we can pass this to
the task runner callback. If this is non-null, we'll use it as the
argv[0] of the subprocess. Otherwise, we'll just execute the test
program directly. We will use this feature in a later commit to enable
running binary executable unit tests.
However, for now when setting up the struct testsuite in testsuite(),
use the value of TEST_SHELL_PATH if it's set, otherwise keep the
original behavior by defaulting to `sh`.
[1] https://lore.kernel.org/git/20240123005913.GB835964@coredump.intra.peff.net/
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While t/unit-tests/t-basic.c uses the unit-test framework added in
e137fe3b29 (unit tests: add TAP unit test framework, 2023-11-09), it is
not a true unit test in that it intentionally fails in order to exercise
various codepaths in the unit-test framework. Thus, we intentionally
exclude it when running unit tests through the various t/Makefile
targets. Instead, it is executed by t0080-unit-test-output.sh, which
verifies its output follows the TAP format expected for the various
pass, skip, or fail cases.
As such, it makes more sense for t-basic to be a helper item for
t0080-unit-test-output.sh, so let's move it to
t/helper/test-example-tap.c and adjust Makefiles as necessary.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-2.44: (41 commits)
Git 2.44.1
Git 2.43.4
Git 2.42.2
Git 2.41.1
Git 2.40.2
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
clone: prevent clashing git dirs when cloning submodule in parallel
...
* maint-2.43: (40 commits)
Git 2.43.4
Git 2.42.2
Git 2.41.1
Git 2.40.2
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
clone: prevent clashing git dirs when cloning submodule in parallel
t7423: add tests for symlinked submodule directories
...
* maint-2.42: (39 commits)
Git 2.42.2
Git 2.41.1
Git 2.40.2
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
clone: prevent clashing git dirs when cloning submodule in parallel
t7423: add tests for symlinked submodule directories
has_dir_name(): do not get confused by characters < '/'
...
* maint-2.41: (38 commits)
Git 2.41.1
Git 2.40.2
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
clone: prevent clashing git dirs when cloning submodule in parallel
t7423: add tests for symlinked submodule directories
has_dir_name(): do not get confused by characters < '/'
docs: document security issues around untrusted .git dirs
...
* maint-2.40: (39 commits)
Git 2.40.2
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
clone: prevent clashing git dirs when cloning submodule in parallel
t7423: add tests for symlinked submodule directories
has_dir_name(): do not get confused by characters < '/'
docs: document security issues around untrusted .git dirs
upload-pack: disable lazy-fetching by default
...
* maint-2.39: (38 commits)
Git 2.39.4
fsck: warn about symlink pointing inside a gitdir
core.hooksPath: add some protection while cloning
init.templateDir: consider this config setting protected
clone: prevent hooks from running during a clone
Add a helper function to compare file contents
init: refactor the template directory discovery into its own function
find_hook(): refactor the `STRIP_EXTENSION` logic
clone: when symbolic links collide with directories, keep the latter
entry: report more colliding paths
t5510: verify that D/F confusion cannot lead to an RCE
submodule: require the submodule path to contain directories only
clone_submodule: avoid using `access()` on directories
submodules: submodule paths must not contain symlinks
clone: prevent clashing git dirs when cloning submodule in parallel
t7423: add tests for symlinked submodule directories
has_dir_name(): do not get confused by characters < '/'
docs: document security issues around untrusted .git dirs
upload-pack: disable lazy-fetching by default
fetch/clone: detect dubious ownership of local repositories
...
In the next commit, Git will learn to disallow hooks during `git clone`
operations _except_ when those hooks come from the templates (which are
inherently supposed to be trusted). To that end, we add a function to
compare the contents of two files.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Now that we have dropped `the_index`, `initialize_the_repository()`
doesn't really do a lot anymore except for setting up the pointer for
`the_repository` and then calling `initialize_repository()`. The former
can be replaced by statically initializing the pointer though, which
basically makes this function moot.
Convert callers to instead call `initialize_repository(the_repository)`
and drop `initialize_thee_repository()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert test-helper tools to use `the_repository->index` instead of
`the_index`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The codepaths that reach date_mode_from_type() have been updated to
pass "struct date_mode" by value to make them thread safe.
* rs/date-mode-pass-by-value:
date: make DATE_MODE thread-safe
"git pack-refs" learned the "--auto" option, which is a useful
addition to be triggered from "git gc --auto".
Acked-by: Karthik Nayak <karthik.188@gmail.com>
cf. <CAOLa=ZRAEA7rSUoYL0h-2qfEELdbPHbeGpgBJRqesyhHi9Q6WQ@mail.gmail.com>
* ps/pack-refs-auto:
builtin/gc: pack refs when using `git maintenance run --auto`
builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs
t6500: extract objects with "17" prefix
builtin/gc: move `struct maintenance_run_opts`
builtin/pack-refs: introduce new "--auto" flag
builtin/pack-refs: release allocated memory
refs/reftable: expose auto compaction via new flag
refs: remove `PACK_REFS_ALL` flag
refs: move `struct pack_refs_opts` to where it's used
t/helper: drop pack-refs wrapper
refs/reftable: print errors on compaction failure
reftable/stack: gracefully handle failed auto-compaction due to locks
reftable/stack: use error codes when locking fails during compaction
reftable/error: discern locked/outdated errors
reftable/stack: fix error handling in `reftable_stack_init_addition()`
In the preceding commit we have disabled name checks in the "reftable"
backend. These checks were responsible for verifying multiple things
when writing records to the reftable stack:
- Detecting file/directory conflicts. Starting with the preceding
commits this is now handled by the reftable backend itself via
`refs_verify_refname_available()`.
- Validating refnames. This is handled by `check_refname_format()` in
the generic ref transacton layer.
The code in the reftable library is thus not used anymore and likely to
bitrot over time. Remove it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
date_mode_from_type() modifies a static variable and returns a pointer
to it. This is not thread-safe. Most callers of date_mode_from_type()
use it via the macro DATE_MODE and pass its result on to functions like
show_date(), which take a const pointer and don't modify the struct.
Avoid the static storage by putting the variable on the stack and
returning the whole struct date_mode. Change functions that take a
constant pointer to expect the whole struct instead.
Reduce the cost of passing struct date_mode around on 64-bit systems
by reordering its members to close the hole between the 32-bit wide
.type and the 64-bit aligned .strftime_fmt as well as the alignment
hole at the end. sizeof reports 24 before and 16 with this change
on x64. Keep .type at the top to still allow initialization without
designator -- though that's only done in a single location, in
builtin/blame.c.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Work to support a repository that work with both SHA-1 and SHA-256
hash algorithms has started.
* eb/hash-transition: (30 commits)
t1016-compatObjectFormat: add tests to verify the conversion between objects
t1006: test oid compatibility with cat-file
t1006: rename sha1 to oid
test-lib: compute the compatibility hash so tests may use it
builtin/ls-tree: let the oid determine the output algorithm
object-file: handle compat objects in check_object_signature
tree-walk: init_tree_desc take an oid to get the hash algorithm
builtin/cat-file: let the oid determine the output algorithm
rev-parse: add an --output-object-format parameter
repository: implement extensions.compatObjectFormat
object-file: update object_info_extended to reencode objects
object-file-convert: convert commits that embed signed tags
object-file-convert: convert commit objects when writing
object-file-convert: don't leak when converting tag objects
object-file-convert: convert tag objects when writing
object-file-convert: add a function to convert trees between algorithms
object: factor out parse_mode out of fast-import and tree-walk into in object.h
cache: add a function to read an OID of a specific algorithm
tag: sign both hashes
commit: export add_header_signature to support handling signatures on tags
...
The test helper provides a "ref-store <store> pack-refs" wrapper that
more or less directly invokes `refs_pack_refs()`. This helper is only
used in a single test with the "PACK_REFS_PRUNE" and "PACK_REFS_ALL"
flags. Both of these flags can directly be accessed via git-pack-refs(1)
though via the `--all` and `--prune` flags, which makes the helper
superfluous.
Refactor the test to use git-pack-refs(1) instead of the test helper.
Drop the now-unused test helper command.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases_many()` function is aware of
that, too.
Naturally, there are a lot of callers that need to be adjusted now, too.
Next stop: `repo_get_merge_bases_dirty()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently this function treats unrelated commit histories the same way
as commit histories with missing commit objects.
Typically, missing commit objects constitute a corrupt repository,
though, and should be reported as such. The next commits will make it
so, but there is one exception: In `git fetch --update-shallow` we
_expect_ commit objects to be missing, and we do want to treat the
now-incomplete commit histories as unrelated.
To allow for that, let's introduce an additional parameter that is
passed to `repo_in_merge_bases_many()` to trigger this behavior, and use
it in the two callers in `shallow.c`.
This commit changes behavior slightly: unless called from the
`shallow.c` functions that set the `ignore_missing_commits` bit, any
non-existing tip commit that is passed to `repo_in_merge_bases_many()`
will now result in an error.
Note: When encountering missing commits while traversing the commit
history in search for merge bases, with this commit there won't be a
change in behavior just yet, their children will still be interpreted as
root commits. This bug will get fixed by follow-up commits.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ref and reflog iterators share much of the same underlying code to
iterate over the corresponding entries. This results in some weird code
because the reflog iterator also exposes an object ID as well as a flag
to the callback function. Neither of these fields do refer to the reflog
though -- they refer to the corresponding ref with the same name. This
is quite misleading. In practice at least the object ID cannot really be
implemented in any other way as a reflog does not have a specific object
ID in the first place. This is further stressed by the fact that none of
the callbacks except for our test helper make use of these fields.
Split up the infrastucture so that ref and reflog iterators use separate
callback signatures. This allows us to drop the nonsensical fields from
the reflog iterator.
Note that internally, the backends still use the same shared infra to
iterate over both types. As the backends should never end up being
called directly anyway, this is not much of a problem and thus kept
as-is for simplicity's sake.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sideband demultiplexer fixes.
* jx/sideband-chomp-newline-fix:
pkt-line: do not chomp newlines for sideband messages
pkt-line: memorize sideband fragment in reader
test-pkt-line: add option parser for unpack-sideband
Remove unused header "#include".
* en/header-cleanup:
treewide: remove unnecessary includes in source files
treewide: add direct includes currently only pulled in transitively
trace2/tr2_tls.h: remove unnecessary include
submodule-config.h: remove unnecessary include
pkt-line.h: remove unnecessary include
line-log.h: remove unnecessary include
http.h: remove unnecessary include
fsmonitor--daemon.h: remove unnecessary includes
blame.h: remove unnecessary includes
archive.h: remove unnecessary include
treewide: remove unnecessary includes in source files
treewide: remove unnecessary includes from header files
trace2 streams used to record the URLs that potentially embed
authentication material, which has been corrected.
* jh/trace2-redact-auth:
t0212: test URL redacting in EVENT format
t0211: test URL redacting in PERF format
trace2: redact passwords from https:// URLs by default
trace2: fix signature of trace2_def_param() macro
Stale URLs have been updated to their current counterparts (or
archive.org) and HTTP links are replaced with working HTTPS links.
* js/update-urls-in-doc-and-comment:
doc: refer to internet archive
doc: update links for andre-simon.de
doc: switch links to https
doc: update links to current pages
The priority queue test has been migrated to the unit testing
framework.
* cp/unit-test-prio-queue:
tests: move t0009-prio-queue.sh to the new unit testing framework
Tighten URL checks fsck makes in a URL recorded for submodules.
* vd/fsck-submodule-url-test:
submodule-config.c: strengthen URL fsck check
t7450: test submodule urls
test-submodule: remove command line handling for check-name
submodule-config.h: move check_submodule_url
t/t0009-prio-queue.sh along with t/helper/test-prio-queue.c unit
tests Git's implementation of a priority queue. Migrate the
test over to the new unit testing framework to simplify debugging
and reduce test run-time. Refactor the required logic and add
a new test case in addition to porting over the original ones in
shell.
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests to 't7450-bad-git-dotfiles.sh' to check the validity of different
submodule URLs. To verify this directly (without setting up test
repositories & submodules), add a 'check-url' subcommand to 'test-tool
submodule' that calls 'check_submodule_url' in the same way that
'check-name' calls 'check_submodule_name'.
Add two tests to separately address cases where the URL check correctly
filters out invalid URLs and cases where the check misses invalid URLs. Mark
the latter ("url check misses invalid cases") with 'test_expect_failure' to
indicate that this is currently broken, which will be fixed in the next step.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'check-name' subcommand to 'test-tool submodule' is documented as being
able to take a command line argument '<name>'. However, this does not work -
and has never worked - because 'argc > 0' triggers the usage message in
'cmd__submodule_check_name()'. To simplify the helper and avoid future
confusion around proper use of the subcommand, remove any references to
command line arguments for 'check-name' in usage strings and handling in
'check_name()'.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the recent codebase update (8bf6fbd00d (Merge branch
'js/doc-unit-tests', 2023-12-09)), a new unit testing framework was
merged, providing a standardized approach for testing C code. Prior to
this update, some unit tests relied on the test helper mechanism,
lacking a dedicated unit testing framework. It's more natural to perform
these unit tests using the new unit test framework.
This commit migrates the unit tests for C character classification
functions (isdigit(), isspace(), etc) from the legacy approach
using the test-tool command `test-tool ctype` in t/helper/test-ctype.c
to the new unit testing framework (t/unit-tests/test-lib.h).
The migration involves refactoring the tests to utilize the testing
macros provided by the framework (TEST() and check_*()).
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Achu Luma <ach.lumap@gmail.com>
Acked-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sideband demultiplexer fixes.
* jx/sideband-chomp-newline-fix:
pkt-line: do not chomp newlines for sideband messages
pkt-line: memorize sideband fragment in reader
test-pkt-line: add option parser for unpack-sideband
Streaming spans of packfile data used to be done only from a
single, primary, pack in a repository with multiple packfiles. It
has been extended to allow reuse from other packfiles, too.
* tb/multi-pack-verbatim-reuse: (26 commits)
t/perf: add performance tests for multi-pack reuse
pack-bitmap: enable reuse from all bitmapped packs
pack-objects: allow setting `pack.allowPackReuse` to "single"
t/test-lib-functions.sh: implement `test_trace2_data` helper
pack-objects: add tracing for various packfile metrics
pack-bitmap: prepare to mark objects from multiple packs for reuse
pack-revindex: implement `midx_pair_to_pack_pos()`
pack-revindex: factor out `midx_key_to_pack_pos()` helper
midx: implement `midx_preferred_pack()`
git-compat-util.h: implement checked size_t to uint32_t conversion
pack-objects: include number of packs reused in output
pack-objects: prepare `write_reused_pack_verbatim()` for multi-pack reuse
pack-objects: prepare `write_reused_pack()` for multi-pack reuse
pack-objects: pass `bitmapped_pack`'s to pack-reuse functions
pack-objects: keep track of `pack_start` for each reuse pack
pack-objects: parameterize pack-reuse routines over a single pack
pack-bitmap: return multiple packs via `reuse_partial_packfile_from_bitmap()`
pack-bitmap: simplify `reuse_partial_packfile_from_bitmap()` signature
ewah: implement `bitmap_is_empty()`
pack-bitmap: pass `bitmapped_pack` struct to pack-reuse functions
...
Remove unused header "#include".
* en/header-cleanup:
treewide: remove unnecessary includes in source files
treewide: add direct includes currently only pulled in transitively
trace2/tr2_tls.h: remove unnecessary include
submodule-config.h: remove unnecessary include
pkt-line.h: remove unnecessary include
line-log.h: remove unnecessary include
http.h: remove unnecessary include
fsmonitor--daemon.h: remove unnecessary includes
blame.h: remove unnecessary includes
archive.h: remove unnecessary include
treewide: remove unnecessary includes in source files
treewide: remove unnecessary includes from header files
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though. Have those
source files explicitly include the headers they need.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though. Have those
source files explicitly include the headers they need.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these were checked with
gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).
...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file. These cases were:
* builtin/credential-cache.c
* builtin/pull.c
* builtin/send-pack.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
trace2 streams used to record the URLs that potentially embed
authentication material, which has been corrected.
* jh/trace2-redact-auth:
t0212: test URL redacting in EVENT format
t0211: test URL redacting in PERF format
trace2: redact passwords from https:// URLs by default
trace2: fix signature of trace2_def_param() macro
Stale URLs have been updated to their current counterparts (or
archive.org) and HTTP links are replaced with working HTTPS links.
* js/update-urls-in-doc-and-comment:
doc: refer to internet archive
doc: update links for andre-simon.de
doc: switch links to https
doc: update links to current pages
Introduce "git replay", a tool meant on the server side without
working tree to recreate a history.
* cc/git-replay:
replay: stop assuming replayed branches do not diverge
replay: add --contained to rebase contained branches
replay: add --advance or 'cherry-pick' mode
replay: use standard revision ranges
replay: make it a minimal server side command
replay: remove HEAD related sanity check
replay: remove progress and info output
replay: add an important FIXME comment about gpg signing
replay: change rev walking options
replay: introduce pick_regular_commit()
replay: die() instead of failing assert()
replay: start using parse_options API
replay: introduce new builtin
t6429: remove switching aspects of fast-rebase
We can use the test helper program "test-tool pkt-line" to test pkt-line
related functions. E.g.:
* Use "test-tool pkt-line send-split-sideband" to generate sideband
messages.
* Pipe these generated sideband messages to command "test-tool pkt-line
unpack-sideband" to test packet_reader_read() function.
In order to make a complete test of the packet_reader_read() function,
add option parser for command "test-tool pkt-line unpack-sideband".
* To remove newlines in sideband messages, we can use:
$ test-tool pkt-line unpack-sideband --chomp-newline
* To preserve newlines in sideband messages, we can use:
$ test-tool pkt-line unpack-sideband --no-chomp-newline
* To parse sideband messages using "demultiplex_sideband()" inside the
function "packet_reader_read()", we can use:
$ test-tool pkt-line unpack-sideband --reader-use-sideband
We also add new example sideband packets in send_split_sideband() and
add several new test cases in t0070. Among these test cases, we pipe
output of the "send-split-sideband" subcommand to the "unpack-sideband"
subcommand. We found two issues:
1. The two splitted sideband messages "Hello," and " world!\n" should
be concatenated together. But when we turn on use_sideband field of
reader to parse sideband messages, the first part of the splitted
message ("Hello,") is lost.
2. The newline characters in sideband 2 (progress info) and sideband 3
(error message) should be preserved, but they are both trimmed.
Will fix the above two issues in subsequent commits.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When performing a binary search over the objects in a MIDX's bitmap
(i.e. in pseudo-pack order), the reader reconstructs the pseudo-pack
ordering using a combination of (a) the preferred pack, (b) the pack's
lexical position in the MIDX based on pack names, and (c) the object
offset within the pack.
In order to perform this binary search, the reader must know the
identity of the preferred pack. This could be stored in the MIDX, but
isn't for historical reasons, mostly because it can easily be inferred
at read-time by looking at the object in the first bit position and
finding out which pack it was selected from in the MIDX, like so:
nth_midxed_pack_int_id(m, pack_pos_to_midx(m, 0));
In midx_to_pack_pos() which performs this binary search, we look up the
identity of the preferred pack before each search. This is relatively
quick, since it involves two table-driven lookups (one in the MIDX's
revindex for `pack_pos_to_midx()`, and another in the MIDX's object
table for `nth_midxed_pack_int_id()`).
But since the preferred pack does not change after the MIDX is written,
it is safe to cache this value on the MIDX itself.
Write a helper to do just that, and rewrite all of the existing
call-sites that care about the identity of the preferred pack in terms
of this new helper.
This will prepare us for a subsequent patch where we will need to binary
search through the MIDX's pseudo-pack order multiple times.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a multi-pack bitmap is used to implement verbatim pack reuse (that
is, when verbatim chunks from an on-disk packfile are copied
directly[^1]), it does so by using its "preferred pack" as the source
for pack-reuse.
This allows repositories to pack the majority of their objects into a
single (often large) pack, and then use it as the single source for
verbatim pack reuse. This increases the amount of objects that are
reused verbatim (and consequently, decrease the amount of time it takes
to generate many packs). But this performance comes at a cost, which is
that the preferred packfile must pace its growth with that of the entire
repository in order to maintain the utility of verbatim pack reuse.
As repositories grow beyond what we can reasonably store in a single
packfile, the utility of verbatim pack reuse diminishes. Or, at the very
least, it becomes increasingly more expensive to maintain as the pack
grows larger and larger.
It would be beneficial to be able to perform this same optimization over
multiple packs, provided some modest constraints (most importantly, that
the set of packs eligible for verbatim reuse are disjoint with respect
to the subset of their objects being sent).
If we assume that the packs which we treat as candidates for verbatim
reuse are disjoint with respect to any of their objects we may output,
we need to make only modest modifications to the verbatim pack-reuse
code itself. Most notably, we need to remove the assumption that the
bits in the reachability bitmap corresponding to objects from the single
reuse pack begin at the first bit position.
Future patches will unwind these assumptions and reimplement their
existing functionality as special cases of the more general assumptions
(e.g. that reuse bits can start anywhere within the bitset, but happen
to start at 0 for all existing cases).
This patch does not yet relax any of those assumptions. Instead, it
implements a foundational data-structure, the "Bitampped Packs" (`BTMP`)
chunk of the multi-pack index. The `BTMP` chunk's contents are described
in detail here. Importantly, the `BTMP` chunk contains information to
map regions of a multi-pack index's reachability bitmap to the packs
whose objects they represent.
For now, this chunk is only written, not read (outside of the test-tool
used in this patch to test the new chunk's behavior). Future patches
will begin to make use of this new chunk.
[^1]: Modulo patching any `OFS_DELTA`'s that cross over a region of the
pack that wasn't used verbatim.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update ref-related tests.
* ps/ref-tests-update:
t: mark several tests that assume the files backend with REFFILES
t7900: assert the absence of refs via git-for-each-ref(1)
t7300: assert exact states of repo
t4207: delete replace references via git-update-ref(1)
t1450: convert tests to remove worktrees via git-worktree(1)
t: convert tests to not access reflog via the filesystem
t: convert tests to not access symrefs via the filesystem
t: convert tests to not write references via the filesystem
t: allow skipping expected object ID in `ref-store update-ref`
For now, this is just a rename from `t/helper/test-fast-rebase.c` into
`builtin/replay.c` with minimal changes to make it build appropriately.
Let's add a stub documentation and a stub test script though.
Subsequent commits will flesh out the capabilities of the new command
and make it a more standard regular builtin.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the time t6429 was written, merge-ort was still under development,
did not have quite as many tests, and certainly was not widely deployed.
Since t6429 was exercising some codepaths just a little differently, we
thought having them also test the "merge_switch_to_result()" bits of
merge-ort was useful even though they weren't intrinsic to the real
point of these tests.
However, the value provided by doing extra testing of the
"merge_switch_to_result()" bits has decreased a bit over time, and it's
actively making it harder to refactor `test-tool fast-rebase` into `git
replay`, which we are going to do in following commits. Dispense with
these bits.
Co-authored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These sites offer https versions of their content.
Using the https versions provides some protection for users.
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the added tests cases, skip testing the `GIT_TRACE2_REDACT=0` case
because we would need to exactly model the full JSON event stream like
we did in the preceding basic tests and I do not think it is worth it.
Furthermore, the Trace2 routines print the same content in normal, perf,
or event format, and in t0210 and t0211 we already tested the basic
functionality, so no need to repeat it here.
In this test, we use the test-helper to unit test each of the event
messages where URLs can appear and confirm that they are redacted in
each event.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leakfix.
* ps/leakfixes:
setup: fix leaking repository format
setup: refactor `upgrade_repository_format()` to have common exit
shallow: fix memory leak when registering shallow roots
test-bloom: stop setting up Git directory twice
We're setting up the Git directory twice in the `test-tool bloom`
helper, once at the beginning of `cmd_bloom()` and once in the local
subcommand implementation `get_bloom_filter_for_commit()`. This can lead
to memory leaks as we'll overwrite variables of `the_repository` with
newly allocated data structures. On top of that it's simply unnecessary.
Fix this by only setting up the Git directory once.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>