Commit Graph

20 Commits (v2.50.0-rc1)

Author SHA1 Message Date
Patrick Steinhardt 6dcc05ffc3 reftable: fix formatting of the license header
The license headers used across the reftable library doesn't follow our
typical coding style for multi-line comments. Fix it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07 14:53:09 -07:00
Junio C Hamano 82522a9e2c Merge branch 'kn/reflog-migration-fix-followup'
Code clean-up.

* kn/reflog-migration-fix-followup:
  reftable: prevent 'update_index' changes after adding records
  refs: use 'uint64_t' for 'ref_update.index'
  refs: mark `ref_transaction_update_reflog()` as static
2025-02-14 17:53:48 -08:00
Karthik Nayak 017bd89239 reftable: prevent 'update_index' changes after adding records
The function `reftable_writer_set_limits()` allows updating the
'min_update_index' and 'max_update_index' of a reftable writer. These
values are written to both the writer's header and footer.

Since the header is written during the first block write, any subsequent
changes to the update index would create a mismatch between the header
and footer values. The footer would contain the newer values while the
header retained the original ones.

To protect against this bug, prevent callers from updating these values
after any record is written. To do this, modify the function to return
an error whenever the limits are modified after any record adds. Check
for record adds within `reftable_writer_set_limits()` by checking the
`last_key` and `next` variable. The former is updated after each record
added, but is reset at certain points. The latter is set after writing
the first block.

Modify all callers of the function to anticipate a return type and
handle it accordingly. Add a unit test to also ensure the function
returns the error as expected.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 09:51:36 -08:00
Patrick Steinhardt b1e4b6f4dc reftable/block: adjust type of the restart length
The restart length is tracked as a positive integer even though it
cannot ever be negative. Furthermore, it is effectively capped via the
MAX_RESTARTS variable.

Adjust the type of the variable to be `uint32_t`. While this type is
excessive given that MAX_RESTARTS fits into an `uint16_t`, other places
already use 32 bit integers for restarts, so this type is being more
consistent.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:30 -08:00
Patrick Steinhardt eb22c1b46b reftable/stack: add mechanism to notify callers on reload
Reftable stacks are reloaded in two cases:

  - When calling `reftable_stack_reload()`, if the stat-cache tells us
    that the stack has been modified.

  - When committing a reftable addition.

While callers can figure out the second case, they do not have a
mechanism to figure out whether `reftable_stack_reload()` led to an
actual reload of the on-disk data. All they can do is thus to assume
that data is always being reloaded in that case.

Improve the situation by introducing a new `on_reload()` callback to the
reftable options. If provided, the function will be invoked every time
the stack has indeed been reloaded. This allows callers to invalidate
data that depends on the current stack data.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 17:18:38 +09:00
Patrick Steinhardt 86b770b0bb reftable/stack: stop using `fsync_component()` directly
We're executing `fsync_component()` directly in the reftable library so
that we can fsync data to disk depending on "core.fsync". But as we're
in the process of converting the reftable library to become standalone
we cannot use that function in the library anymore.

Refactor the code such that users of the library can inject a custom
fsync function via the write options. This allows us to get rid of the
dependency on "write-or-die.h".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:23:10 +09:00
Patrick Steinhardt c2f08236ed reftable/system: stop depending on "hash.h"
We include "hash.h" in "reftable/system.h" such that we can use hash
format IDs as well as the raw size of SHA1 and SHA256. As we are in the
process of converting the reftable library to become standalone we of
course cannot rely on those constants anymore.

Introduce a new `enum reftable_hash` to replace internal uses of the
hash format IDs and new constants that replace internal uses of the hash
size. Adapt the reftable backend to set up the correct hash function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19 12:23:10 +09:00
Junio C Hamano 5575c713c2 Merge branch 'ps/reftable-alloc-failures'
The reftable library is now prepared to expect that the memory
allocation function given to it may fail to allocate and to deal
with such an error.

* ps/reftable-alloc-failures: (26 commits)
  reftable/basics: fix segfault when growing `names` array fails
  reftable/basics: ban standard allocator functions
  reftable: introduce `REFTABLE_FREE_AND_NULL()`
  reftable: fix calls to free(3P)
  reftable: handle trivial allocation failures
  reftable/tree: handle allocation failures
  reftable/pq: handle allocation failures when adding entries
  reftable/block: handle allocation failures
  reftable/blocksource: handle allocation failures
  reftable/iter: handle allocation failures when creating indexed table iter
  reftable/stack: handle allocation failures in auto compaction
  reftable/stack: handle allocation failures in `stack_compact_range()`
  reftable/stack: handle allocation failures in `reftable_new_stack()`
  reftable/stack: handle allocation failures on reload
  reftable/reader: handle allocation failures in `reader_init_iter()`
  reftable/reader: handle allocation failures for unindexed reader
  reftable/merged: handle allocation failures in `merged_table_init_iter()`
  reftable/writer: handle allocation failures in `reftable_new_writer()`
  reftable/writer: handle allocation failures in `writer_index_hash()`
  reftable/record: handle allocation failures when decoding records
  ...
2024-10-10 14:22:25 -07:00
Patrick Steinhardt 74d1c18757 reftable/writer: handle allocation failures in `reftable_new_writer()`
Handle allocation failures in `reftable_new_writer()`. Adapt the
function to return an error code to return such failures. While at it,
rename it to match our code style as we have to touch up every callsite
anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:52 -07:00
Patrick Steinhardt bc39b6a796 refs/reftable: introduce "reftable.lockTimeout"
When multiple concurrent processes try to update references in a
repository they may try to lock the same lockfiles. This can happen even
when the updates are non-conflicting and can both be applied, so it
doesn't always make sense to abort the transaction immediately. Both the
"loose" and "packed" backends thus have a grace period that they wait
for the lock to be released that can be controlled via the config values
"core.filesRefLockTimeout" and "core.packedRefsTimeout", respectively.

The reftable backend doesn't have such a setting yet and instead fails
immediately when it sees such a lock. But the exact same concepts apply
here as they do apply to the other backends.

Introduce a new "reftable.lockTimeout" config that controls how long we
may wait for a "tables.list" lock to be released. The default value of
this config is 100ms, which is the same default as we have it for the
"loose" backend.

Note that even though we also lock individual tables, this config really
only applies to the "tables.list" file. This is because individual
tables are only ever locked when we already hold the "tables.list" lock
during compaction. When we observe such a lock we in fact do not want to
compact the table at all because it is already in the process of being
compacted by a concurrent process. So applying the same timeout here
would not make any sense and only delay progress.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24 09:45:25 -07:00
Patrick Steinhardt f663d34306 reftable: make the compaction factor configurable
When auto-compacting, the reftable library packs references such that
the sizes of the tables form a geometric sequence. The factor for this
geometric sequence is hardcoded to 2 right now. We're about to expose
this as a config option though, so let's expose the factor via write
options.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:39 -07:00
Patrick Steinhardt 8e9e136d61 reftable: use `uint16_t` to track restart interval
The restart interval can at most be `UINT16_MAX` as specified in the
technical documentation of the reftable format. Furthermore, it cannot
ever be negative. Regardless of that we use an `int` to track the
restart interval.

Change the type to use an `uint16_t` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:38 -07:00
Patrick Steinhardt 799237852b reftable: pass opts as constant pointer
We sometimes pass the refatble write options as value and sometimes as a
pointer. This is quite confusing and makes the reader wonder whether the
options get modified sometimes.

In fact, `reftable_new_writer()` does cause the caller-provided options
to get updated when some values aren't set up. This is quite unexpected,
but didn't cause any harm until now.

Adapt the code so that we do not modify the caller-provided values
anymore. While at it, refactor the code to code to consistently pass the
options as a constant pointer to clarify that the caller-provided opts
will not ever get modified.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13 17:02:38 -07:00
Junio C Hamano 5aec7231c8 Merge branch 'ps/reftable-write-optim'
Code to write out reftable has seen some optimization and
simplification.

* ps/reftable-write-optim:
  reftable/block: reuse compressed array
  reftable/block: reuse zstream when writing log blocks
  reftable/writer: reset `last_key` instead of releasing it
  reftable/writer: unify releasing memory
  reftable/writer: refactorings for `writer_flush_nonempty_block()`
  reftable/writer: refactorings for `writer_add_record()`
  refs/reftable: don't recompute committer ident
  reftable: remove name checks
  refs/reftable: skip duplicate name checks
  refs/reftable: perform explicit D/F check when writing symrefs
  refs/reftable: fix D/F conflict error message on ref copy
2024-05-08 10:18:43 -07:00
Patrick Steinhardt 485c63cf5c reftable: remove name checks
In the preceding commit we have disabled name checks in the "reftable"
backend. These checks were responsible for verifying multiple things
when writing records to the reftable stack:

  - Detecting file/directory conflicts. Starting with the preceding
    commits this is now handled by the reftable backend itself via
    `refs_verify_refname_available()`.

  - Validating refnames. This is handled by `check_refname_format()` in
    the generic ref transacton layer.

The code in the reftable library is thus not used anymore and likely to
bitrot over time. Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:41 -07:00
Justin Tobler bc91330cec reftable/stack: expose option to disable auto-compaction
The reftable stack already has a variable to configure whether or not to
run auto-compaction, but it is inaccessible to users of the library.
There exist use cases where a caller may want to have more control over
auto-compaction.

Move the `disable_auto_compact` option into `reftable_write_options` to
allow external callers to disable auto-compaction. This will be used in
a subsequent commit.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 12:11:10 -07:00
John Cai 1df18a1c9a reftable: honor core.fsync
While the reffiles backend honors configured fsync settings, the
reftable backend does not. Address this by fsyncing reftable files using
the write-or-die api's fsync_component() in two places: when we
add additional entries into the table, and when we close the reftable
writer.

This commits adds a flush function pointer as a new member of
reftable_writer because we are not sure that the first argument to the
*write function pointer always contains a file descriptor. In the case of
strbuf_add_void, the first argument is a buffer. This way, we can pass
in a corresponding flush function that knows how to flush depending on
which writer is being used.

This patch does not contain tests as they will need to wait for another
patch to start to exercise the reftable backend. At that point, the
tests will be added to observe that fsyncs are happening when the
reftable is in use.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-23 13:45:27 -08:00
Han-Wen Nienhuys 73a4c188b7 reftable: rename writer_stats to reftable_writer_stats
This function is part of the reftable API, so it should use the
reftable_ prefix

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-23 13:36:26 -08:00
Han-Wen Nienhuys cd1799dea0 reftable: support preset file mode for writing
Create files with mode 0666, so umask works as intended. Provides an override,
which is useful to support shared repos (test t1301-shared-repo.sh).

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-23 12:28:36 -08:00
Han-Wen Nienhuys f14bd71934 reftable: write reftable files
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 10:45:48 -07:00