We use the tempfile subsystem to write temporary tables, but given that
we're in the process of converting the reftable library to become
standalone we cannot use this subsystem directly anymore. While we could
in theory convert the code to use mkstemp(3p) instead, we'd lose access
to our infrastructure that automatically prunes tempfiles via atexit(3p)
or signal handlers.
Provide a thin wrapper for the tempfile subsystem instead. Like this,
the compatibility shim is fully self-contained in "reftable/system.c".
Downstream users of the reftable library would have to implement their
own tempfile shims by replacing "system.c" with a custom version.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're executing `fsync_component()` directly in the reftable library so
that we can fsync data to disk depending on "core.fsync". But as we're
in the process of converting the reftable library to become standalone
we cannot use that function in the library anymore.
Refactor the code such that users of the library can inject a custom
fsync function via the write options. This allows us to get rid of the
dependency on "write-or-die.h".
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We include "hash.h" in "reftable/system.h" such that we can use hash
format IDs as well as the raw size of SHA1 and SHA256. As we are in the
process of converting the reftable library to become standalone we of
course cannot rely on those constants anymore.
Introduce a new `enum reftable_hash` to replace internal uses of the
hash format IDs and new constants that replace internal uses of the hash
size. Adapt the reftable backend to set up the correct hash function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hash format IDs are used for two different things across the
reftable codebase:
- They are used as a 32 bit unsigned integer when reading and writing
the header in order to identify the hash function.
- They are used internally to identify which hash function is in use.
When one only considers the second usecase one might think that one can
easily change the representation of those hash IDs. But because those
IDs end up in the reftable header and footer on disk it is important
that those never change.
Create separate constants `REFTABLE_FORMAT_ID_*` and use them in
contexts where we read or write reftable headers. This serves multiple
purposes:
- It allows us to more easily discern cases where we actually use
those constants for the on-disk format.
- It detangles us from the same constants that are defined in
libgit.a, which is another required step to convert the reftable
library to become standalone.
- It makes the next step easier where we stop using `GIT_*_FORMAT_ID`
constants in favor of a custom enum.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We still include "dir.h" in "reftable/system.h" even though it is not
used by anything but by a single unit test. Move it over into that unit
test so that we don't accidentally use any functionality provided by it
in the reftable codebase.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An extra worktree attached to a repository points at each other to
allow finding the repository from the worktree and vice versa
possible. Turn this linkage to relative paths.
* cw/worktree-relative:
worktree: add test for path handling in linked worktrees
worktree: link worktrees with relative paths
worktree: refactor infer_backlink() to use *strbuf
worktree: repair copied repository and linked worktrees
Fail gracefully instead of crashing when attempting to write the
contents of a corrupt in-core index as a tree object.
* ps/cache-tree-w-broken-index-entry:
unpack-trees: detect mismatching number of cache-tree/index entries
cache-tree: detect mismatching number of index entries
cache-tree: refactor verification to return error codes
"git maintenance start" crashed due to an uninitialized variable
reference, which has been corrected.
* ps/maintenance-start-crash-fix:
builtin/gc: fix crash when running `git maintenance start`
"git rebase --rebase-merges" now uses branch names as labels when
able.
* ng/rebase-merges-branch-name-as-label:
rebase-merges: try and use branch names as labels
rebase-update-refs: extract load_branch_decorations
load_branch_decorations: fix memory leak with non-static filters
Convert the reftable library such that we handle failures with the
new `reftable_buf` interfaces.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The `stack_filename()` function cannot pass any errors to the caller as it
has a `void` return type. Adapt it and its callers such that we can
handle errors and start handling allocation failures.
There are two interesting edge cases in `reftable_stack_destroy()` and
`reftable_addition_close()`. Both of these are trying to tear down their
respective structures, and while doing so they try to unlink some of the
tables they have been keeping alive. Any earlier attempts to do that may
fail on Windows because it keeps us from deleting such tables while they
are still open, and thus we re-try on close. It's okay and even expected
that this can fail when the tables are still open by another process, so
we handle the allocation failures gracefully and just skip over any file
whose name we couldn't figure out.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The `reftable_record_key()` function cannot pass any errors to the
caller as it has a `void` return type. Adapt it and its callers such
that we can handle errors and start handling allocation failures.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The `format_name()` function cannot pass any errors to the caller as it
has a `void` return type. Adapt it and its callers such that we can
handle errors and start handling allocation failures.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Adapt the name of the `strbuf` block source to no longer relate to this
interface, but instead to the `reftable_buf` interface.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Convert the reftable library to use the `reftable_buf` interface instead
of the `strbuf` interface. This is mostly a mechanical change via sed(1)
with some manual fixes where functions for `strbuf` and `reftable_buf`
differ. The converted code does not yet handle allocation failures. This
will be handled in subsequent commits.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Implement a new `reftable_buf` interface that will replace Git's own
`strbuf` interface. This is done due to three reasons:
- The `strbuf` interfaces do not handle memory allocation failures and
instead causes us to die. This is okay in the context of Git, but is
not in the context of the reftable library, which is supposed to be
usable by third-party applications.
- The `strbuf` interface is quite deeply tied into Git, which makes it
hard to use the reftable library as a standalone library. Any
dependent would have to carefully extract the relevant parts of it
to make things work, which is not all that sensible.
- The `strbuf` interface does not use the pluggable allocators that
can be set up via `reftable_set_alloc()`.
So we have good reasons to use our own type, and the implementation is
rather trivial. Implement our own type. Conversion of the reftable
library will be handled in subsequent commits.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
We're about to introduce our own `reftable_buf` type to replace
`strbuf`. One function we'll have to convert is `strbuf_addf()`, which
is used in a handful of places. This function uses `snprintf()`
internally, which makes porting it a bit more involved:
- It is not available on all platforms.
- Some platforms like Windows have broken implementations.
So by using `snprintf()` we'd also push the burden on downstream users
of the reftable library to make available a properly working version of
it.
Most callsites of `strbuf_addf()` are trivial to convert to not using
it. We do end up using `snprintf()` in our unit tests, but that isn't
much of a problem for downstream users of the reftable library.
While at it, remove a useless call to `strbuf_reset()` in
`t_reftable_stack_auto_compaction_with_locked_tables()`. We don't write
to the buffer before this and initialize it with `STRBUF_INIT`, so there
is no need to reset anything.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
We're about to introduce our own `reftable_buf` type to replace
`strbuf`. Get rid of the seldomly-used `strbuf_addbuf()` function such
that we have to reimplement one less function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
On macOS, fsmonitor can fall into a race condition that results in
a client waiting forever to be notified for an event that have
already happened. This problem has been corrected.
* jk/fsmonitor-event-listener-race-fix:
fsmonitor: initialize fs event listener before accepting clients
simple-ipc: split async server initialization and running
A new configuration variable remote.<name>.serverOption makes the
transport layer act as if the --serverOption=<value> option is
given from the command line.
* xx/remote-server-option-config:
ls-remote: leakfix for not clearing server_options
fetch: respect --server-option when fetching multiple remotes
transport.c:🤝 make use of server options from remote
remote: introduce remote.<name>.serverOption configuration
transport: introduce parse_transport_option() method
Code clean-up.
* jk/output-prefix-cleanup:
diff: store graph prefix buf in git_graph struct
diff: return line_prefix directly when possible
diff: return const char from output_prefix callback
diff: drop line_prefix_length field
line-log: use diff_line_prefix() instead of custom helper
Use after free and double freeing at the end in "git log -L... -p"
had been identified and fixed.
* ds/line-log-asan-fix:
line-log: protect inner strbuf from free
Doc update to clarify how periodical maintenance are scheduled,
spread across time to avoid thundering hurds.
* sk/doc-maintenance-schedule:
doc: add a note about staggering of maintenance
The reftable library is now prepared to expect that the memory
allocation function given to it may fail to allocate and to deal
with such an error.
* ps/reftable-alloc-failures: (26 commits)
reftable/basics: fix segfault when growing `names` array fails
reftable/basics: ban standard allocator functions
reftable: introduce `REFTABLE_FREE_AND_NULL()`
reftable: fix calls to free(3P)
reftable: handle trivial allocation failures
reftable/tree: handle allocation failures
reftable/pq: handle allocation failures when adding entries
reftable/block: handle allocation failures
reftable/blocksource: handle allocation failures
reftable/iter: handle allocation failures when creating indexed table iter
reftable/stack: handle allocation failures in auto compaction
reftable/stack: handle allocation failures in `stack_compact_range()`
reftable/stack: handle allocation failures in `reftable_new_stack()`
reftable/stack: handle allocation failures on reload
reftable/reader: handle allocation failures in `reader_init_iter()`
reftable/reader: handle allocation failures for unindexed reader
reftable/merged: handle allocation failures in `merged_table_init_iter()`
reftable/writer: handle allocation failures in `reftable_new_writer()`
reftable/writer: handle allocation failures in `writer_index_hash()`
reftable/record: handle allocation failures when decoding records
...
The way AsciiDoc is used for SYNOPSIS part of the manual pages has
been revamped. The sources, at least for the simple cases, got
vastly pleasant to work with.
* ja/doc-synopsis-markup:
doc: apply synopsis simplification on git-clone and git-init
doc: update the guidelines to reflect the current formatting rules
doc: introduce a synopsis typesetting
We can only check out commits or branches, not refs in general. And the
problem here is if another worktree is using the branch that we want to
check out.
Let’s be more direct and just talk about branches instead of refs.
Also replace “be held” with “in use”. Further, “in use” is not
restricted to a branch being checked out (e.g. the branch could be busy
on a rebase), hence generalize to “or otherwise in use” in the option
description.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `unbundle_from_file()` has two memory leaks:
- We do not release the `struct bundle_header header` when hitting
errors because we return early without any cleanup.
- We do not release the `struct strbuf bundle_ref` at all.
Plug these leaks by creating a common exit path where both of these
variables are released.
While at it, refactor the code such that the variable assignments do not
happen inside the conditional statement itself according to our coding
style.
Signed-off-by: Toon Claes <toon@iotcl.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was reported on the mailing list that running `git maintenance start`
immediately segfaults starting with b6c3f8e12c (builtin/maintenance: fix
leak in `get_schedule_cmd()`, 2024-09-26). And indeed, this segfault is
trivial to reproduce up to a point where one is scratching their head
why we didn't catch this regression in our test suite.
The root cause of this error is `get_schedule_cmd()`, which does not
populate the `out` parameter in all cases anymore starting with the
mentioned commit. Callers do assume it to always be populated though and
will e.g. call `strvec_split()` on the returned value, which will of
course segfault when the variable is uninitialized.
So why didn't we catch this trivial regression? The reason is that our
tests always set up the "GIT_TEST_MAINT_SCHEDULER" environment variable
via "t/test-lib.sh", which allows us to override the scheduler command
with a custom one so that we don't accidentally modify the developer's
system. But the faulty code where we don't set the `out` parameter will
only get hit in case that environment variable is _not_ set, which is
never the case when executing our tests.
Fix the regression by again unconditionally allocating the value in the
`out` parameter, if provided. Add a test that unsets the environment
variable to catch future regressions in this area.
Reported-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We explicitly avoid saying "ref <src>" when introducing the source
side of a refspec, because it can be a fully-spelled hexadecimal
object name, and it also can be a pattern that is not quite a "ref".
But we are loose when we introduce <dst> and say "ref <dst>", even
though it can also be a pattern. Let's omit "ref" also from the
destination side.
Clarify that <src> can be a ref, a (limited glob) pattern, or an
object name.
Even though the very original design of refspec expected that '*'
was used only at the end (e.g., "refs/heads/*" was expected, but not
"refs/heads/*-wip"), the code and its use evolved to handle a single
'*' anywhere in the pattern. Update the text to remove the mention
of "the same prefix". Anything that matches the pattern are named
by such a (limited glob) pattern in <src>.
Also put a bit more stress on the fact that we accept only one '*'
in the pattern by saying "one and only one `*`".
Helped-by: Monika Kairaitytė <monika@kibit.lt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test script uses "test - [def]", but when a test fails because
the file passed to it does not exist,
it fails silently without an error message.
Use test_path_* helper functions, which are designed to give better
error messages when their expectations are not met.
I have added a mechanical validation that applies the same transformation
done in this patch, when the test script is passed to a sed script as shown
below.
sed -e 's/^\( *\)test -f /\1test_path_is_file /' \
-e 's/^\( *\)test -d /\1test_path_is_dir /' \
-e 's/^\( *\)test -e /\1test_path_exists /' \
-e 's/^\( *\)! test -[edf] /\1test_path_is_missing /' \
-e 's/^\( *\)test ! -[edf] /\1test_path_is_missing /' \
"$1" >foo.sh
Reviewers can use the sed script to tranform the original test script and
compare the result in foo.sh with the results of applying the patch.
You will see an instance of "!(test -e 3)" which was manually replaced with
""test_path_is_missing 3", and everything else should match.
Careful and deliberate observation was done to check instances where
"test ! - [df] foo" was used in the test script to make sure that the test
instances were expecting foo to EITHER be a file or a directory, and NOT a
possibility of being both as this would make replacing "test ! -f foo" with
"test_path_is_missing foo" unreasonable.
In the tests control flow, foo has been created as EITHER a
reguar file OR a directory and should NOT exist
after "git clean" or "git clean -d", as the case maybe, has been called.
This made it reasonable to replace
"test ! -[df] foo" with "test_path_is_missing foo".
Signed-off-by: Abraham Samuel Adekunle <abrahamadekunle50@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `loose.c`, we rely on the global variable `the_hash_algo`. As such we
have guarded the file with the 'USE_THE_REPOSITORY_VARIABLE' definition.
Let's derive the hash algorithm from the available repository variable
and remove this guard. This brings us one step closer to removing the
global 'the_repository' variable.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When interactively rebasing merge commits, the commit message is parsed to
extract a probably meaningful label name. For instance if the merge commit
is “Merge branch 'feature0'”, then the rebase script will have thes lines:
```
label feature0
merge -C $sha feature0 # “Merge branch 'feature0'
```
This heuristic fails in the case of octopus merges or when the merge commit
message is actually unrelated to the parent commits.
An example that combines both is:
```
*---. 967bfa4 (HEAD -> integration) Integration
|\ \ \
| | | * 2135be1 (feature2, feat2) Feature 2
| |_|/
|/| |
| | * c88b01a Feature 1
| |/
|/|
| * 75f3139 (feat0) Feature 0
|/
* 25c86d0 (main) Initial commit
```
yields the labels Integration, Integration-2 and Integration-3.
Fix this by using a branch name for each merge commit's parent that is the
tip of at least one branch, and falling back to a label derived from the
merge commit message otherwise.
In the example above, the labels become feat0, Integration and feature2.
Signed-off-by: Nicolas Guichard <nicolas@guichard.eu>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>