Overhaul of the reftable API.
* ps/reftable-api-revamp:
reftable/table: move printing logic into test helper
reftable/constants: make block types part of the public interface
reftable/table: introduce iterator for table blocks
reftable/table: add `reftable_table` to the public interface
reftable/block: expose a generic iterator over reftable records
reftable/block: make block iterators reseekable
reftable/block: store block pointer in the block iterator
reftable/block: create public interface for reading blocks
git-zlib: use `struct z_stream_s` instead of typedef
reftable/block: rename `block_reader` to `reftable_block`
reftable/block: rename `block` to `block_data`
reftable/table: move reading block into block reader
reftable/block: simplify how we track restart points
reftable/blocksource: consolidate code into a single file
reftable/reader: rename data structure to "table"
reftable: fix formatting of the license header
Since a call to repo_config() can be called with repo set to NULL
these days, a command that is marked as RUN_SETUP in the builtin
command table does not have to check repo with NULL before making
the call.
* ua/call-repo-config-with-possibly-null-repository:
builtin/difftool: remove unnecessary if statement
builtin/add: remove unnecessary if statement
Various build tweaks, including CSPRNG selection on some platforms.
* rj/build-tweaks:
config.mak.uname: set CSPRNG_METHOD to getrandom on Linux
config.mak.uname: add arc4random to the cygwin build
config.mak.uname: add sysinfo() configuration for cygwin
builtin/gc.c: correct RAM calculation when using sysinfo
config.mak.uname: add clock_gettime() to the cygwin build
config.mak.uname: add HAVE_GETDELIM to the cygwin section
config.mak.uname: only set NO_REGEX on cygwin for v1.7
config.mak.uname: add a note about NO_STRLCPY for Linux
Makefile: remove NEEDS_LIBRT build variable
meson.build: set default help format to html on windows
meson.build: only set build variables for non-default values
Makefile: only set some BASIC_CFLAGS when RUNTIME_PREFIX is set
meson.build: remove -DCURL_DISABLE_TYPECHECK
Update parse-options API to catch mistakes to pass address of an
integral variable of a wrong type/size.
* ps/parse-options-integers:
parse-options: detect mismatches in integer signedness
parse-options: introduce precision handling for `OPTION_UNSIGNED`
parse-options: introduce precision handling for `OPTION_INTEGER`
parse-options: rename `OPT_MAGNITUDE()` to `OPT_UNSIGNED()`
parse-options: support unit factors in `OPT_INTEGER()`
global: use designated initializers for options
parse: fix off-by-one for minimum signed values
Document the convention to disable hooks altogether by setting the
hooksPath configuration variable to /dev/nulll
* ds/doc-disable-hooks:
docs: document core.hooksPath=/dev/null
Code clean-up.
* ps/object-file-cleanup:
object-store: merge "object-store-ll.h" and "object-store.h"
object-store: remove global array of cached objects
object: split out functions relating to object store subsystem
object-file: drop `index_blob_stream()`
object-file: split up concerns of `HASH_*` flags
object-file: split out functions relating to object store subsystem
object-file: move `xmmap()` into "wrapper.c"
object-file: move `git_open_cloexec()` to "compat/open.c"
object-file: move `safe_create_leading_directories()` into "path.c"
object-file: move `mkdir_in_gitdir()` into "path.c"
Make sure outage of third-party sites that supply P4, Git-LFS, and
JGit we use for testing would not prevent our CI jobs from running
at all.
* jc/ci-skip-unavailable-external-software:
ci: skip unavailable external software
Ever since we issued 2.49, external forces broke our CI jobs in
various ways, and we had to adjust our code to work them around.
Backmerge them from the 'master' front to make it easier to test
real changes to the maintenance track.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make sure outage of third-party sites that supply P4, Git-LFS, and
JGit we use for testing would not prevent our CI jobs from running
at all.
* jc/ci-skip-unavailable-external-software:
ci: skip unavailable external software
The ci/install-dependencies.sh script used in a very early phase of
our CI jobs downloads Perforce, Git-LFS, and JGit, used for running
the test scripts. The test framework is prepared to properly skip
the tests that depend on these external software, but the CI script
is unnecessarily strict (due to its use of "set -e" in ci/lib.sh)
and fails the entire CI run before even starting to test the rest of
the system.
Notice a failure to download to any of these external software, but
keep going. We need to be careful about cleaning after a failed
wget, as a later part of the script that does:
if type jgit >/dev/null 2>&1
then
echo "$(tput setaf 6)JGit Version$(tput sgr0)"
jgit version
else
echo >&2 "WARNING: JGit wasn't installed, see above for clues why"
fi
will (surprise!) succeed running "type jgit", and then fail with
"jgit version", taking the whole thing down due to "set -e".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log --{left,right}-only A...B", when A and B does not share
any common ancestor, now behaves as expected.
* mh/left-right-limited:
revision: fix --left/right-only use with unrelated histories
Doc mark-up updates.
* ja/doc-reset-mv-rm-markup-updates:
doc: add markup for characters in Guidelines
doc: fix asciidoctor synopsis processing of triple-dots
doc: convert git-mv to new documentation format
doc: move synopsis git-mv commands in the synopsis section
doc: convert git-rm to new documentation format
doc: fix synopsis analysis logic
doc: convert git-reset to new documentation format
Optimize the code to dedup references recorded in a bundle file.
* kn/bundle-dedup-optim:
bundle: fix non-linear performance scaling with refs
t6020: test for duplicate refnames in bundle creation
"make perf" fixes.
* pb/perf-test-fixes:
p7821: fix instructions for testing with threads
p9210: fix 'scalar clone' when running from a detached HEAD
p7821: fix test_perf invocation for prereqs
When using the launchctl scheduler, the weekly job runs daily, and the
daily job runs on the first six days of each month. This appears to be
due to specifying "Day" in the calendar intervals, which according to
launchd.plist(5) is for specifying days of the month rather than days of
the week. The behaviour of running a job on the 0th day is undocumented,
but in my testing appears to be the same as not specifying "Day" in the
calendar interval, in which case the job will run daily.
Use "Weekday" in the calendar intervals, which is the correct way to
schedule jobs to run on specific days of the week.
Signed-off-by: Josh Heinrichs <joshiheinrichs@gmail.com>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we already teach the `repo_config()` in "f29f1990b5
(config: teach repo_config to allow `repo` to be NULL, 2025-03-08)"
to allow `repo` to be NULL, no need to check if `repo` is NULL
before calling `repo_config()`.
Suggested-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we already teach the `repo_config()` in "f29f1990b5
(config: teach repo_config to allow `repo` to be NULL, 2025-03-08)"
to allow `repo` to be NULL, no need to check if `repo` is NULL
before calling `repo_config()`.
Suggested-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A common way to run Git's performance benchmarks on repositories other
than Git's own repository (which is not exactly large when compared to
actually large repositories) is to run them like this:
GIT_PERF_LARGE_REPO=/path/to/my/large/repo \
./p1234-*.sh -ivx
Contrary to developers' common expectations, this failed to work when
Git was built with a different `GIT_PERF_LARGE_REPO` value specified at
build time: That build-time option would have been written to the
`GIT-BUILD-OPTIONS` file, which in turn would have been sourced by
`test-lib.sh`, which in turn would have been sourced by `perf-lib.sh`,
which in turn would have been sourced by the perf test script,
_overriding_ the environment variable specified in the way illustrated
above.
Since perf tests are not run as part of the build, this most likely
unintended behavior was not caught and certainly not fixed, as the
`GIT_PERF_*` values would have been empty at build-time.
However, in 4638e8806e (Makefile: use common template for
GIT-BUILD-OPTIONS, 2024-12-06), a subtle change of behavior was
introduced: Whereas before, a couple of build-time options (the
`GIT_PERF_*` ones included) were written to `GIT-BUILD-OPTIONS` only
when their values were non-empty. With this commit, they are also
written when they are empty.
The consequence is that above-mentioned way to run the perf tests will
not only fail to pick up the desired `GIT_PERF_*` settings when they
were specified differently while building Git, instead the desired
settings will be only respected when specified _while building_ Git.
Let's work around the original issue, i.e. let `GIT_PERF_*` environment
variables override what is recorded in `GIT-BUILD-OPTIONS`.
Note that this is just the tip of the iceberg, there are a couple of
`GIT_TEST_*` options that may want a similar fix in `test-lib.sh`. Due
to time constraints on my side, this here patch focuses exclusively on
the `GIT_PERF_*` settings.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit started to insist TAG_F1_ONLY to be missing,
which was not in the original. Let's not be overly eager in the
conversion.
Also, the other hunk in the commit introduced a shell syntax error,
causing the test to fail. Fix it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Above the declaration of git_work_tree_cfg, we have:
/* This is set by setup_git_dir_gently() and/or git_default_config() */
char *git_work_tree_cfg;
It can be verified that there is no function called
'setup_git_dir_gently' by running grep on the codebase:
$ grep -R setup_git_dir_gently .
./environment.c:/* This is set by setup_git_dir_gently() and/or git_default_config() */
The comment, introduced in e90fdc39b6 (Clean up work-tree handling), is
the only occurrence of the name 'setup_git_dir_gently'.
It probably meant 'setup_git_directory_gently' as that is a name of a
real function in setup.c. Correct it.
Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 05cd988dce ("wrapper: add a helper to generate numbers from a
CSPRNG", 2022-01-17) added a csprng_bytes() function which used one
of several interfaces to provide a source of cryptographically secure
pseudorandom numbers. The CSPRNG_METHOD make variable was provided to
determine the choice of available 'backends' for the source of random
bytes.
Commit 05cd988dce did not set CSPRNG_METHOD in the Linux section of
the config.mak.uname file, so it defaults to using '/dev/urandom' as
the source of random bytes. The 'backend' values which could be used
on Linux are 'arc4random', 'getrandom' or 'getentropy' ('openssl' is
an option, but seems to be discouraged).
The arc4random routines (arc4random_buf() is the one actually used) were
added to glibc in version 2.36, while both getrandom() and getentropy()
were included in 2.25. So, some of the more up-to-date distributions of
Linux (eg Debian 12, Ubuntu 24.04) would be able to use the 'arc4random'
setting. All currently supported distributions have glibc 2.25 or later
(RHEL 8 has v2.28) and, therefore, have support for the 'getrandom' and
'getentropy' settings.
The arc4random routines on the *BSDs (along with cygwin) implement the
ChaCha20 stream cipher algorithm (see RFC8439) in userspace, rather than
as a system call, and are thus somewhat faster (having avoided a context
switch to the kernel). In contrast, on Linux all three functions are
simple wrappers around the same kernel CSPRNG syscall.
If the meson build system is used on a newer platform, then they will be
configured to use 'arc4random', whereas the make build will currently
default to using '/dev/urandom' on Linux. Since there is no advantage,
in terms of performance, to the 'arc4random' setting, the 'getrandom'
setting should be preferred from an availability perspective. (Also, the
current uses of csprng_bytes() are not in any hot path).
In order to set an appropriate default, set the CSPRNG_METHOD build
variable to 'getrandom' in the Linux section of the 'config.mak.uname'
file.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Incorrect sorting of refs with bytes with high-bit set on platforms
with signed char led to a BUG, which has been corrected.
* ps/refname-avail-check-optim:
refs/packed: fix BUG when seeking refs with UTF-8 characters
Remove remnants of the recursive merge strategy backend, which was
superseded by the ort merge strategy.
* en/merge-recursive-debug:
builtin/{merge,rebase,revert}: remove GIT_TEST_MERGE_ALGORITHM
tests: remove GIT_TEST_MERGE_ALGORITHM and test_expect_merge_algorithm
merge-recursive.[ch]: thoroughly debug these
merge, sequencer: switch recursive merges over to ort
sequencer: switch non-recursive merges over to ort
merge-ort: enable diff-algorithms other than histogram
builtin/merge-recursive: switch to using merge_ort_generic()
checkout: replace merge_trees() with merge_ort_nonrecursive()
"git blame --porcelain" mode now talks about unblamable lines and
lines that are blamed to an ignored commit.
* kn/blame-porcelain-unblamable:
blame: print unblamable and ignored commits in porcelain mode
"git fetch [<remote>]" with only the configured fetch refspec
should be the only thing to update refs/remotes/<remote>/HEAD,
but the code was overly eager to do so in other cases.
* jk/fetch-follow-remote-head-fix:
fetch: make set_head() call easier to read
fetch: don't ask for remote HEAD if followRemoteHEAD is "never"
fetch: only respect followRemoteHEAD with configured refspecs
It was reported that "t5620-backfill.sh" fails on s390x and sparc64 in a
test that exercises the "--min-batch-size" command line option. The
symptom was that the option didn't seem to have an effect: we didn't
fetch objects with a batch size of 20, but instead fetched all objects
at once.
As it turns out, the root cause is that `--min-batch-size` uses
`OPT_INTEGER()` to parse the command line option. While this macro
expects the caller to pass a pointer to an integer, we instead pass a
pointer to a `size_t`. This coincidentally works on most platforms, but
it breaks apart on the mentioned platforms because they are big endian.
This issue isn't specific to git-backfill(1): there are a couple of
other places where we have the same type confusion going on. This
indicates that the issue really is the interface that the parse-options
subsystem provides -- it is simply too easy to get this wrong as there
isn't any kind of compiler warning, and things just work on the most
common systems.
Address the systemic issue by introducing two new build asserts
`BARF_UNLESS_SIGNED()` and `BARF_UNLESS_UNSIGNED()`. As the names
already hint at, those macros will cause a compiler error when passed a
value that is not signed or unsigned, respectively.
Adapt `OPT_INTEGER()`, `OPT_UNSIGNED()` as well as `OPT_MAGNITUDE()` to
use those asserts. This uncovers a small set of sites where we indeed
have the same bug as in git-backfill(1). Adapt all of them to use the
correct option.
Reported-by: Todd Zullinger <tmz@pobox.com>
Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is the equivalent to the preceding commit, but instead of
introducing precision handling for `OPTION_INTEGER` we introduce it for
`OPTION_UNSIGNED`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `OPTION_INTEGER` option type accepts a signed integer. The type of
the underlying integer is a simple `int`, which restricts the range of
values accepted by such options. But there is a catch: because the
caller provides a pointer to the value via the `.value` field, which is
a simple void pointer. This has two consequences:
- There is no check whether the passed value is sufficiently long to
store the entire range of `int`. This can lead to integer wraparound
in the best case and out-of-bounds writes in the worst case.
- Even when a caller knows that they want to store a value larger than
`INT_MAX` they don't have a way to do so.
In practice this doesn't tend to be a huge issue because users typically
don't end up passing huge values to most commands. But the parsing logic
is demonstrably broken, and it is too easy to get the calling convention
wrong.
Improve the situation by introducing a new `precision` field into the
structure. This field gets assigned automatically by `OPT_INTEGER_F()`
and tracks the size of the passed value. Like this it becomes possible
for the caller to pass arbitrarily-sized integers and the underlying
logic knows to handle it correctly by doing range checks. Furthermore,
convert the code to use `strtoimax()` intstead of `strtol()` so that we
can also parse values larger than `LONG_MAX`.
Note that we do not yet assert signedness of the passed variable, which
is another source of bugs. This will be handled in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the preceding commit, `OPT_INTEGER()` has learned to support unit
factors. Consequently, the major differencen between `OPT_INTEGER()` and
`OPT_MAGNITUDE()` isn't the support of unit factors anymore, as both of
them do support them now. Instead, the difference is that one handles
signed and the other handles unsigned integers.
Adapt the name of `OPT_MAGNITUDE()` accordingly by renaming it to
`OPT_UNSIGNED()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two main differences between `OPT_INTEGER()` and
`OPT_MAGNITUDE()`:
- The former parses signed integers whereas the latter parses unsigned
integers.
- The latter parses unit factors like 'k', 'm' or 'g'.
While the first difference makes obvious sense, there isn't really a
good reason why signed integers shouldn't support unit factors, too.
This inconsistency will also become a bit of a problem with subsequent
commits, where we will fix a couple of callsites that pass an unsigned
integer to `OPT_INTEGER()`. There are three options:
- We could adapt those users to instead pass a signed integer, but
this would needlessly extend the range of accepted integer values.
- We could convert them to use `OPT_MAGNITUDE()`, as it only accepts
unsigned integers. But now we have the inconsistency that we also
start to accept unit factors.
- We could introduce `OPT_UNSIGNED()` as equivalent to `OPT_INTEGER()`
so that it knows to only accept unsigned integers without unit
suffix.
Introducing a whole new option type feels a bit excessive. There also
isn't really a good reason why `OPT_INTEGER()` cannot be extended to
also accept unit factors: all valid values passed to such options cannot
have a unit factors right now, so there wouldn't be any ambiguity.
Refactor `OPT_INTEGER()` to use `git_parse_int()`, which knows to
interpret unit factors. This removes the inconsistency between the
signed and unsigned options so that we can easily fix up callsites that
pass the wrong integer type right now.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we expose macros for most of our different option types understood
by the "parse-options" subsystem, not every combination of fields that
has one as that would otherwise quickly lead to an explosion of macros.
Instead, we just initialize structures manually for those variants of
fields that don't have a macro.
Callsites that open-code these structure initialization don't use
designated initializers though and instead just provide values for each
of the fields that they want to initialize. This has three significant
downsides:
- Callsites need to specify all values up to the last field that they
care about. This often includes fields that should simply be left at
their default zero-initialized state, which adds distraction.
- Any reader not deeply familiar with the layout of the structure
has a hard time figuring out what the respective initializers mean.
- Reordering or introducing new fields in the middle of the structure
is impossible without adapting all callsites.
Convert all sites to instead use designated initializers, which we have
started using in our codebase quite a while ago. This allows us to skip
any default-initialized fields, gives the reader context by specifying
the field names and allows us to reorder or introduce new fields where
we want to.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We accept a maximum value in `git_parse_signed()` that restricts the
range of accepted integers. As the intent is to pass `INT*_MAX` values
here, this maximum doesn't only act as the upper bound, but also as the
implicit lower bound of the accepted range.
This lower bound is calculated by negating the maximum. But given that
the maximum value of a signed integer with N bits is `2^(N-1)-1` whereas
the minimum value is `-2^(N-1)` we have an off-by-one error in the lower
bound.
Fix this off-by-one error by using `-max - 1` as lower bound instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The arc4random_buf() function has been available in cygwin since
about 2016 (somewhere in the v2.x branch). Set the CSPRNG_METHOD
build variable to 'arc4random', in the cygwin section, to enable
the use of this cryptographically-secure pseudorandom number
function. Note that the autoconf and new meson builds also enable
this function.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>