Check the final reallocation for adding the terminating NULL and handle
it just like those in the loop. Simply use REFTABLE_ALLOC_GROW instead
of keeping the REFTABLE_REALLOC_ARRAY call and adding code to preserve
the original pointer value around it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When realloc(3) fails, it returns NULL and keeps the original allocation
intact. REFTABLE_ALLOC_GROW overwrites both the original pointer and
the allocation count variable in that case, simultaneously leaking the
original allocation and misrepresenting the number of storable items.
parse_names() avoids the leak by keeping the original pointer if
reallocation fails, but still increase the allocation count in such a
case as if it succeeded. That's OK, because the error handling code
just frees everything and doesn't look at names_cap anymore.
reftable_buf_add() does the same, but here it is a problem as it leaves
the reftable_buf in a broken state, with ->alloc being roughly twice as
big as the actually allocated memory, allowing out-of-bounds writes in
subsequent calls.
Reimplement REFTABLE_ALLOC_GROW to avoid leaks, keep allocation counts
in sync and still signal failures to callers while avoiding code
duplication in callers. Make it an expression that evaluates to 0 if no
reallocation is needed or it succeeded and 1 on failure while keeping
the original pointer and allocation counter values.
Adjust REFTABLE_ALLOC_GROW_OR_NULL to the new calling convention for
REFTABLE_ALLOC_GROW, but keep its support for non-size_t alloc variables
for now.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent reftable updates mistook a NULL return from a request for
0-byte allocation as OOM and died unnecessarily, which has been
corrected.
* ps/reftable-alloc-failures-zalloc-fix:
reftable/basics: return NULL on zero-sized allocations
reftable/stack: fix zero-sized allocation when there are no readers
reftable/merged: fix zero-sized allocation when there are no readers
reftable/stack: don't perform auto-compaction with less than two tables
In the preceding commits we have fixed a couple of issues when
allocating zero-sized objects. These issues were masked by
implementation-defined behaviour. Quoting malloc(3p):
If size is 0, either:
* A null pointer shall be returned and errno may be set to an
implementation-defined value, or
* A pointer to the allocated space shall be returned. The
application shall ensure that the pointer is not used to access an
object.
So it is perfectly valid that implementations of this function may or
may not return a NULL pointer in such a case.
Adapt both `reftable_malloc()` and `reftable_realloc()` so that they
return NULL pointers on zero-sized allocations. This should remove any
implementation-defined behaviour in our allocators and thus allows us to
detect such platform-specific issues more easily going forward.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We include "hash.h" in "reftable/system.h" such that we can use hash
format IDs as well as the raw size of SHA1 and SHA256. As we are in the
process of converting the reftable library to become standalone we of
course cannot rely on those constants anymore.
Introduce a new `enum reftable_hash` to replace internal uses of the
hash format IDs and new constants that replace internal uses of the hash
size. Adapt the reftable backend to set up the correct hash function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the reftable library to use the `reftable_buf` interface instead
of the `strbuf` interface. This is mostly a mechanical change via sed(1)
with some manual fixes where functions for `strbuf` and `reftable_buf`
differ. The converted code does not yet handle allocation failures. This
will be handled in subsequent commits.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Implement a new `reftable_buf` interface that will replace Git's own
`strbuf` interface. This is done due to three reasons:
- The `strbuf` interfaces do not handle memory allocation failures and
instead causes us to die. This is okay in the context of Git, but is
not in the context of the reftable library, which is supposed to be
usable by third-party applications.
- The `strbuf` interface is quite deeply tied into Git, which makes it
hard to use the reftable library as a standalone library. Any
dependent would have to carefully extract the relevant parts of it
to make things work, which is not all that sensible.
- The `strbuf` interface does not use the pluggable allocators that
can be set up via `reftable_set_alloc()`.
So we have good reasons to use our own type, and the implementation is
rather trivial. Implement our own type. Conversion of the reftable
library will be handled in subsequent commits.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When growing the `names` array fails we would end up with a `NULL`
pointer. This causes two problems:
- We would run into a segfault because we try to free names that we
have assigned to the array already.
- We lose track of the old array and cannot free its contents.
Fix this issue by using a temporary variable. Like this we do not
clobber the old array that we tried to reallocate, which will remain
valid when a call to realloc(3P) fails.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable library uses pluggable allocators, which means that we
shouldn't ever use the standard allocator functions. But it is an easy
mistake to make to accidentally use e.g. free(3P) instead of the
reftable-specific `reftable_free()` function, and we do not have any
mechanism to detect this misuse right now.
Introduce a couple of macros that ban the standard allocators, similar
to how we do it in "banned.h".
Note that we do not ban the following two classes of functions:
- Macros like `FREE_AND_NULL()` or `REALLOC_ARRAY()`. As those expand
to code that contains already-banned functions we'd get a compiler
error even without banning those macros explicitly.
- Git-specific allocators like `xmalloc()` and friends. The primary
reason is that there are simply too many of them, so we're rather
aiming for best effort here. Furthermore, the eventual goal is to
make them unavailable in the reftable library place by not pulling
them in via "git-compat-utils.h" anymore.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Handle allocation failures in `parse_names()` by returning `NULL` in
case any allocation fails. While at it, refactor the function to return
the array directly instead of assigning it to an out-pointer.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Handle allocation failures in `reftable_calloc()`.
While at it, remove our use of `st_mult()` that would cause us to die on
an overflow. From the caller's point of view there is not much of a
difference between arguments that are too large to be multiplied and a
request that is too big to handle by the allocator: in both cases the
allocation cannot be fulfilled. And in neither of these cases do we want
the reftable library to die.
While we could use `unsigned_mult_overflows()` to handle the overflow
gracefully, we instead open-code it to further our goal of converting
the reftable codebase to become a standalone library that can be reused
by external projects.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable library provides the ability to swap out allocators. There
is a gap here though, because we continue to use `xstrdup()` even in the
case where all the other allocators have been swapped out.
Introduce `reftable_strdup()` that uses `reftable_malloc()` to do the
allocation.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The split between "basics" and "publicbasics" is somewhat arbitrary and
not in line with how we typically structure code in the reftable
library. While we do indeed split up headers into a public and internal
part, we don't do that for the compilation unit itself. Furthermore, the
declarations for "publicbasics.c" are in "reftable-malloc.h", which
isn't in line with our naming schema, either.
Fix these inconsistencies by:
- Merging "publicbasics.c" into "basics.c".
- Renaming "reftable-malloc.h" to "reftable-basics.h" as the public
header.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're about to enable `-Wwrite-strings`, which changes the type of
string constants to `const char[]`. Fix various sites where we assign
such constants to non-const variables.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When doing the binary search over restart points in a block we need to
decode the record keys. This decoding step can result in an error when
the block is corrupted, which we indicate to the caller of the binary
search by setting `args.error = 1`. But the only caller that exists
mishandles this because it in fact performs the error check before
calling `binsearch()`.
Fix this bug by checking for errors at the right point in time.
Furthermore, refactor `binsearch()` so that it aborts the search in case
the callback function returns a negative value so that we don't
needlessly continue to search the block.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `binsearch()` function can be used to find the first element for
which a callback functions returns a truish value. But while the array
size is of type `size_t`, the function in fact returns an `int` that is
supposed to index into that array.
Fix the function signature to return a `size_t`. This conversion does
not change any semantics given that the function would only ever return
a value in the range `[0, sz]` anyway.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the stack length is already stored as `size_t`, we frequently use
`int`s to refer to those stacks throughout the reftable library. Convert
those cases to use `size_t` instead to make things consistent.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Throughout the reftable library we have many cases where we need to grow
arrays. In order to avoid too many reallocations, we roughly double the
capacity of the array on each iteration. The resulting code pattern is
duplicated across many sites.
We have similar patterns in our main codebase, which is why we have
eventually introduced an `ALLOC_GROW()` macro to abstract it away and
avoid some code duplication. We cannot easily reuse this macro here
though because `ALLOC_GROW()` uses `REALLOC_ARRAY()`, which in turn will
call realloc(3P) to grow the array. The reftable code is structured as a
library though (even if the boundaries are fuzzy), and one property this
brings with it is that it is possible to plug in your own allocators. So
instead of using realloc(3P), we need to use `reftable_realloc()` that
knows to use the user-provided implementation.
So let's introduce two new macros `REFTABLE_REALLOC_ARRAY()` and
`REFTABLE_ALLOC_GROW()` that mirror what we do in our main codebase,
with two modifications:
- They use `reftable_realloc()`, as explained above.
- They use a different growth factor of `2 * cap + 1` instead of `(cap
+ 16) * 3 / 2`.
The second change is because we know a bit more about the allocation
patterns in the reftable library. In most cases, we end up only having a
handful of items in the array and don't end up growing them. The initial
capacity that our normal growth factor uses (which is 24) would thus end
up over-allocating in a lot of code paths. This effect is measurable:
- Before change:
HEAP SUMMARY:
in use at exit: 671,983 bytes in 152 blocks
total heap usage: 3,843,446 allocs, 3,843,294 frees, 223,761,402 bytes allocated
- After change with a growth factor of `(2 * alloc + 1)`:
HEAP SUMMARY:
in use at exit: 671,983 bytes in 152 blocks
total heap usage: 3,843,446 allocs, 3,843,294 frees, 223,761,410 bytes allocated
- After change with a growth factor of `(alloc + 16)* 2 / 3`:
HEAP SUMMARY:
in use at exit: 671,983 bytes in 152 blocks
total heap usage: 3,833,673 allocs, 3,833,521 frees, 4,728,251,742 bytes allocated
While the total heap usage is roughly the same, we do end up allocating
significantly more bytes with our usual growth factor (in fact, roughly
21 times as many).
Convert the reftable library to use these new macros.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit provides basic utility classes for the reftable library.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>