Replace "test -f" and friends to use the test_path_is_file helper
function and friends from test-lib-functions.sh. These functions
perform identical operations while enhancing debugging capabilities
in case of test failures.
The original used 'test ! -f' to check if the file has been
correctly cleaned, so 'test ! -e' would have been a better choice.
Replace them with 'test_path_is_missing'.
Signed-off-by: Vincenzo Mezzela <vincenzo.mezzela@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach fsmonitor_refresh_callback() to handle case-insensitive
lookups if case-sensitive lookups fail on case-insensitive systems.
This can cause 'git status' to report stale status for files if there
are case issues/errors in the worktree.
The FSMonitor daemon sends FSEvents using the observed spelling
of each pathname. On case-insensitive file systems this may be
different than the expected case spelling.
The existing code uses index_name_pos() to find the cache-entry for
the pathname in the FSEvent and clear the CE_FSMONITOR_VALID bit so
that the worktree scan/index refresh will revisit and revalidate the
path.
On a case-insensitive file system, the exact match lookup may fail
to find the associated cache-entry. This causes status to think that
the cached CE flags are correct and skip over the file.
Update event handling to optionally use the name-hash and dir-name-hash
if necessary.
Also update t7527 to convert the "test_expect_failure" to "_success"
now that we have fixed the bug.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor code in the fsmonitor_refresh_callback() call chain dealing
with invalidating the CE_FSMONITOR_VALID bit and add a trace message.
During the refresh, we clear the CE_FSMONITOR_VALID bit in response to
data from the FSMonitor daemon (so that a later phase will lstat() and
verify the true state of the file).
Create a new function to clear the bit and add some unique tracing for
it to help debug edge cases.
This is similar to the existing `mark_fsmonitor_invalid()` function,
but it also does untracked-cache invalidation and we've already
handled that in the refresh-callback handlers, so but we don't need
to repeat that.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Consolidate the directory/non-directory calls to the refresh handler
code. Log the resulting count of invalidated cache-entries.
The nr_in_cone value will be used in a later commit to decide if
we also need to try to do case-insensitive lookups.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach the refresh callback helper function for unqualified FSEvents
(pathnames without a trailing slash) to return the number of
cache-entries that were invalided in response to the event.
This will be used in a later commit to help determine if the observed
pathname was (possibly) case-incorrect when (on a case-insensitive
file system).
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In b0f6b6b523 (refs/reftable: don't fail empty transactions in repo
without HEAD, 2024-02-27), we have added a new test to t0610. This test
contains a useless assignment to a variable that is never actually used.
Remove it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduced z/OS (OS/390) as a platform in config.mak.uname
Signed-off-by: Haritha D <harithamma.d@ibm.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Modernize the formatting of the test script to align with current
standards and improve its overall readability.
Signed-off-by: Aryan Gupta <garyan447@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ancestry-path is an option, not a command - mark it as such.
This brings it in sync with the rest of usages in the file
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
bare in that context is an option, not purely an adjective
Mark it properly
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark --force as option rather than variable names
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-branch(1) will error out if you give it a bad ref name. But the user
might not understand why or what part of the name is illegal.
The user might know that there are some limitations based on the *loose
ref* format (filenames), but there are also further rules for
easier integration with shell-based tools, pathname expansion, and
playing well with reference name expressions.
The man page for git-check-ref-format(1) contains these rules. Let’s
advise about it since that is not a command that you just happen
upon. Also make this advise configurable since you might not want to be
reminded every time you make a little typo.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use double quotes like we use for “die” in this document.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use backticks for inline-verbatim rather than single quotes. Also quote
the unquoted ref globs.
Also replace “the add command” with “`git add`”.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In general, rewrite entries to the following form:
1. Clause or sentence describing when the advice is shown
2. Optional “to <verb>” clause which says what the advice is
about (e.g. for resetNoRefresh: tell the user that they can use
`--no-refresh`)
Concretely:
1. Use “shown” instead of “advice shown”
• “advice” is implied and a bit repetitive
2. Use “when” instead of “if”
3. Lead with “Shown when” and end the entry with the effect it has,
where applicable
4. Use “the user” instead of “a user” or “you”
5. implicitIdentity: rewrite description in order to lead with *when*
the advice is shown (see point (3))
6. Prefer the present tense (with the exception of pushNonFFMatching)
7. waitingForEditor: give example of relevance in this new context
8. pushUpdateRejected: exception to the above principles
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests use a preliminary heredoc for `expect` or have setup and
teardown commands before and after, respectively. It is however
preferred to keep all the logic in the test itself. Let’s move these
into the tests.
Also:
• Remove a now-irrelevant comment about test placement and switch back
to `main` post-test
• Prefer indented literal heredocs (`-\EOF`) except for a block which
says that this is intentional
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
d3b3419f8f (config: tell the user that we expect an ASCII character,
2023-03-27) updated an error message to make clear that this option
specifically wants an ASCII character but neglected to consider the
config documentation.
Reported-by: Manlio Perillo <manlio.perillo@gmail.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also update the DEF_VER in GIT-VERSION-GEN, which I forgot to do
earlier (it should have been done when we started the new cycle).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git for-each-ref" learned "--include-root-refs" option to show
even the stuff outside the 'refs/' hierarchy.
* kn/for-all-refs:
for-each-ref: add new option to include root refs
ref-filter: rename 'FILTER_REFS_ALL' to 'FILTER_REFS_REGULAR'
refs: introduce `refs_for_each_include_root_refs()`
refs: extract out `loose_fill_ref_dir_regular_file()`
refs: introduce `is_pseudoref()` and `is_headref()`
When a merge conflicted at a submodule, merge-ort backend used to
unconditionally give a lengthy message to suggest how to resolve
it. Now the message can be squelched as an advice message.
* pb/ort-make-submodule-conflict-message-an-advice:
merge-ort: turn submodule conflict suggestions into an advice
Clarify wording in the CodingGuidelines that requires <git-compat-util.h>
to be the first header file.
* jc/doc-compat-util:
doc: clarify the wording on <git-compat-util.h> requirement
An error message from "git upload-pack", which responds to "git
fetch" requests, had a trialing NUL in it, which has been
corrected.
* sg/upload-pack-error-message-fix:
upload-pack: don't send null character in abort message to the client
Many small allocations "git name-rev" makes have been updated to
allocate from a mem-pool.
* rs/name-rev-with-mempool:
name-rev: use mem_pool_strfmt()
mem-pool: add mem_pool_strfmt()
The logic to access reflog entries by date and number had ugly
corner cases at the boundaries, which have been cleaned up.
* jk/reflog-special-cases-fix:
read_ref_at(): special-case ref@{0} for an empty reflog
get_oid_basic(): special-case ref@{n} for oldest reflog entry
Revert "refs: allow @{n} to work with n-sized reflog"
The code incorrectly attempted to use textconv cache when asked,
even when we are not running in a repository, which has been
corrected.
* jk/textconv-cache-outside-repo-fix:
userdiff: skip textconv caching when not in a repository
The reflog iterator enumerates all reflogs known to a ref backend. In
the "reftable" backend there is no way to list all existing reflogs
directly. Instead, we have to iterate through all reflog entries and
discard all those redundant entries for which we have already returned a
reflog entry.
This logic is implemented by tracking the last reflog name that we have
emitted to the iterator's user. If the next log record has the same name
we simply skip it until we find another record with a different refname.
This last reflog name is stored in a simple C string, which requires us
to free and reallocate it whenever we need to update the reflog name.
Convert it to use a `struct strbuf` instead, which reduces the number of
allocations. Before:
HEAP SUMMARY:
in use at exit: 13,473 bytes in 122 blocks
total heap usage: 1,068,485 allocs, 1,068,363 frees, 281,122,886 bytes allocated
After:
HEAP SUMMARY:
in use at exit: 13,473 bytes in 122 blocks
total heap usage: 68,485 allocs, 68,363 frees, 256,234,072 bytes allocated
Note that even after this change we still allocate quite a lot of data,
even though the number of allocations does not scale with the number of
log records anymore. This remainder comes mostly from decompressing the
log blocks, where we decompress each block into newly allocated memory.
This will be addressed at a later point in time.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When decoding log records we need a temporary buffer to decode the
reflog entry's name, mail address and message. As this buffer is local
to the function we thus have to reallocate it for every single log
record which we're about to decode, which is inefficient.
Refactor the code such that callers need to pass in a scratch buffer,
which allows us to reuse it for multiple decodes. This reduces the
number of allocations when iterating through reflogs. Before:
HEAP SUMMARY:
in use at exit: 13,473 bytes in 122 blocks
total heap usage: 2,068,487 allocs, 2,068,365 frees, 305,122,946 bytes allocated
After:
HEAP SUMMARY:
in use at exit: 13,473 bytes in 122 blocks
total heap usage: 1,068,485 allocs, 1,068,363 frees, 281,122,886 bytes allocated
Note that this commit also drop some redundant calls to `strbuf_reset()`
right before calling `decode_string()`. The latter already knows to
reset the buffer, so there is no need for these.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Same as the preceding commit we can allocate log messages as needed when
decoding log records, thus further reducing the number of allocations.
Before:
HEAP SUMMARY:
in use at exit: 13,473 bytes in 122 blocks
total heap usage: 3,068,488 allocs, 3,068,366 frees, 307,122,961 bytes allocated
After:
HEAP SUMMARY:
in use at exit: 13,473 bytes in 122 blocks
total heap usage: 2,068,487 allocs, 2,068,365 frees, 305,122,946 bytes allocated
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When decoding a log record we always reallocate their refname arrays.
This results in quite a lot of needless allocation churn.
Refactor the code to grow the array as required only. Like this, we
should usually only end up reallocating the array a small handful of
times when iterating over many refs. Before:
HEAP SUMMARY:
in use at exit: 13,473 bytes in 122 blocks
total heap usage: 4,068,487 allocs, 4,068,365 frees, 332,011,793 bytes allocated
After:
HEAP SUMMARY:
in use at exit: 13,473 bytes in 122 blocks
total heap usage: 3,068,488 allocs, 3,068,366 frees, 307,122,961 bytes allocated
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each reflog entry contains information regarding the authorship of who
has made the change. This authorship information is not the same as that
of any of the commits that the reflog entry references, but instead
corresponds to the local user that has executed the command. Thus, it is
almost always the case that all reflog entries have the same author.
We can make use of this fact when decoding reftable records: instead of
freeing and then reallocating the authorship information of log records,
we can special-case when the next record during an iteration has the
exact same authorship as the preceding record. If so, then there is no
need to reallocate the respective fields.
This change results in two allocations less per log record that we're
iterating over in the most common case. Before:
HEAP SUMMARY:
in use at exit: 13,473 bytes in 122 blocks
total heap usage: 6,068,489 allocs, 6,068,367 frees, 361,011,822 bytes allocated
After:
HEAP SUMMARY:
in use at exit: 13,473 bytes in 122 blocks
total heap usage: 4,068,487 allocs, 4,068,365 frees, 332,011,793 bytes allocated
An alternative would be to store the capacity of both name and email and
then use `REFTABLE_ALLOC_GROW()` to conditionally reallocate the array.
But reftable records are copied around quite a lot, and thus we need to
be a bit mindful of the overall record size. Furthermore, a memory
comparison should also be more efficient than having to copy over memory
even if we wouldn't have to allocate a new array every time.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 7af607c58d (reftable/record: store "val1" hashes as static arrays,
2024-01-03) and b31e3cc620 (reftable/record: store "val2" hashes as
static arrays, 2024-01-03) we have converted ref records to store their
object IDs in a static array. Convert log records to do the same so that
their old and new object IDs are arrays, too.
This change results in two allocations less per log record that we're
iterating over. Before:
HEAP SUMMARY:
in use at exit: 13,473 bytes in 122 blocks
total heap usage: 8,068,495 allocs, 8,068,373 frees, 401,011,862 bytes allocated
After:
HEAP SUMMARY:
in use at exit: 13,473 bytes in 122 blocks
total heap usage: 6,068,489 allocs, 6,068,367 frees, 361,011,822 bytes allocated
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When creating a new reflog iterator, we first have to reload the stack
that the iterator is being created. This is done so that any concurrent
writes to the stack are reflected. But `reflog_iterator_for_stack()`
always reloads the main stack, which is wrong.
Fix this and reload the correct stack.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ps/reftable-iteration-perf-part2:
refs/reftable: precompute prefix length
reftable: allow inlining of a few functions
reftable/record: decode keys in place
reftable/record: reuse refname when copying
reftable/record: reuse refname when decoding
reftable/merged: avoid duplicate pqueue emptiness check
reftable/merged: circumvent pqueue with single subiter
reftable/merged: handle subiter cleanup on close only
reftable/merged: remove unnecessary null check for subiters
reftable/merged: make subiters own their records
reftable/merged: advance subiter on subsequent iteration
reftable/merged: make `merged_iter` structure private
reftable/pq: use `size_t` to track iterator index
We clarified how "clean.requireForce" interacts with the "--dry-run"
option in the previous commit, both in the implementation and in the
documentation. Even when "git clean" (without other options) is
required to be used with "--force" (i.e. either clean.requireForce
is unset, or explicitly set to true) to protect end-users from
casual invocation of the command by mistake, "--dry-run" does not
require "--force" to be used, because it is already its own
protection mechanism by being a no-op to the working tree files.
The previous commit, however, missed another clean-up opportunity
around the same area. Just like in the "--dry-run" mode, the
command in the "--interactive" mode does not require "--force",
either. This is because by going interactive and giving the end
user one more chance to confirm, the mode itself is serving as its
own protection mechanism.
Let's take things one step further, and unify the code that defines
interaction between "--force" and these two other options. Just
like we added explanation for the reason why "--dry-run" does not
honor "clean.requireForce", give an explanation for the reason why
"--interactive" makes "clean.requireForce" to be ignored.
Finally, add some tests to show the interaction between "--force"
and "--interactive". We already have tests that show interaction
between "--force" and "--dry-run", but didn't test "--interactive".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're recomputing the prefix length on every iteration of the ref
iterator. Precompute it for another speedup when iterating over 1
million refs:
Benchmark 1: show-ref: single matching ref (revision = HEAD~)
Time (mean ± σ): 100.3 ms ± 3.7 ms [User: 97.3 ms, System: 2.8 ms]
Range (min … max): 97.5 ms … 139.7 ms 1000 runs
Benchmark 2: show-ref: single matching ref (revision = HEAD)
Time (mean ± σ): 95.8 ms ± 3.4 ms [User: 92.9 ms, System: 2.8 ms]
Range (min … max): 93.0 ms … 121.9 ms 1000 runs
Summary
show-ref: single matching ref (revision = HEAD) ran
1.05 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a few functions which are basically just accessors to
structures. As those functions are executed inside the hot loop when
iterating through many refs, the fact that they cannot be inlined is
costing us some performance.
Move the function definitions into their respective headers so that they
can be inlined. This results in a performance improvement when iterating
over 1 million refs:
Benchmark 1: show-ref: single matching ref (revision = HEAD~)
Time (mean ± σ): 105.9 ms ± 3.6 ms [User: 103.0 ms, System: 2.8 ms]
Range (min … max): 103.1 ms … 133.4 ms 1000 runs
Benchmark 2: show-ref: single matching ref (revision = HEAD)
Time (mean ± σ): 100.7 ms ± 3.4 ms [User: 97.8 ms, System: 2.8 ms]
Range (min … max): 97.8 ms … 124.0 ms 1000 runs
Summary
show-ref: single matching ref (revision = HEAD) ran
1.05 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading a record from a block, we need to decode the record's key.
As reftable keys are prefix-compressed, meaning they reuse a prefix from
the preceding record's key, this is a bit more involved than just having
to copy the relevant bytes: we need to figure out the prefix and suffix
lengths, copy the prefix from the preceding record and finally copy the
suffix from the current record.
This is done by passing three buffers to `reftable_decode_key()`: one
buffer that holds the result, one buffer that holds the last key, and
one buffer that points to the current record. The final key is then
assembled by calling `strbuf_add()` twice to copy over the prefix and
suffix.
Performing two memory copies is inefficient though. And we can indeed do
better by decoding keys in place. Instead of providing two buffers, the
caller may only call a single buffer that is already pre-populated with
the last key. Like this, we only have to call `strbuf_setlen()` to trim
the record to its prefix and then `strbuf_add()` to add the suffix.
This refactoring leads to a noticeable performance bump when iterating
over 1 million refs:
Benchmark 1: show-ref: single matching ref (revision = HEAD~)
Time (mean ± σ): 112.2 ms ± 3.9 ms [User: 109.3 ms, System: 2.8 ms]
Range (min … max): 109.2 ms … 149.6 ms 1000 runs
Benchmark 2: show-ref: single matching ref (revision = HEAD)
Time (mean ± σ): 106.0 ms ± 3.5 ms [User: 103.2 ms, System: 2.7 ms]
Range (min … max): 103.2 ms … 133.7 ms 1000 runs
Summary
show-ref: single matching ref (revision = HEAD) ran
1.06 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Do the same optimization as in the preceding commit, but this time for
`reftable_record_copy()`. While not as noticeable, it still results in a
small speedup when iterating over 1 million refs:
Benchmark 1: show-ref: single matching ref (revision = HEAD~)
Time (mean ± σ): 114.0 ms ± 3.8 ms [User: 111.1 ms, System: 2.7 ms]
Range (min … max): 110.9 ms … 144.3 ms 1000 runs
Benchmark 2: show-ref: single matching ref (revision = HEAD)
Time (mean ± σ): 112.5 ms ± 3.7 ms [User: 109.5 ms, System: 2.8 ms]
Range (min … max): 109.2 ms … 140.7 ms 1000 runs
Summary
show-ref: single matching ref (revision = HEAD) ran
1.01 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When decoding a reftable record we will first release the user-provided
record and then decode the new record into it. This is quite inefficient
as we basically need to reallocate at least the refname every time.
Refactor the function to start tracking the refname capacity. Like this,
we can stow away the refname, release, restore and then grow the refname
to the required number of bytes via `REFTABLE_ALLOC_GROW()`.
This refactoring is safe to do because all functions that assigning to
the refname will first call `reftable_ref_record_release()`, which will
zero out the complete record after releasing memory.
This change results in a nice speedup when iterating over 1 million
refs:
Benchmark 1: show-ref: single matching ref (revision = HEAD~)
Time (mean ± σ): 124.0 ms ± 3.9 ms [User: 121.1 ms, System: 2.7 ms]
Range (min … max): 120.4 ms … 152.7 ms 1000 runs
Benchmark 2: show-ref: single matching ref (revision = HEAD)
Time (mean ± σ): 114.4 ms ± 3.7 ms [User: 111.5 ms, System: 2.7 ms]
Range (min … max): 111.0 ms … 152.1 ms 1000 runs
Summary
show-ref: single matching ref (revision = HEAD) ran
1.08 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)
Furthermore, with this change we now perform a mostly constant number of
allocations when iterating. Before this change:
HEAP SUMMARY:
in use at exit: 13,603 bytes in 125 blocks
total heap usage: 1,006,620 allocs, 1,006,495 frees, 25,398,363 bytes allocated
After this change:
HEAP SUMMARY:
in use at exit: 13,603 bytes in 125 blocks
total heap usage: 6,623 allocs, 6,498 frees, 509,592 bytes allocated
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling `merged_iter_next_void()` we first check whether the iter
has been exhausted already. We already perform this check two levels
down the stack in `merged_iter_next_entry()` though, which makes this
check redundant.
Now if this check was there to accelerate the common case it might have
made sense to keep it. But the iterator being exhausted is rather the
uncommon case because you can expect most reftable stacks to contain
more than two refs.
Simplify the code by removing the check. As `merged_iter_next_void()` is
basically empty except for calling `merged_iter_next()` now, merge these
two functions. This also results in a tiny speedup when iterating over
many refs:
Benchmark 1: show-ref: single matching ref (revision = HEAD~)
Time (mean ± σ): 125.6 ms ± 3.8 ms [User: 122.7 ms, System: 2.8 ms]
Range (min … max): 122.4 ms … 153.4 ms 1000 runs
Benchmark 2: show-ref: single matching ref (revision = HEAD)
Time (mean ± σ): 124.0 ms ± 3.9 ms [User: 121.1 ms, System: 2.8 ms]
Range (min … max): 120.1 ms … 156.4 ms 1000 runs
Summary
show-ref: single matching ref (revision = HEAD) ran
1.01 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The merged iterator uses a priority queue to order records so that we
can yielid them in the expected order. This priority queue of course
comes with some overhead as we need to add, compare and remove entries
in that priority queue.
In the general case, that overhead cannot really be avoided. But when we
have a single subiter left then there is no need to use the priority
queue anymore because the order is exactly the same as what that subiter
would return.
While having a single subiter may sound like an edge case, it happens
more frequently than one might think. In the most common scenario, you
can expect a repository to have a single large table that contains most
of the records and then a set of smaller tables which contain later
additions to the reftable stack. In this case it is quite likely that we
exhaust subiters of those smaller stacks before exhausting the large
table.
Special-case this and return records directly from the remaining
subiter. This results in a sizeable speedup when iterating over 1m refs
in a repository with a single table:
Benchmark 1: show-ref: single matching ref (revision = HEAD~)
Time (mean ± σ): 135.4 ms ± 4.4 ms [User: 132.5 ms, System: 2.8 ms]
Range (min … max): 131.0 ms … 166.3 ms 1000 runs
Benchmark 2: show-ref: single matching ref (revision = HEAD)
Time (mean ± σ): 126.3 ms ± 3.9 ms [User: 123.3 ms, System: 2.8 ms]
Range (min … max): 122.7 ms … 157.0 ms 1000 runs
Summary
show-ref: single matching ref (revision = HEAD) ran
1.07 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When advancing one of the subiters fails we immediately release
resources associated with that subiter. This is not necessary though as
we will release these resources when closing the merged iterator anyway.
Drop the logic and only release resources when the merged iterator is
done. This is a mere cleanup that should help reduce the cognitive load
when reading through the code.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever we advance a subiter we first call `iterator_is_null()`. This
is not needed though because we only ever advance subiters which have
entries in the priority queue, and we do not end entries to the priority
queue when the subiter has been exhausted.
Drop the check as well as the now-unused function. This results in a
surprisingly big speedup:
Benchmark 1: show-ref: single matching ref (revision = HEAD~)
Time (mean ± σ): 138.1 ms ± 4.4 ms [User: 135.1 ms, System: 2.8 ms]
Range (min … max): 133.4 ms … 167.3 ms 1000 runs
Benchmark 2: show-ref: single matching ref (revision = HEAD)
Time (mean ± σ): 134.4 ms ± 4.2 ms [User: 131.5 ms, System: 2.8 ms]
Range (min … max): 130.0 ms … 164.0 ms 1000 runs
Summary
show-ref: single matching ref (revision = HEAD) ran
1.03 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For each subiterator, the merged table needs to track their current
record. This record is owned by the priority queue though instead of by
the merged iterator. This is not optimal performance-wise.
For one, we need to move around records whenever we add or remove a
record from the priority queue. Thus, the bigger the entries the more
bytes we need to copy around. And compared to pointers, a reftable
record is rather on the bigger side. The other issue is that this makes
it harder to reuse the records.
Refactor the code so that the merged iterator tracks ownership of the
records per-subiter. Instead of having records in the priority queue, we
can now use mere pointers to the per-subiter records. This also allows
us to swap records between the caller and the per-subiter record instead
of doing an actual copy via `reftable_record_copy_from()`, which removes
the need to release the caller-provided record.
This results in a noticeable speedup when iterating through many refs.
The following benchmark iterates through 1 million refs:
Benchmark 1: show-ref: single matching ref (revision = HEAD~)
Time (mean ± σ): 145.5 ms ± 4.5 ms [User: 142.5 ms, System: 2.8 ms]
Range (min … max): 141.3 ms … 177.0 ms 1000 runs
Benchmark 2: show-ref: single matching ref (revision = HEAD)
Time (mean ± σ): 139.0 ms ± 4.7 ms [User: 136.1 ms, System: 2.8 ms]
Range (min … max): 134.2 ms … 182.2 ms 1000 runs
Summary
show-ref: single matching ref (revision = HEAD) ran
1.05 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)
This refactoring also allows a subsequent refactoring where we start
reusing memory allocated by the reftable records because we do not need
to release the caller-provided record anymore.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>