Commit Graph

73107 Commits (v2.45.1)

Author SHA1 Message Date
Junio C Hamano ce47f7c85f GitHub Actions: update to checkout@v4
We seem to be getting "Node.js 16 actions are deprecated." warnings
for jobs that use checkout@v3.  Except for the i686 containers job
that is kept at checkout@v1 [*], update to checkout@v4, which is
said to use Node.js 20.

[*] 6cf4d908 (ci(main): upgrade actions/checkout to v3, 2022-12-05)
    refers to https://github.com/actions/runner/issues/2115 and
    explains why container jobs are kept at checkout@v1.  We may
    want to check the current status of the issue and move it to the
    same version as other jobs, but that is outside the scope of
    this step.

Backported-from: e94dec0c1d (GitHub Actions: update to checkout@v4, 2024-02-02)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 00:00:57 +02:00
Johannes Schindelin 6133e3a77e Merge branch 'quicker-asan-lsan'
This patch speeds up the `asan`/`lsan` jobs that are really slow enough
already.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 00:00:56 +02:00
Johannes Schindelin ea094eec54 Merge branch 'jk/test-lsan-denoise-output'
Tests with LSan from time to time seem to emit harmless message
that makes our tests unnecessarily flakey; we work it around by
filtering the uninteresting output.

* jk/test-lsan-denoise-output:
  test-lib: ignore uninteresting LSan output

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17 00:00:54 +02:00
Johannes Schindelin b12dcab61d Merge branch 'js/ci-use-macos-13'
Replace macos-12 used at GitHub CI with macos-13.

* js/ci-use-macos-13:
  ci: upgrade to using macos-13

This is another backport to `maint-2.39` to allow less CI jobs to break.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-16 23:59:03 +02:00
Johannes Schindelin c7db432de6 Merge branch 'backport/jk/libcurl-8.7-regression-workaround' into maint-2.39
Fix was added to work around a regression in libcURL 8.7.0 (which has
already been fixed in their tip of the tree).

* jk/libcurl-8.7-regression-workaround:
  remote-curl: add Transfer-Encoding header only for older curl
  INSTALL: bump libcurl version to 7.21.3
  http: reset POSTFIELDSIZE when clearing curl handle

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-16 23:58:53 +02:00
Johannes Schindelin e1813a335c Merge branch 'jk/redact-h2h3-headers-fix' into maint-2.42
HTTP Header redaction code has been adjusted for a newer version of
cURL library that shows its traces differently from earlier
versions.

* jk/redact-h2h3-headers-fix:
  http: update curl http/2 info matching for curl 8.3.0
  http: factor out matching of curl http/2 trace lines

This backport to `maint-2.39` is needed to bring the following test
cases back to a working state in conjunction with recent libcurl
versions:

- t5559.17 GIT_TRACE_CURL redacts auth details
- t5559.18 GIT_CURL_VERBOSE redacts auth details
- t5559.38 cookies are redacted by default

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-16 23:58:48 +02:00
Johannes Schindelin ef0fc42829 Merge branch 'jk/httpd-test-updates'
Test update.

* jk/httpd-test-updates:
  t/lib-httpd: increase ssl key size to 2048 bits
  t/lib-httpd: drop SSLMutex config
  t/lib-httpd: bump required apache version to 2.4
  t/lib-httpd: bump required apache version to 2.2

This is a backport onto the `maint-2.39` branch, to improve the CI
health of that branch.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-16 23:58:40 +02:00
Johannes Schindelin e3cbeb9673 Merge branch 'jk/http-test-fixes'
Various fix-ups on HTTP tests.

* jk/http-test-fixes:
  t5559: make SSL/TLS the default
  t5559: fix test failures with LIB_HTTPD_SSL
  t/lib-httpd: enable HTTP/2 "h2" protocol, not just h2c
  t/lib-httpd: respect $HTTPD_PROTO in expect_askpass()
  t5551: drop curl trace lines without headers
  t5551: handle v2 protocol in cookie test
  t5551: simplify expected cookie file
  t5551: handle v2 protocol in upload-pack service test
  t5551: handle v2 protocol when checking curl trace
  t5551: stop forcing clone to run with v0 protocol
  t5551: handle HTTP/2 when checking curl trace
  t5551: lower-case headers in expected curl trace
  t5551: drop redundant grep for Accept-Language
  t5541: simplify and move "no empty path components" test
  t5541: stop marking "used receive-pack service" test as v0 only
  t5541: run "used receive-pack service" test earlier

This is a backport onto the `maint-2.39` branch, starting to take care
of making that branch's CI builds healthy again.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-16 23:58:06 +02:00
Johannes Schindelin cf5dcd817a ci(linux-asan/linux-ubsan): let's save some time
Every once in a while, the `git-p4` tests flake for reasons outside of
our control. It typically fails with "Connection refused" e.g. here:
https://github.com/git/git/actions/runs/5969707156/job/16196057724

	[...]
	+ git p4 clone --dest=/home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git //depot
	  Initialized empty Git repository in /home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git/.git/
	  Perforce client error:
		Connect to server failed; check $P4PORT.
		TCP connect to localhost:9807 failed.
		connect: 127.0.0.1:9807: Connection refused
	  failure accessing depot: could not run p4
	  Importing from //depot into /home/runner/work/git/git/t/trash directory.t9807-git-p4-submit/git
	 [...]

This happens in other jobs, too, but in the `linux-asan`/`linux-ubsan`
jobs it hurts the most because those jobs often take an _awfully_ long
time to run, therefore re-running a failed `linux-asan`/`linux-ubsan`
jobs is _very_ costly.

The purpose of the `linux-asan`/`linux-ubsan` jobs is to exercise the C
code of Git, anyway, and any part of Git's source code that the `git-p4`
tests run and that would benefit from the attention of ASAN/UBSAN are
run better in other tests anyway, as debugging C code run via Python
scripts can get a bit hairy.

In fact, it is not even just `git-p4` that is the problem (even if it
flakes often enough to be problematic in the CI builds), but really the
part about Python scripts. So let's just skip any Python parts of the
tests from being run in that job.

For good measure, also skip the Subversion tests because debugging C
code run via Perl scripts is as much fun as debugging C code run via
Python scripts. And it will reduce the time this very expensive job
takes, which is a big benefit.

Backported to `maint-2.39` as another step to get that branch's CI
builds back to a healthy state.

Backported-from: 6ba913629f (ci(linux-asan-ubsan): let's save some time, 2023-08-29)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-16 23:58:04 +02:00
Jeff King c67cf4c434 test-lib: ignore uninteresting LSan output
When I run the tests in leak-checking mode the same way our CI job does,
like:

  make SANITIZE=leak \
       GIT_TEST_PASSING_SANITIZE_LEAK=true \
       GIT_TEST_SANITIZE_LEAK_LOG=true \
       test

then LSan can racily produce useless entries in the log files that look
like this:

  ==git==3034393==Unable to get registers from thread 3034307.

I think they're mostly harmless based on the source here:

  7e0a52e8e9/compiler-rt/lib/lsan/lsan_common.cpp (L414)

which reads:

    PtraceRegistersStatus have_registers =
        suspended_threads.GetRegistersAndSP(i, &registers, &sp);
    if (have_registers != REGISTERS_AVAILABLE) {
      Report("Unable to get registers from thread %llu.\n", os_id);
      // If unable to get SP, consider the entire stack to be reachable unless
      // GetRegistersAndSP failed with ESRCH.
      if (have_registers == REGISTERS_UNAVAILABLE_FATAL)
        continue;
      sp = stack_begin;
    }

The program itself still runs fine and LSan doesn't cause us to abort.
But test-lib.sh looks for any non-empty LSan logs and marks the test as
a failure anyway, under the assumption that we simply missed the failing
exit code somehow.

I don't think I've ever seen this happen in the CI job, but running
locally using clang-14 on an 8-core machine, I can't seem to make it
through a full run of the test suite without having at least one
failure. And it's a different one every time (though they do seem to
often be related to packing tests, which makes sense, since that is one
of our biggest users of threaded code).

We can hack around this by only counting LSan log files that contain a
line that doesn't match our known-uninteresting pattern.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-16 23:58:04 +02:00
Johannes Schindelin 3167b60e5b ci: upgrade to using macos-13
In April, GitHub announced that the `macos-13` pool is available:
https://github.blog/changelog/2023-04-24-github-actions-macos-13-is-now-available/.
It is only a matter of time until the `macos-12` pool is going away,
therefore we should switch now, without pressure of a looming deadline.

Since the `macos-13` runners no longer include Python2, we also drop
specifically testing with Python2 and switch uniformly to Python3, see
https://github.com/actions/runner-images/blob/HEAD/images/macos/macos-13-Readme.md
for details about the software available on the `macos-13` pool's
runners.

Also, on macOS 13, Homebrew seems to install a `gcc@9` package that no
longer comes with a regular `unistd.h` (there seems only to be a
`ssp/unistd.h`), and hence builds would fail with:

    In file included from base85.c:1:
    git-compat-util.h:223:10: fatal error: unistd.h: No such file or directory
      223 | #include <unistd.h>
          |          ^~~~~~~~~~
    compilation terminated.

The reason why we install GCC v9.x explicitly is historical, and back in
the days it was because it was the _newest_ version available via
Homebrew: 176441bfb5 (ci: build Git with GCC 9 in the 'osx-gcc' build
job, 2019-11-27).

To reinstate the spirit of that commit _and_ to fix that build failure,
let's switch to the now-newest GCC version: v13.x.

Backported-from: 682a868f67 (ci: upgrade to using macos-13, 2023-11-03)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-16 23:58:03 +02:00
Johannes Schindelin bddc176e79 Merge branch 'jh/fsmonitor-darwin-modernize'
Stop using deprecated macOS API in fsmonitor.

* jh/fsmonitor-darwin-modernize:
  fsmonitor: eliminate call to deprecated FSEventStream function

This backport to `maint-2.39` is needed to be able to build on
`macos-13`, which we need to update to as we restore the CI health of
that branch.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2024-04-16 23:55:55 +02:00
Junio C Hamano 21306a098c The twentieth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 14:50:31 -07:00
Junio C Hamano 93e3f9df7a Merge branch 'pw/t3428-cleanup'
Test cleanup.

* pw/t3428-cleanup:
  t3428: restore coverage for "apply" backend
  t3428: use test_commit_message
  t3428: modernize test setup
2024-04-16 14:50:31 -07:00
Junio C Hamano 51c15ac1b6 Merge branch 'ba/osxkeychain-updates'
Update osxkeychain backend with features required for the recent
credential subsystem.

* ba/osxkeychain-updates:
  osxkeychain: store new attributes
  osxkeychain: erase matching passwords only
  osxkeychain: erase all matching credentials
  osxkeychain: replace deprecated SecKeychain API
2024-04-16 14:50:30 -07:00
Junio C Hamano 82a31ec324 Merge branch 'jt/reftable-geometric-compaction'
The strategy to compact multiple tables of reftables after many
operations accumulate many entries has been improved to avoid
accumulating too many tables uncollected.

* jt/reftable-geometric-compaction:
  reftable/stack: use geometric table compaction
  reftable/stack: add env to disable autocompaction
  reftable/stack: expose option to disable auto-compaction
2024-04-16 14:50:30 -07:00
Junio C Hamano 2b49e41155 Merge branch 'tb/make-indent-conditional-with-non-spaces'
Adjust to an upcoming changes to GNU make that breaks our Makefiles.

* tb/make-indent-conditional-with-non-spaces:
  Makefile(s): do not enforce "all indents must be done with tab"
  Makefile(s): avoid recipe prefix in conditional statements
2024-04-16 14:50:29 -07:00
Junio C Hamano a7589384d5 Merge branch 'rs/usage-fallback-to-show-message-format'
vreportf(), which is usede by error() and friends, has been taught
to give the error message printf-format string when its vsnprintf()
call fails, instead of showing nothing useful to identify the
nature of the error.

* rs/usage-fallback-to-show-message-format:
  usage: report vsnprintf(3) failure
2024-04-16 14:50:29 -07:00
Junio C Hamano 107313eb11 Merge branch 'rs/date-mode-pass-by-value'
The codepaths that reach date_mode_from_type() have been updated to
pass "struct date_mode" by value to make them thread safe.

* rs/date-mode-pass-by-value:
  date: make DATE_MODE thread-safe
2024-04-16 14:50:29 -07:00
Junio C Hamano 2d642afb0a Merge branch 'sj/userdiff-c-sharp'
The userdiff patterns for C# has been updated.

Acked-by: Johannes Sixt <j6t@kdbg.org>
cf. <c2154457-3f2f-496e-9b8b-c8ea7257027b@kdbg.org>

* sj/userdiff-c-sharp:
  userdiff: better method/property matching for C#
2024-04-16 14:50:28 -07:00
Junio C Hamano 625ef1c6f1 Merge branch 'tb/t7700-fixup'
Test fix.

* tb/t7700-fixup:
  t/t7700-repack.sh: fix test breakages with `GIT_TEST_MULTI_PACK_INDEX=1 `
2024-04-16 14:50:28 -07:00
Junio C Hamano 92e8388bd3 Merge branch 'jc/local-extern-shell-rules'
Document and apply workaround for a buggy version of dash that
mishandles "local var=val" construct.

* jc/local-extern-shell-rules:
  t1016: local VAR="VAL" fix
  t0610: local VAR="VAL" fix
  t: teach lint that RHS of 'local VAR=VAL' needs to be quoted
  t: local VAR="VAL" (quote ${magic-reference})
  t: local VAR="VAL" (quote command substitution)
  t: local VAR="VAL" (quote positional parameters)
  CodingGuidelines: quote assigned value in 'local var=$val'
  CodingGuidelines: describe "export VAR=VAL" rule
2024-04-16 14:50:27 -07:00
René Scharfe 20fee9af9e apply: avoid using fixed-size buffer in write_out_one_reject()
On some systems PATH_MAX is not a hard limit.  Support longer paths by
building them on the heap instead of using static buffers.

Take care to work around (arguably buggy) implementations of free(3)
that change errno by calling it only after using the errno value.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 13:38:35 -07:00
Marcel Röthke 167395bb47 rerere: fix crashes due to unmatched opening conflict markers
When rerere handles a conflict with an unmatched opening conflict marker
in a file with other conflicts, it will fail create a preimage and also
fail allocate the status member of struct rerere_dir. Currently the
status member is allocated after the error handling. This will lead to a
SEGFAULT when the status member is accessed during cleanup of the failed
parse.

Additionally, in subsequent executions of rerere, after removing the
MERGE_RR.lock manually, rerere crashes for a similar reason. MERGE_RR
points to a conflict id that has no preimage, therefore the status
member is not allocated and a SEGFAULT happens when trying to check if a
preimage exists.

Solve this by making sure the status field is allocated correctly and add
tests to prevent the bug from reoccurring.

This does not fix the root cause, failing to parse stray conflict
markers, but I don't think we can do much better than recognizing it,
printing an error, and moving on gracefully.

Signed-off-by: Marcel Röthke <marcel@roethke.info>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 08:42:36 -07:00
Junio C Hamano 548fe35913 The ninteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 14:11:44 -07:00
Junio C Hamano cb25f97eab Merge branch 'jc/t2104-style-fixes'
Test style fixes.

* jc/t2104-style-fixes:
  t2104: style fixes
2024-04-15 14:11:44 -07:00
Junio C Hamano b415f15b49 Merge branch 'jc/unleak-core-excludesfile'
The variable that holds the value read from the core.excludefile
configuration variable used to leak, which has been corrected.

* jc/unleak-core-excludesfile:
  config: do not leak excludes_file
2024-04-15 14:11:44 -07:00
Junio C Hamano eba498a774 Merge branch 'jk/libcurl-8.7-regression-workaround'
Fix was added to work around a regression in libcURL 8.7.0 (which has
already been fixed in their tip of the tree).

* jk/libcurl-8.7-regression-workaround:
  remote-curl: add Transfer-Encoding header only for older curl
  INSTALL: bump libcurl version to 7.21.3
  http: reset POSTFIELDSIZE when clearing curl handle
2024-04-15 14:11:44 -07:00
Junio C Hamano 372aabe912 Merge branch 'ps/t0610-umask-fix'
The "shared repository" test in the t0610 reftable test failed
under restrictive umask setting (e.g. 007), which has been
corrected.

* ps/t0610-umask-fix:
  t0610: execute git-pack-refs(1) with specified umask
  t0610: make `--shared=` tests reusable
2024-04-15 14:11:43 -07:00
Junio C Hamano d75ec4c627 Merge branch 'gt/add-u-commit-i-pathspec-check'
"git add -u <pathspec>" and "git commit [-i] <pathspec>" did not
diagnose a pathspec element that did not match any files in certain
situations, unlike "git add <pathspec>" did.

* gt/add-u-commit-i-pathspec-check:
  builtin/add: error out when passing untracked path with -u
  builtin/commit: error out when passing untracked path with -i
  revision: optionally record matches with pathspec elements
2024-04-15 14:11:43 -07:00
Junio C Hamano 6c142bc846 Merge branch 'ds/fetch-config-parse-microfix'
A config parser callback function fell through instead of returning
after recognising and processing a variable, wasting cycles, which
has been corrected.

* ds/fetch-config-parse-microfix:
  fetch: return when parsing submodule.recurse
2024-04-15 14:11:43 -07:00
Junio C Hamano ce729ea9ba Merge branch 'rs/apply-reject-fd-leakfix'
A file descriptor leak in an error codepath, used when "git apply
--reject" fails to create the *.rej file, has been corrected.

* rs/apply-reject-fd-leakfix:
  apply: don't leak fd on fdopen() error
2024-04-15 14:11:43 -07:00
Junio C Hamano c7a9ec4728 Merge branch 'rs/apply-lift-path-length-limit'
"git apply" has been updated to lift the hardcoded pathname length
limit, which in turn allowed a mksnpath() function that is no
longer used.

* rs/apply-lift-path-length-limit:
  path: remove mksnpath()
  apply: avoid fixed-size buffer in create_one_file()
2024-04-15 14:11:42 -07:00
Junio C Hamano 509cc1d413 Merge branch 'ma/win32-unix-domain-socket'
Windows binary used to decide the use of unix-domain socket at
build time, but it learned to make the decision at runtime instead.

* ma/win32-unix-domain-socket:
  Win32: detect unix socket support at runtime
2024-04-15 14:11:42 -07:00
René Scharfe 21b5821acd imap-send: increase command size limit
nfvasprintf() has a 8KB limit, but it's not relevant, as its result is
combined with other strings and added to a 1KB buffer by its caller.
That 1KB limit is not mentioned in RFC 9051, which specifies IMAP.

While 1KB is plenty for user names, passwords and mailbox names,
there's no point in limiting our commands like that.  Call xstrvfmt()
instead of open-coding it and use strbuf to format the command to
send, as we need its length.  Fail hard if it exceeds INT_MAX, because
socket_write() can't take more than that.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 11:34:17 -07:00
Peter Krefting 8198993c81 bisect: report the found commit with "show"
When "git bisect" finds the first bad commit and shows it to the user,
it calls "git diff-tree" to do so, whose output is meant to be stable
and deliberately ignores end-user customizations.

As the output is supposed to be consumed by humans, replace this with
a call to "git show". This command honors configuration options (such
as "log.date" and "log.mailmap") and other UI improvements (renames
are detected).

Pass some hard-coded options to "git show" to make the output similar
to the one we are replacing, such as showing a patch summary only.

Reported-by: Michael Osipov <michael.osipov@innomotics.com>
Signed-off-By: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 11:29:09 -07:00
Yehezkel Bernat f412d72c19 Documentation: fix linkgit reference
In git-replay documentation, linkgit to git-rev-parse is missing the
man section, which breaks its rendering.

Add section number as done in other references to this command.

Signed-off-by: Yehezkel Bernat <YehezkelShB@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 11:02:43 -07:00
René Scharfe 44bdba2fa6 git-compat-util: fix NO_OPENSSL on current macOS
b195aa00c1 (git-compat-util: suppress unavoidable Apple-specific
deprecation warnings, 2014-12-16) started to define
__AVAILABILITY_MACROS_USES_AVAILABILITY in git-compat-util.h.  On
current versions it is already defined (e.g. on macOS 14.4.1).  Undefine
it before redefining it to avoid a compilation error.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 11:01:31 -07:00
Patrick Steinhardt 795006fff4 pack-bitmap: gracefully handle missing BTMP chunks
In 0fea6b73f1 (Merge branch 'tb/multi-pack-verbatim-reuse', 2024-01-12)
we have introduced multi-pack verbatim reuse of objects. This series has
introduced a new BTMP chunk, which encodes information about bitmapped
objects in the multi-pack index. Starting with dab60934e3 (pack-bitmap:
pass `bitmapped_pack` struct to pack-reuse functions, 2023-12-14) we use
this information to figure out objects which we can reuse from each of
the packfiles.

One thing that we glossed over though is backwards compatibility with
repositories that do not yet have BTMP chunks in their multi-pack index.
In that case, `nth_bitmapped_pack()` would return an error, which causes
us to emit a warning followed by another error message. These warnings
are visible to users that fetch from a repository:

```
$ git fetch
...
remote: error: MIDX does not contain the BTMP chunk
remote: warning: unable to load pack: 'pack-f6bb7bd71d345ea9fe604b60cab9ba9ece54ffbe.idx', disabling pack-reuse
remote: Enumerating objects: 40, done.
remote: Counting objects: 100% (40/40), done.
remote: Compressing objects: 100% (39/39), done.
remote: Total 40 (delta 5), reused 0 (delta 0), pack-reused 0 (from 0)
...
```

While the fetch succeeds the user is left wondering what they did wrong.
Furthermore, as visible both from the warning and from the reuse stats,
pack-reuse is completely disabled in such repositories.

What is quite interesting is that this issue can even be triggered in
case `pack.allowPackReuse=single` is set, which is the default value.
One could have expected that in this case we fall back to the old logic,
which is to use the preferred packfile without consulting BTMP chunks at
all. But either we fail with the above error in case they are missing,
or we use the first pack in the multi-pack-index. The former case
disables pack-reuse altogether, whereas the latter case may result in
reusing objects from a suboptimal packfile.

Fix this issue by partially reverting the logic back to what we had
before this patch series landed. Namely, in the case where we have no
BTMP chunks or when `pack.allowPackReuse=single` are set, we use the
preferred pack instead of consulting the BTMP chunks.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:42:00 -07:00
Patrick Steinhardt 9da5c992dd reftable/block: avoid copying block iterators on seek
When seeking a reftable record in a block we need to position the
iterator _before_ the sought-after record so that the next call to
`block_iter_next()` would yield that record. To achieve this, the loop
that performs the linear seeks to restore the previous position once it
has found the record.

This is done by advancing two `block_iter`s: one to check whether the
next record is our sought-after record, and one that we update after
every iteration. This of course involves quite a lot of copying and also
leads to needless memory allocations.

Refactor the code to get rid of the `next` iterator and the copying this
involves. Instead, we can restore the previous offset such that the call
to `next` will return the correct record.

Next to being simpler conceptually this also leads to a nice speedup.
The following benchmark parser 10k refs out of 100k existing refs via
`git-rev-list --no-walk`:

  Benchmark 1: rev-list: print many refs (HEAD~)
    Time (mean ± σ):     170.2 ms ±   1.7 ms    [User: 86.1 ms, System: 83.6 ms]
    Range (min … max):   166.4 ms … 180.3 ms    500 runs

  Benchmark 2: rev-list: print many refs (HEAD~)
    Time (mean ± σ):     161.6 ms ±   1.6 ms    [User: 78.1 ms, System: 83.0 ms]
    Range (min … max):   158.4 ms … 172.3 ms    500 runs

  Summary
    rev-list: print many refs (HEAD) ran
      1.05 ± 0.01 times faster than rev-list: print many refs (HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:37:59 -07:00
Patrick Steinhardt ce1f213cc9 reftable/block: reuse `zstream` state on inflation
When calling `inflateInit()` and `inflate()`, the zlib library will
allocate several data structures for the underlying `zstream` to keep
track of various information. Thus, when inflating repeatedly, it is
possible to optimize memory allocation patterns by reusing the `zstream`
and then calling `inflateReset()` on it to prepare it for the next chunk
of data to inflate.

This is exactly what the reftable code is doing: when iterating through
reflogs we need to potentially inflate many log blocks, but we discard
the `zstream` every single time. Instead, as we reuse the `block_reader`
for each of the blocks anyway, we can initialize the `zstream` once and
then reuse it for subsequent inflations.

Refactor the code to do so, which leads to a significant reduction in
the number of allocations. The following measurements were done when
iterating through 1 million reflog entries. Before:

  HEAP SUMMARY:
      in use at exit: 13,473 bytes in 122 blocks
    total heap usage: 23,028 allocs, 22,906 frees, 162,813,552 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,473 bytes in 122 blocks
    total heap usage: 302 allocs, 180 frees, 88,352 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt 15a60b747e reftable/block: open-code call to `uncompress2()`
The reftable format stores log blocks in a compressed format. Thus,
whenever we want to read such a block we first need to decompress it.
This is done by calling the convenience function `uncompress2()` of the
zlib library, which is a simple wrapper that manages the lifecycle of
the `zstream` structure for us.

While nice for one-off inflation of data, when iterating through reflogs
we will likely end up inflating many such log blocks. This requires us
to reallocate the state of the `zstream` every single time, which adds
up over time. It would thus be great to reuse the `zstream` instead of
discarding it after every inflation.

Open-code the call to `uncompress2()` such that we can start reusing the
`zstream` in the subsequent commit. Note that our open-coded variant is
different from `uncompress2()` in two ways:

  - We do not loop around `inflate()` until we have processed all input.
    As our input is limited by the maximum block size, which is 16MB, we
    should not hit limits of `inflate()`.

  - We use `Z_FINISH` instead of `Z_NO_FLUSH`. Quoting the `inflate()`
    documentation: "inflate() should normally be called until it returns
    Z_STREAM_END or an error. However if all decompression is to be
    performed in a single step (a single call of inflate), the parameter
    flush should be set to Z_FINISH."

    Furthermore, "Z_FINISH also informs inflate to not maintain a
    sliding window if the stream completes, which reduces inflate's
    memory footprint."

Other than that this commit is expected to be functionally equivalent
and does not yet reuse the `zstream`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt dd347bbce6 reftable/block: reuse uncompressed blocks
The reftable backend stores reflog entries in a compressed format and
thus needs to uncompress blocks before one can read records from it.
For each reflog block we thus have to allocate an array that we can
decompress the block contents into. This block is being discarded
whenever the table iterator moves to the next block. Consequently, we
reallocate a new array on every block, which is quite wasteful.

Refactor the code to reuse the uncompressed block data when moving the
block reader to a new block. This significantly reduces the number of
allocations when iterating through many compressed blocks. The following
measurements are done with `git reflog list` when listing 100k reflogs.
Before:

  HEAP SUMMARY:
      in use at exit: 13,473 bytes in 122 blocks
    total heap usage: 45,755 allocs, 45,633 frees, 254,779,456 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,473 bytes in 122 blocks
    total heap usage: 23,028 allocs, 22,906 frees, 162,813,547 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt b00bcb7c49 reftable/reader: iterate to next block in place
The table iterator has to iterate towards the next block once it has
yielded all records of the current block. This is done by creating a new
table iterator, initializing it to the next block, releasing the old
iterator and then copying over the data.

Refactor the code to instead advance the table iterator in place. This
is simpler and unlocks some optimizations in subsequent patches. Also,
it allows us to avoid some allocations.

The following measurements show a single matching ref out of 1 million
refs. Before this change:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 7,235 allocs, 7,110 frees, 301,481 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 315 allocs, 190 frees, 107,027 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt bcdc586db0 reftable/block: move ownership of block reader into `struct table_iter`
The table iterator allows the caller to iterate through all records in a
reftable table. To do so it iterates through all blocks of the desired
type one by one, where for each block it creates a new block iterator
and yields all its entries.

One of the things that is somewhat confusing in this context is who owns
the block reader that is being used to read the blocks and pass them to
the block iterator. Intuitively, as the table iterator is responsible
for iterating through the blocks, one would assume that this iterator is
also responsible for managing the lifecycle of the reader. And while it
somewhat is, the block reader is ultimately stored inside of the block
iterator.

Refactor the code such that the block reader is instead fully managed by
the table iterator. Instead of passing the reader to the block iterator,
we now only end up passing the block data to it. Despite clearing up the
lifecycle of the reader, it will also allow for better reuse of the
reader in subsequent patches.

The following benchmark prints a single matching ref out of 1 million
refs. Before:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 6,607 allocs, 6,482 frees, 509,635 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 7,235 allocs, 7,110 frees, 301,481 bytes allocated

Note that while there are more allocation and free calls now, the
overall number of bytes allocated is significantly lower. The number of
allocations will be reduced significantly by the next patch though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt b371221a60 reftable/block: introduce `block_reader_release()`
Introduce a new function `block_reader_release()` that releases
resources acquired by the block reader. This function will be extended
in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt aac8c03cc4 reftable/block: better grouping of functions
Function definitions and declaration of `struct block_reader` and
`struct block_iter` are somewhat mixed up, making it hard to see which
functions belong together. Rearrange them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt 42c7bdc36d reftable/block: merge `block_iter_seek()` and `block_reader_seek()`
The function `block_iter_seek()` is merely a simple wrapper around
`block_reader_seek()`. Merge those two functions into a new function
`block_iter_seek_key()` that more clearly says what it is actually
doing.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt 3122d44025 reftable/block: rename `block_reader_start()`
The function `block_reader_start()` does not really apply to the block
reader, but to the block iterator. It's name is thus somewhat confusing.
Rename it to `block_iter_seek_start()` to clarify.

We will rename `block_reader_seek()` in similar spirit in the next
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Rubén Justo 48e1ca27b1 launch_editor: waiting message on error
When advice.waitingForEditor configuration is not set to false, we show
a hint telling that we are waiting for user's editor to close the file
when we launch an editor and wait for it to return control back to us.
We give the message on an incomplete line, expecting that we can go back
to the beginning of the line and clear the message when the editor returns.

However, it is possible that the editor exits with an error status, in
which case we show an error message and then return to our caller.  In
such a case, the error message is given where the terminal cursor
happens to be, which is most likely after the "we are waiting for your
editor" message on the same line.

Clear the line before showing the error.

While we're here, make the error message follow our CodingGuideLines.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:13:32 -07:00