The connectivity check has been refactored to search for promisor
objects in a generic way using the object database interface,
rather than iterating packfiles directly. This allows connectivity
checks to work properly in repositories that do not use packfiles.
* ps/connected-generic-promisor-checks:
connected: search promisor objects generically
connected: split out promisor-based connectivity check
odb/source-packed: support flags when iterating an object prefix
odb/source-packed: extract logic to skip certain packs
Reference backend configuration is now loaded lazily to avoid
recursive calls during repository initialization when "onbranch"
configuration conditions are evaluated. This also fixes a memory
leak and allows dropping the unused `chdir_notify_reparent()`
machinery.
* ps/refs-onbranch-fixes:
refs: protect against chicken-and-egg recursion
refs/reftable: lazy-load configuration to fix chicken-and-egg
reftable: split up write options
refs/files: lazy-load configuration to fix chicken-and-egg
refs: move parsing of "core.logAllRefUpdates" back into ref stores
repository: free main reference database
chdir-notify: drop unused `chdir_notify_reparent()`
refs: unregister reference stores from "chdir_notify"
setup: don't apply "GIT_REFERENCE_BACKEND" without a repository
setup: stop applying repository format twice
setup: inline `check_and_apply_repository_format()`
Documentation on community contribution guidelines has been updated to
encourage replying to review comments before rerolling, and to advise
a default limit of at most one reroll per day to give reviewers across
different time zones enough time to participate.
* wy/doc-clarify-review-replies:
doc: advise batching patch rerolls
doc: encourage review replies before rerolling
The "git repo info" command has been taught new keys to output both
absolute and relative paths for "gitdir" and "commondir", supported by
a new path-formatting helper extracted from "git rev-parse".
* jk/repo-info-path-keys:
repo: add path.gitdir with absolute and relative suffix formatting
repo: add path.commondir with absolute and relative suffix formatting
path: extract format_path() and use in rev-parse
The Apache timeout in HTTP tests has been increased to prevent test
failures on heavily loaded CI runners. The tests creating an
enormous number of refs have been isolated to their own repositories
to avoid slowing down subsequent tests.
* jk/t5551-expensive-test-timeouts-fix:
t5551: put many-tags case into its own repo
t/lib-httpd: bump apache timeout
Most of the t5551 http fetch tests use a handful of refs. But there are
a few test cases which check our handling of large numbers of refs.
These tests use the same server-side repo, so all subsequent tests end
up having to consider those extra refs, too.
The result is that the test script is a bit slower than it needs to be.
In a normal run, moving the "2,000 tags" test into its own repo drops my
runtime for the whole script from ~2.7s to ~1.9s.
This is a modest gain, but when we add the "--long" flag it gets much
bigger. There we trigger a test (marked with EXPENSIVE) that adds
100,000 tags, and the script runtime jumps to ~95s. But if we use the
same "many tags" repo for that, our runtime drops to just ~37s.
This is a pretty easy win to drop the cost of the script. It may even be
a larger gain on a heavily loaded system, since one of the main costs
here is unpacked refs, which are heavy on system time and I/O costs.
It's possible we are reducing test coverage, since all of those other
tests were inadvertently using large ref advertisements (and thus could
have uncovered some unexpected interaction). But that seems somewhat
unlikely; the tests targeted at the large number of refs are doing
roughly similar things to the other tests.
Note that the real performance culprit is the 100k-tag --long test, not
the 2k-tag one. So we could just let the 100k one use its own repo, and
keep the 2k tags in the main repo. But since these two tests are
somewhat interlinked, it's easier to just move them both (and it does
provide a small gain even for the 2000-tag test). I also notice that the
2000-tag test is gated on the CMDLINE_LIMIT prereq, and without that the
later EXPENSIVE test will fail (since we won't have a too-many-refs
clone). Nobody seems to have noticed or complained after many years, and
I left it alone for this patch.
Signed-off-by: Jeff King <peff@peff.net>
[jc: made the new "many-tags.git" bare to match the original "repo.git"]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We lost the ability to use https:// proxies during this cycle; this is
a hotfix for the regression.
* js/http-https-proxy-fix:
http: accept https:// proxies again
Since enabling more tests with 7a094d68a2 (ci: run expensive tests on
push builds to integration branches, 2026-05-08), we sometimes see test
failures or timeouts in GitHub CI. The culprit seems to be the "enormous
ref negotiation" test in t5551, which creates ~100k tag refs in our http
server-side repo.
Iterating through the loose refs of this repo to generate a ref
advertisement can take a long time, especially on a platform with slow
I/O. On my otherwise unloaded local machine, a cold cache ref
advertisement takes ~10s. On a busy CI machine running tests in
parallel, it can presumably top 60s, which runs afoul of Apache's
default CGI timeout.
The result in t5551 is a test failure, where Apache simply hangs up the
connection and the client reports an error. But worse, t5559 runs the
same test with HTTP/2, and a bug in Apache causes the connection to hang
indefinitely! We eventually see this as a CI timeout after 6 hours.
Let's bump Apache's timeout to something much larger: 600 seconds. This
doesn't eliminate the possibility of a timeout, but it makes it much
less likely. It should eliminate both the test failures and the CI
timeouts in practice, and it protects us from running into similar
problems with other tests in the future.
There are two counter-arguments to consider.
One, could/should we just make the test faster? Probably yes. The
biggest mistake here is having such an absurd number of unpacked refs on
a system which is bottle-necked on I/O. But I think it's worth bumping
the timeout so that we can fix this (and possibly other) correctness
issues, and then consider performance separately (which we'll do in
subsequent patches).
And two, is this just papering over a problem that users might see in
the real world? We could teach Git to handle this case more gracefully
with optimizations or keep-alives. But I think it's really an artificial
situation. You need a combination of this silly number of loose refs,
plus a very heavily loaded system. If you were trying to run a real
server and it took more than 60s to generate the ref advertisement, I
don't think the timeout is your biggest problem. Your crappy service is,
and you should adjust your resources to match your load. I.e., it is
probably reasonable for Git to assume that advertisements happen
fast-ish and don't need protocol-level keepalives.
Though the patch here is small, tons of work went into analyzing the
problem. Many thanks to the contributors credited below.
Helped-by: Michael Montalbo <mmontalbo@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 663d7abe07 (http: reject unsupported proxy URL schemes,
2026-05-05), set_curl_proxy_type() returns 0 only for the "http"
and SOCKS variants via dedicated early returns, and -1 for
everything else. The "https" branch configures the CURL handle for
HTTPS proxying but then falls through to the trailing `return -1`
intended for unknown schemes, so the caller in get_curl_handle()
treats a perfectly valid https:// proxy URL as unsupported and
refuses to use it.
Noticed while looking into a Coverity report against the same
function; the unchecked curl_easy_setopt() return values it flags
are orthogonal to this fix.
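The fall-through described above can be illustrated with a minimal sketch. The names below (`enum proxy_kind`, `set_proxy_kind_buggy()`, `set_proxy_kind_fixed()`) are hypothetical stand-ins and do not mirror the real code in http.c; the only point is how a branch that configures its result but lacks an early return ends up at the error path meant for unknown schemes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the proxy type that would be set on a handle. */
enum proxy_kind { PROXY_NONE, PROXY_HTTP, PROXY_HTTPS, PROXY_SOCKS5 };

/*
 * Buggy shape: "http" and "socks5" return early, but the "https" branch
 * only configures the type and then reaches the trailing error return.
 */
static int set_proxy_kind_buggy(const char *scheme, enum proxy_kind *out)
{
	assert(scheme && out);
	if (!strcmp(scheme, "http")) {
		*out = PROXY_HTTP;
		return 0;
	}
	if (!strcmp(scheme, "socks5")) {
		*out = PROXY_SOCKS5;
		return 0;
	}
	if (!strcmp(scheme, "https"))
		*out = PROXY_HTTPS; /* configured, but no early return */
	return -1; /* intended for unknown schemes only */
}

/* Fixed shape: every supported scheme returns 0 from its own branch. */
static int set_proxy_kind_fixed(const char *scheme, enum proxy_kind *out)
{
	assert(scheme && out);
	if (!strcmp(scheme, "http")) {
		*out = PROXY_HTTP;
		return 0;
	}
	if (!strcmp(scheme, "socks5")) {
		*out = PROXY_SOCKS5;
		return 0;
	}
	if (!strcmp(scheme, "https")) {
		*out = PROXY_HTTPS;
		return 0;
	}
	return -1;
}
```

In the buggy variant a caller sees -1 for "https" even though the output was set, which matches the symptom of a valid proxy URL being refused.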
Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several tests in 't3420-rebase-autostash.sh' start various rebase
processes that are expected to fail because of merge conflicts. The
tests [1] checking that 'git rebase --quit' and autostash work
together as expected after such a failure then run '! grep ...' to
ensure that the dirty contents of the file are gone. However, due to
the test repo's history and the choice of upstream branch that file
shouldn't exist in the conflicted state at all, and thus it shouldn't
exist after the subsequent 'git rebase --quit' either. Consequently,
this 'grep' doesn't fail as expected, i.e. because it can't find the
dirty content, but instead it fails, because it can't open the file.
Tighten this check by using 'test_path_is_missing' instead, thereby
avoiding unexpected errors from 'grep' as well.
Previously 2745817028 (t3420-rebase-autostash: don't try to grep
non-existing files, 2018-08-22) fixed a couple of similar issues; this
one was added later in 9b2df3e8d0 (rebase: save autostash entry into
stash reflog on --quit, 2020-04-28).
[1] This patch modifies only a single test, but that test is run
several times with different strategies ('--apply', '--merge', and
'--interactive'), hence the plural "tests".
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commits we have fixed recursion when creating the
reference backends due to a chicken-and-egg situation with "onbranch"
conditions. Unfortunately, this issue has existed for a while, and we
didn't really have a good mechanism to detect this recursion.
Improve the status quo by detecting the recursion when creating the main
reference store.
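One common way to detect this kind of re-entrancy is a guard flag that is set while the store is being created. The sketch below only illustrates that general technique; `struct repo`, `get_main_store()` and `config_needing_refs()` are hypothetical names, and how the real code reports the recursion is not shown here.

```c
#include <assert.h>
#include <stddef.h>

struct repo;
typedef void (*load_config_fn)(struct repo *);

/* Hypothetical repository with a lazily created main reference store. */
struct repo {
	int store_ready;            /* store has been created */
	int creating_store;         /* guard: creation is in progress */
	int recursion_seen;         /* records that re-entry was detected */
	load_config_fn load_config; /* config parsing done during creation */
};

static int get_main_store(struct repo *r)
{
	assert(r);
	if (r->store_ready)
		return 0;
	if (r->creating_store) {
		/* Re-entered while still creating the store: report it. */
		r->recursion_seen = 1;
		return -1;
	}
	r->creating_store = 1;
	if (r->load_config)
		r->load_config(r);
	r->creating_store = 0;
	r->store_ready = 1;
	return 0;
}

/* Simulates config parsing that needs the ref store, as "onbranch" does. */
static void config_needing_refs(struct repo *r)
{
	get_main_store(r);
}
```

With `config_needing_refs` installed as the loader, the inner call hits the guard instead of recursing without bound.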
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Same as with the "files" backend, the "reftable" backend also has a
chicken-and-egg problem with "onbranch" conditions. Fix this issue the
same way we did for the "files" backend, by lazy-loading configuration.
Now that both the "files" and the "reftable" backend handle this
properly, add a generic test to t1400 that verifies that the user can
configure "core.logAllRefUpdates" via an "onbranch" condition. This is
mostly a nonsensical thing to do in the first place, but it serves as a
good sanity check.
Note that we had to move `should_write_log()` around so that it can
access the new `reftable_be_write_options()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When initializing the reftable stack the caller may optionally pass some
write options. These write options mix up two different concerns though:
  - Of course, they allow the caller to configure how new reftables are
    being written.
  - But they also allow the caller to configure the stack itself, like
    its hash ID and the `on_reload` callback.
This is somewhat awkward, as it doesn't easily give the caller the
flexibility to, for example, write multiple reftables with different
options. Furthermore, this requires us to eagerly parse relevant
configuration when initializing the reftable backend.
Refactor the code by splitting out those options that configure the
stack itself. Creating a new stack will thus only require this limited
set of options, whereas the caller is expected to pass write options to
all functions that end up writing tables.
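The resulting separation of concerns might look roughly like the following sketch. All type and function names here (`struct stack_options`, `struct write_options`, `stack_init()`, `stack_add_table()`) are hypothetical and are not the reftable library's API; they only show stack-level settings being fixed at creation while write settings are passed per write.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical options that describe the stack itself. */
struct stack_options {
	unsigned hash_id;
	void (*on_reload)(void *payload);
	void *payload;
};

/* Hypothetical options that describe how a single table is written. */
struct write_options {
	unsigned block_size;
};

struct stack {
	struct stack_options opts;
	size_t nr_tables;
	unsigned last_block_size;
};

/* Creating the stack needs only the stack-level options. */
static void stack_init(struct stack *st, const struct stack_options *opts)
{
	assert(st && opts);
	st->opts = *opts;
	st->nr_tables = 0;
	st->last_block_size = 0;
}

/* Each write takes its own write options, so they may differ per table. */
static void stack_add_table(struct stack *st, const struct write_options *wopts)
{
	assert(st && wopts);
	st->last_block_size = wopts->block_size;
	st->nr_tables++;
}
```

Because write options are no longer captured at initialization, nothing forces them to be parsed before the first table is actually written.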
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When initializing the "files" reference backend we read the repository's
config to parse "core.preferSymlinkRefs" and "core.logAllRefUpdates".
This results in a chicken-and-egg problem though, because parsing the
configuration may require us to have access to the reference store
already when an "onbranch" condition exists.
Luckily, all the configuration that we honor only relates to writing
references. Consequently, we don't strictly need that configuration to
be readily available at initialization time, and we can easily defer
parsing it to a later point in time.
Implement this fix and add tests that verify that we can indeed properly
parse these config knobs via an "onbranch" condition.
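The deferral can be pictured as a load-on-first-use pattern. This is a simplified sketch under assumed names (`struct files_store`, `store_init()`, `store_load_config()`, `store_write_ref()`), not the actual "files" backend code: initialization touches no configuration, and the first write triggers parsing exactly once.

```c
#include <assert.h>

/* Hypothetical backend state with lazily loaded write configuration. */
struct files_store {
	int config_loaded;       /* has the configuration been parsed yet? */
	int prefer_symlink_refs; /* cached value once loaded */
	int config_reads;        /* how often parsing actually happened */
};

/* Initialization no longer reads any configuration. */
static void store_init(struct files_store *s)
{
	assert(s);
	s->config_loaded = 0;
	s->prefer_symlink_refs = 0;
	s->config_reads = 0;
}

/* Parse the configuration on first use only. */
static void store_load_config(struct files_store *s)
{
	assert(s);
	if (s->config_loaded)
		return;
	s->config_reads++;
	s->prefer_symlink_refs = 0; /* placeholder for real config parsing */
	s->config_loaded = 1;
}

/* Only write paths need the configuration, so only they load it. */
static int store_write_ref(struct files_store *s)
{
	store_load_config(s);
	return s->prefer_symlink_refs;
}
```

By the time a write happens the reference store already exists, so config parsing that consults references no longer has to create it.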
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cc42c88945 (refs: extract out reflog config to generic layer,
2026-05-04) we have refactored how we parse "core.logAllRefUpdates" so
that it happens in the generic layer. Unfortunately, this has worsened a
preexisting issue where we may recurse when creating the reference store
because of a chicken-and-egg problem between parsing the configuration
and evaluating "onbranch" conditions.
Prepare for a fix by essentially reverting that change so that we handle
this setting in the respective backends again. The backends are already
parsing other configuration anyway, so by moving the logic back in there
we can ensure that all backend configuration is parsed the same way.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we release worktree and submodule reference databases when
clearing a repository, we don't ever release the main reference
database. This memory leak went unnoticed because its pointer is
kept alive by the "chdir_notify" subsystem.
Fix the memory leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the preceding commit we've removed all callers of
`chdir_notify_reparent()`, so the function is unused now. Drop it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When creating reference stores we register them with the "chdir_notify"
subsystem. This is required because some of the paths we track may be
relative paths, so we have to reparent them in case the current working
directory changes.
But while we register the reference stores, we never unregister them.
This can have multiple outcomes:
  - For a repository's main reference database we essentially keep the
    pointer alive. We never free that database, either, and our leak
    checker doesn't notice because it's still registered.
  - For submodule and worktree reference databases we do eventually free
    them in `repo_clear()`, so we may keep pointers to freed memory
    registered. We never notice though as we don't tend to chdir around
    in the middle of the process.
We never noticed either of these symptoms, but they are obviously bad.
Partially fix those issues by unregistering the reference stores when
releasing them. The leak of the main reference database will be fixed in
a subsequent commit.
Note that this requires us to use `chdir_notify_register()` instead of
`chdir_notify_reparent()`, as there is no infrastructure to unregister the
latter.
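To see why unregistering matters, consider a toy listener registry. This is an illustrative sketch only: `notify_register()`, `notify_unregister()`, `notify_chdir()` and `struct store` are hypothetical and are not the "chdir_notify" API. A listener that is removed before its data is released is never called again, so no stale pointer stays reachable.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LISTENERS 8

typedef void (*chdir_cb)(void *data);

/* Toy registry of callbacks to run when the working directory changes. */
static struct {
	chdir_cb cb;
	void *data;
} listeners[MAX_LISTENERS];
static size_t nr_listeners;

static int notify_register(chdir_cb cb, void *data)
{
	assert(cb);
	if (nr_listeners == MAX_LISTENERS)
		return -1;
	listeners[nr_listeners].cb = cb;
	listeners[nr_listeners].data = data;
	nr_listeners++;
	return 0;
}

/* Drop the matching entry so its data pointer is no longer referenced. */
static void notify_unregister(chdir_cb cb, void *data)
{
	for (size_t i = 0; i < nr_listeners; i++) {
		if (listeners[i].cb == cb && listeners[i].data == data) {
			listeners[i] = listeners[nr_listeners - 1];
			nr_listeners--;
			return;
		}
	}
}

static void notify_chdir(void)
{
	for (size_t i = 0; i < nr_listeners; i++)
		listeners[i].cb(listeners[i].data);
}

/* Hypothetical store that counts how often it was asked to reparent. */
struct store {
	int reparented;
};

static void store_reparent(void *data)
{
	((struct store *)data)->reparented++;
}
```

A store released without the unregister step would leave its entry in `listeners`, which is the dangling-pointer case described for submodule and worktree databases.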
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When discovering a repository we eventually also apply the
"GIT_REFERENCE_BACKEND" environment variable to the repository. There
are two problems with that:
  - We do this unconditionally, which is rather pointless: we really
    only have to configure the repository when we have found one.
  - We have already applied the repository format at that point in time,
    so we need to manually reapply it.
Move the logic around so that we only apply the environment variable
when a repository was discovered. This also allows us to drop the
explicit call to `repo_set_ref_storage_format()` because we now adjust
the format before we apply it via `apply_repository_format()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When discovering the repository in "setup.c" we apply the final
repository format multiple times:
  - Once via `repository_format_configure()`, where we apply the hash
    algorithm and ref storage format to both `struct repository_format`
    and `struct repository`.
  - And once via `apply_repository_format()`, where we apply these two
    settings from `struct repository_format` to `struct repository`.
With the current flow both of these are in fact necessary. But this is
only because we call `repository_format_configure()` after we have
called `apply_repository_format()`. Consequently, if we only changed the
repository format in `repository_format_configure()` it would never
propagate to the repository.
Refactor the code so that we first configure the repository format
before applying it to the repository so that we can stop setting the
hash and reference storage format multiple times.
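The ordering issue can be reduced to a small sketch. The types and functions below (`struct repo_format`, `struct repo`, `format_configure()`, `format_apply()`) are simplified, hypothetical stand-ins for the real structures in "setup.c"; they assume a configure step that touches only the format and an apply step that copies the format into the repository.

```c
#include <assert.h>

/* Simplified stand-ins; the integer values are arbitrary identifiers. */
struct repo_format {
	int hash_algo;
	int ref_format;
};

struct repo {
	int hash_algo;
	int ref_format;
};

/* Configure step: adjusts only the format, never the repository. */
static void format_configure(struct repo_format *fmt, int hash, int ref)
{
	assert(fmt);
	fmt->hash_algo = hash;
	fmt->ref_format = ref;
}

/* Apply step: the single place where settings reach the repository. */
static void format_apply(struct repo *r, const struct repo_format *fmt)
{
	assert(r && fmt);
	r->hash_algo = fmt->hash_algo;
	r->ref_format = fmt->ref_format;
}
```

Calling `format_configure()` before `format_apply()` propagates the final values in one pass; in the opposite order the repository keeps the stale values, which is why the old flow had to set them in both places.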
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have two callsites of `check_and_apply_repository_format()`. In a
subsequent commit we'll want to adapt one of those callsites to change
the order in which we read and apply the repository format, at which
point the helper function will not really be a good fit for us anymore.
Inline the function to both of the callsites.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ps/setup-centralize-odb-creation:
setup: construct object database in `apply_repository_format()`
repository: stop reading loose object map twice on repo init
setup: stop initializing object database without repository
setup: stop creating the object database in `setup_git_env()`
repository: stop initializing the object database in `repo_set_gitdir()`
setup: deduplicate logic to apply repository format
setup: drop `setup_git_env()`
t0001: plug test gaps for git-init(1) with GIT_OBJECT_DIRECTORY
Add a "Preserving Quotation Marks" section to prevent AI-assisted
translation and review from incorrectly converting language-specific
UTF-8 curly quotes (e.g., „ U+201E, “ U+201C for Bulgarian) into
ASCII straight quotes " (U+0022), which would cause PO string
truncation and syntax errors.
Also update the "Special characters" item in the Quality checklist
to reference the new section.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>