Commit Graph

404 Commits (d2779beb36ff64eb062103db14006f7ae6da5f37)

Author SHA1 Message Date
Junio C Hamano 9a85fa8406 Merge branch 'ps/remote-rename-fix'
"git remote rename origin upstream" failed to move origin/HEAD to
upstream/HEAD when origin/HEAD is unborn and performed other
renames extremely inefficiently, which has been corrected.

* ps/remote-rename-fix:
  builtin/remote: only iterate through refs that are to be renamed
  builtin/remote: rework how remote refs get renamed
  builtin/remote: determine whether refs need renaming early on
  builtin/remote: fix sign comparison warnings
  refs: simplify logic when migrating reflog entries
  refs: pass refname when invoking reflog entry callback
2025-08-21 13:46:58 -07:00
Junio C Hamano c3c8b6910a Merge branch 'ps/reflog-migrate-fixes'
"git refs migrate" to migrate the reflog entries from a refs
backend to another had a handful of bugs squashed.

* ps/reflog-migrate-fixes:
  refs: fix invalid old object IDs when migrating reflogs
  refs: stop unsetting REF_HAVE_OLD for log-only updates
  refs/files: detect race when generating reflog entry for HEAD
  refs: fix identity for migrated reflogs
  ident: fix type of string length parameter
  builtin/reflog: implement subcommand to write new entries
  refs: export `ref_transaction_update_reflog()`
  builtin/reflog: improve grouping of subcommands
  Documentation/git-reflog: convert to use synopsis type
2025-08-21 13:46:57 -07:00
Patrick Steinhardt 16c4fa26b9 builtin/remote: only iterate through refs that are to be renamed
When renaming a remote we also need to rename all references
accordingly. But while we only need to rename references that are
contained in the "refs/remotes/$OLDNAME/" namespace, we end up using
`refs_for_each_rawref()` that iterates through _all_ references. We know
to exit early in the callback in case we see an irrelevant reference,
but ultimately this is still a waste of compute as we knowingly iterate
through references that we won't ever care about.

Improve this by using `refs_for_each_rawref_in()`, which knows to only
iterate through (potentially broken) references in a given prefix.

The following benchmark renames a remote with a single reference in a
repository that has 100k unrelated references. This shows a sizeable
improvement with the "files" backend:

    Benchmark 1: rename remote (refformat = files, revision = HEAD~)
      Time (mean ± σ):      42.6 ms ±   0.9 ms    [User: 29.1 ms, System: 8.4 ms]
      Range (min … max):    40.1 ms …  43.3 ms    10 runs

    Benchmark 2: rename remote (refformat = files, revision = HEAD)
      Time (mean ± σ):      31.7 ms ±   4.0 ms    [User: 19.6 ms, System: 6.9 ms]
      Range (min … max):    27.1 ms …  36.0 ms    10 runs

    Summary
      rename remote (refformat = files, revision = HEAD) ran
        1.35 ± 0.17 times faster than rename remote (refformat = files, revision = HEAD~)

The "reftable" backend shows roughly the same absolute improvement, but
given that it's already significantly faster than the "files" backend
this translates to a much larger relative improvement:

    Benchmark 1: rename remote (refformat = reftable, revision = HEAD~)
      Time (mean ± σ):      18.2 ms ±   0.5 ms    [User: 12.7 ms, System: 3.0 ms]
      Range (min … max):    17.3 ms …  21.4 ms    110 runs

    Benchmark 2: rename remote (refformat = reftable, revision = HEAD)
      Time (mean ± σ):       8.8 ms ±   0.5 ms    [User: 3.8 ms, System: 2.9 ms]
      Range (min … max):     7.5 ms …   9.9 ms    167 runs

    Summary
      rename remote (refformat = reftable, revision = HEAD) ran
        2.07 ± 0.12 times faster than rename remote (refformat = reftable, revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06 14:19:30 -07:00
Patrick Steinhardt b9fd73a234 refs: pass refname when invoking reflog entry callback
With `refs_for_each_reflog_ent()` callers can iterate through all the
reflog entries for a given reference. The callback that is being invoked
for each such entry does not receive the name of the reference that we
are currently iterating through. This isn't really a limiting factor, as
callers can simply pass the name via the callback data.

But this layout sometimes does make for a bit of an awkward calling
pattern. One example: when iterating through all reflogs, and for each
reflog we iterate through all refnames, we have to do some extra book
keeping to track which reference name we are currently yielding reflog
entries for.

Change the signature of the callback function so that the reference name
of the reflog gets passed through to it. Adapt callers accordingly and
start using the new parameter in trivial cases. The next commit will
refactor the reference migration logic to make use of this parameter so
that we can simplify its logic a bit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06 14:19:30 -07:00
Junio C Hamano cf03815537 Merge branch 'ps/reflog-migrate-fixes' into ps/remote-rename-fix
* ps/reflog-migrate-fixes:
  refs: fix invalid old object IDs when migrating reflogs
  refs: stop unsetting REF_HAVE_OLD for log-only updates
  refs/files: detect race when generating reflog entry for HEAD
  refs: fix identity for migrated reflogs
  ident: fix type of string length parameter
  builtin/reflog: implement subcommand to write new entries
  refs: export `ref_transaction_update_reflog()`
  builtin/reflog: improve grouping of subcommands
  Documentation/git-reflog: convert to use synopsis type
2025-08-06 14:18:43 -07:00
Patrick Steinhardt 465eff81de refs: fix invalid old object IDs when migrating reflogs
When migrating reflog entries between different storage formats we end
up with invalid old object IDs for the migrated entries: instead of
writing the old object ID of the to-be-migrated entry, we end up with
the all-zeroes object ID.

The root cause of this issue is that we don't know to use the old object
ID provided by the caller. Instead, we manually resolve the old object
ID by resolving the current value of its matching reference. But as that
reference does not yet exist in the target ref storage we always end up
resolving it to all-zeroes.

This issue got unnoticed as there is no user-facing command that would
even show the old object ID. While `git log -g` knows to show the new
object ID, we don't have any formatting directive to show the old object
ID.

Fix the bug by introducing a new flag `REF_LOG_USE_PROVIDED_OIDS`. If
set, backends are instructed to use the old and new object IDs provided
by the caller, without doing any manual resolving. Set this flag in
`ref_transaction_update_reflog()`.

Amend our tests in t1460-refs-migrate to use our test tool to read
reflog entries. This test tool prints out both old and new object ID of
each reflog entry, which fixes the test gap. Furthermore it also prints
the full identity used to write the reflog, which provides test coverage
for the previous commit in this patch series that fixed the identity for
migrated reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06 07:36:31 -07:00
Patrick Steinhardt 1ffd2d4159 refs: export `ref_transaction_update_reflog()`
In a subsequent commit we'll add another user that wants to write reflog
entries. This requires them to call `ref_transaction_update_reflog()`,
but that function is local to "refs.c".

Export the function to prepare for the change. While at it, drop the
`flags` field, as all callers are for now expected to use the same flags
anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06 07:36:29 -07:00
Junio C Hamano 8d9f536a51 Merge branch 'kn/for-each-ref-skip'
"git for-each-ref" learns "--start-after" option to help
applications that want to page its output.

* kn/for-each-ref-skip:
  ref-cache: set prefix_state when seeking
  for-each-ref: introduce a '--start-after' option
  ref-filter: remove unnecessary else clause
  refs: selectively set prefix in the seek functions
  ref-cache: remove unused function 'find_ref_entry()'
  refs: expose `ref_iterator` via 'refs.h'
2025-08-03 18:44:26 -07:00
Junio C Hamano 7b625c2a35 Merge branch 'ph/fetch-prune-optim'
"git fetch --prune" used to be O(n^2) expensive when there are many
refs, which has been corrected.

* ph/fetch-prune-optim:
  clean up interface for refs_warn_dangling_symrefs
  refs: remove old refs_warn_dangling_symref
  fetch-prune: optimize dangling-ref reporting
2025-07-16 09:42:27 -07:00
Karthik Nayak 2b4648b919 refs: selectively set prefix in the seek functions
The ref iterator exposes a `ref_iterator_seek()` function. The name
suggests that this would seek the iterator to a specific reference in
some ways similar to how `fseek()` works for the filesystem.

However, the function actually sets the prefix for refs iteration. So
further iteration would only yield references which match the particular
prefix. This is a bit confusing.

Let's add a 'flags' field to the function, which when set with the
'REF_ITERATOR_SEEK_SET_PREFIX' flag, will set the prefix for the
iteration in-line with the existing behavior. Otherwise, the reference
backends will simply seek to the specified reference and clears any
previously set prefix. This allows users to start iteration from a
specific reference.

In the packed and reftable backend, since references are available in a
sorted list, the changes are simply setting the prefix if needed. The
changes on the files-backend are a little more involved, since the files
backend uses the 'ref-cache' mechanism. We move out the existing logic
within `cache_ref_iterator_seek()` to `cache_ref_iterator_set_prefix()`
which is called when the 'REF_ITERATOR_SEEK_SET_PREFIX' flag is set. We
then parse the provided seek string and set the required levels and
their indexes to ensure that seeking is possible.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-15 11:54:20 -07:00
Karthik Nayak 6bde5d43b7 refs: expose `ref_iterator` via 'refs.h'
The `ref_iterator` is an internal structure to the 'refs/'
sub-directory, which allows iteration over refs. All reference iteration
is built on top of these iterators.

External clients of the 'refs' subsystem use the various
'refs_for_each...()' functions to iterate over refs. However since these
are wrapper functions, each combination of functionality requires a new
wrapper function. This is not feasible as the functions pile up with the
increase in requirements. Expose the internal reference iterator, so
advanced users can mix and match options as needed.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-15 11:54:19 -07:00
Phil Hord 87d8d8c5d0 clean up interface for refs_warn_dangling_symrefs
The refs_warn_dangling_symrefs interface is a bit fragile as it passes
in printf-formatting strings with expectations about the number of
arguments. This patch series made it worse by adding a 2nd positional
argument. But there are only two call sites, and they both use almost
identical display options.

Make this safer by moving the format strings into the function that uses
them to make it easier to see when the arguments don't match. Pass a
prefix string and a dry_run flag so the decision logic can be handled
where needed.

Signed-off-by: Phil Hord <phil.hord@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 18:42:25 -07:00
Phil Hord 0f84695499 refs: remove old refs_warn_dangling_symref
The dangling warning function that takes a single ref to search for
is no longer used.  Remove it.

Signed-off-by: Phil Hord <phil.hord@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 18:42:25 -07:00
Karthik Nayak b3de3832ce refs: add function to translate errors to strings
The commit 76e760b999 (refs: introduce enum-based transaction error
types, 2025-04-08) introduced enum-based transaction error types. The
refs transaction logic was also modified to propagate these errors. For
clients of the ref transaction system, it would be beneficial to provide
human readable messages for these errors.

There is already an existing mapping in 'builtin/update-ref.c', move it
to 'refs.c' as `ref_transaction_error_msg()` and use the same within the
'builtin/update-ref.c'.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19 11:06:31 -07:00
Junio C Hamano 47478802da Merge branch 'kn/non-transactional-batch-updates'
Updating multiple references have only been possible in all-or-none
fashion with transactions, but it can be more efficient to batch
multiple updates even when some of them are allowed to fail in a
best-effort manner.  A new "best effort batches of updates" mode
has been introduced.

* kn/non-transactional-batch-updates:
  update-ref: add --batch-updates flag for stdin mode
  refs: support rejection in batch updates during F/D checks
  refs: implement batch reference update support
  refs: introduce enum-based transaction error types
  refs/reftable: extract code from the transaction preparation
  refs/files: remove duplicate duplicates check
  refs: move duplicate refname update check to generic layer
  refs/files: remove redundant check in split_symref_update()
2025-04-16 13:54:19 -07:00
Karthik Nayak 31726bb90d refs: support rejection in batch updates during F/D checks
The `refs_verify_refnames_available()` is used to batch check refnames
for F/D conflicts. While this is the more performant alternative than
its individual version, it does not provide rejection capabilities on a
single update level. For batched updates, this would mean a rejection of
the entire transaction whenever one reference has a F/D conflict.

Modify the function to call `ref_transaction_maybe_set_rejected()` to
check if a single update can be rejected. Since this function is only
internally used within 'refs/' and we want to pass in a `struct
ref_transaction *` as a variable. We also move and mark
`refs_verify_refnames_available()` to 'refs-internal.h' to be an
internal function.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08 07:57:21 -07:00
Karthik Nayak 23fc8e4f61 refs: implement batch reference update support
Git supports making reference updates with or without transactions.
Updates with transactions are generally better optimized. But
transactions are all or nothing. This means, if a user wants to batch
updates to take advantage of the optimizations without the hard
requirement that all updates must succeed, there is no way currently to
do so. Particularly with the reftable backend where batching multiple
reference updates is more efficient than performing them sequentially.

Introduce batched update support with a new flag,
'REF_TRANSACTION_ALLOW_FAILURE'. Batched updates while different from
transactions, use the transaction infrastructure under the hood. When
enabled, this flag allows individual reference updates that would
typically cause the entire transaction to fail due to non-system-related
errors to be marked as rejected while permitting other updates to
proceed. System errors referred by 'REF_TRANSACTION_ERROR_GENERIC'
continue to result in the entire transaction failing. This approach
enhances flexibility while preserving transactional integrity where
necessary.

The implementation introduces several key components:

  - Add 'rejection_err' field to struct `ref_update` to track failed
    updates with failure reason.

  - Add a new struct `ref_transaction_rejections` and a field within
    `ref_transaction` to this struct to allow quick iteration over
    rejected updates.

  - Modify reference backends (files, packed, reftable) to handle
    partial transactions by using `ref_transaction_set_rejected()`
    instead of failing the entire transaction when
    `REF_TRANSACTION_ALLOW_FAILURE` is set.

  - Add `ref_transaction_for_each_rejected_update()` to let callers
    examine which updates were rejected and why.

This foundational change enables batched update support throughout the
reference subsystem. A following commit will expose this capability to
users by adding a `--batch-updates` flag to 'git-update-ref(1)',
providing both a user-facing feature and a testable implementation.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08 07:57:20 -07:00
Karthik Nayak 76e760b999 refs: introduce enum-based transaction error types
Replace preprocessor-defined transaction errors with a strongly-typed
enum `ref_transaction_error`. This change:

  - Improves type safety and function signature clarity.
  - Makes error handling more explicit and discoverable.
  - Maintains existing error cases, while adding new error cases for
    common scenarios.

This refactoring paves the way for more comprehensive error handling
which we will utilize in the upcoming commits to add batch reference
update support.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08 07:57:20 -07:00
Junio C Hamano 8d6413a1be Merge branch 'ps/refname-avail-check-optim'
The code paths to check whether a refname X is available (by seeing
if another ref X/Y exists, etc.) have been optimized.

* ps/refname-avail-check-optim:
  refs: reuse iterators when determining refname availability
  refs/iterator: implement seeking for files iterators
  refs/iterator: implement seeking for packed-ref iterators
  refs/iterator: implement seeking for ref-cache iterators
  refs/iterator: implement seeking for reftable iterators
  refs/iterator: implement seeking for merged iterators
  refs/iterator: provide infrastructure to re-seek iterators
  refs/iterator: separate lifecycle from iteration
  refs: stop re-verifying common prefixes for availability
  refs/files: batch refname availability checks for initial transactions
  refs/files: batch refname availability checks for normal transactions
  refs/reftable: batch refname availability checks
  refs: introduce function to batch refname availability checks
  builtin/update-ref: skip ambiguity checks when parsing object IDs
  object-name: allow skipping ambiguity checks in `get_oid()` family
  object-name: introduce `repo_get_oid_with_flags()`
2025-03-29 16:39:07 +09:00
Patrick Steinhardt 2ff58dec49 refs: introduce function to batch refname availability checks
The `refs_verify_refname_available()` functions checks whether a
reference update can be committed or whether it would conflict with
either a prefix or suffix thereof. This function needs to be called once
per reference that one wants to check, which requires us to redo a
couple of checks every time the function is called.

Introduce a new function `refs_verify_refnames_available()` that does
the same, but for a list of references. For now, the new function uses
the exact same implementation, except that we loop through all refnames
provided by the caller. This will be tuned in subsequent commits.

The existing `refs_verify_refname_available()` function is reimplemented
on top of the new function. As such, the diff is best viewed with the
`--ignore-space-change option`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-12 11:31:17 -07:00
Junio C Hamano 62c58891e1 Merge branch 'tz/doc-txt-to-adoc-fixes'
Fallouts from recent renaming of documentation files from .txt
suffix to the new .adoc suffix have been corrected.

* tz/doc-txt-to-adoc-fixes: (38 commits)
  xdiff: *.txt -> *.adoc fixes
  unpack-trees.c: *.txt -> *.adoc fixes
  transport.h: *.txt -> *.adoc fixes
  trace2/tr2_sysenv.c: *.txt -> *.adoc fixes
  trace2.h: *.txt -> *.adoc fixes
  t6434: *.txt -> *.adoc fixes
  t6012: *.txt -> *.adoc fixes
  t/helper/test-rot13-filter.c: *.txt -> *.adoc fixes
  simple-ipc.h: *.txt -> *.adoc fixes
  setup.c: *.txt -> *.adoc fixes
  refs.h: *.txt -> *.adoc fixes
  pseudo-merge.h: *.txt -> *.adoc fixes
  parse-options.h: *.txt -> *.adoc fixes
  object-name.c: *.txt -> *.adoc fixes
  list-objects-filter-options.h: *.txt -> *.adoc fixes
  fsck.h: *.txt -> *.adoc fixes
  diffcore.h: *.txt -> *.adoc fixes
  diff.h: *.txt -> *.adoc fixes
  contrib/long-running-filter: *.txt -> *.adoc fixes
  config.c: *.txt -> *.adoc fixes
  ...
2025-03-06 14:06:31 -08:00
Todd Zullinger 72d385824a refs.h: *.txt -> *.adoc fixes
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:24 -08:00
Junio C Hamano 3c0f4abaf5 Merge branch 'kn/ref-migrate-skip-reflog'
"git refs migrate" can optionally be told not to migrate the reflog.

* kn/ref-migrate-skip-reflog:
  builtin/refs: add '--no-reflog' flag to drop reflogs
2025-02-27 15:23:00 -08:00
Karthik Nayak 89be7d2774 builtin/refs: add '--no-reflog' flag to drop reflogs
The "git refs migrate" subcommand converts the backend used for ref
storage. It always migrates reflog data as well as refs. Introduce an
option to exclude reflogs from migration, allowing them to be discarded
when they are unnecessary.

This is particularly useful in server-side repositories, where reflogs
are typically not expected. However, some repositories may still have
them due to historical reasons, such as bugs, misconfigurations, or
administrative decisions to enable reflogs for debugging. In such
repositories, it would be optimal to drop reflogs during the migration.

To address this, introduce the '--no-reflog' flag, which prevents reflog
migration. When this flag is used, reflogs from the original reference
backend are migrated. Since only the new reference backend remains in
the repository, all previous reflogs are permanently discarded.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-21 09:55:02 -08:00
Karthik Nayak af47976cc0 refs: mark `ref_transaction_update_reflog()` as static
The `ref_transaction_update_reflog()` function is only used within
'refs.c', so mark it as static.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 09:51:35 -08:00
Junio C Hamano 6f8ae955bd Merge branch 'kn/reflog-migration'
"git refs migrate" learned to also migrate the reflog data across
backends.

* kn/reflog-migration:
  refs: mark invalid refname message for translation
  refs: add support for migrating reflogs
  refs: allow multiple reflog entries for the same refname
  refs: introduce the `ref_transaction_update_reflog` function
  refs: add `committer_info` to `ref_transaction_add_update()`
  refs: extract out refname verification in transactions
  refs/files: add count field to ref_lock
  refs: add `index` field to `struct ref_udpate`
  refs: include committer info in `ref_update` struct
2024-12-23 09:32:29 -08:00
Junio C Hamano 5f212684ab Merge branch 'bf/set-head-symref'
When "git fetch $remote" notices that refs/remotes/$remote/HEAD is
missing and discovers what branch the other side points with its
HEAD, refs/remotes/$remote/HEAD is updated to point to it.

* bf/set-head-symref:
  fetch set_head: handle mirrored bare repositories
  fetch: set remote/HEAD if it does not exist
  refs: add create_only option to refs_update_symref_extended
  refs: add TRANSACTION_CREATE_EXISTS error
  remote set-head: better output for --auto
  remote set-head: refactor for readability
  refs: atomically record overwritten ref in update_symref
  refs: standardize output of refs_read_symbolic_ref
  t/t5505-remote: test failure of set-head
  t/t5505-remote: set default branch to main
2024-12-19 10:58:27 -08:00
Karthik Nayak 84675fa271 refs: introduce the `ref_transaction_update_reflog` function
Introduce a new function `ref_transaction_update_reflog`, for clients to
add a reflog update to a transaction. While the existing function
`ref_transaction_update` also allows clients to add a reflog entry, this
function does a few things more, It:
  - Enforces that only a reflog entry is added and does not update the
  ref itself.
  - Allows the users to also provide the committer information. This
  means clients can add reflog entries with custom committer
  information.

The `transaction_refname_valid()` function also modifies the error
message selectively based on the type of the update. This change also
affects reflog updates which go through `ref_transaction_update()`.

A follow up commit will utilize this function to add reflog support to
`git refs migrate`.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16 09:45:34 -08:00
Junio C Hamano ab738b2f1f Merge branch 'jc/forbid-head-as-tagname'
"git tag" has been taught to refuse to create refs/tags/HEAD
as such a tag will be confusing in the context of UI provided by
the Git Porcelain commands.

* jc/forbid-head-as-tagname:
  tag: "git tag" refuses to use HEAD as a tagname
  t5604: do not expect that HEAD can be a valid tagname
  refs: drop strbuf_ prefix from helpers
  refs: move ref name helpers around
2024-12-15 17:54:26 -08:00
Junio C Hamano 57e81b59f3 Merge branch 'sj/ref-contents-check'
"git fsck" learned to issue warnings on "curiously formatted" ref
contents that have always been taken valid but something Git
wouldn't have written itself (e.g., missing terminating end-of-line
after the full object name).

* sj/ref-contents-check:
  ref: add symlink ref content check for files backend
  ref: check whether the target of the symref is a ref
  ref: add basic symref content check for files backend
  ref: add more strict checks for regular refs
  ref: port git-fsck(1) regular refs check for files backend
  ref: support multiple worktrees check for refs
  ref: initialize ref name outside of check functions
  ref: check the full refname instead of basename
  ref: initialize "fsck_ref_report" with zero
2024-12-04 10:14:42 +09:00
Junio C Hamano 93e5e048f8 refs: drop strbuf_ prefix from helpers
The helper functions (strbuf_branchname, strbuf_check_branch_ref,
and strbuf_check_tag_ref) are about handling branch and tag names,
and it is a non-essential fact that these functions use strbuf to
hold these names.  Rename them to make it clarify that these are
more about "ref".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-03 12:38:49 +09:00
Junio C Hamano 5bcbde9e49 refs: move ref name helpers around
strbuf_branchname(), strbuf_check_{branch,tag}_ref() are helper
functions to deal with branch and tag names, and the fact that they
happen to use strbuf to hold the name of a branch or a tag is not
essential.  These functions fit better in the refs API than strbuf
API, the latter of which is about string manipulations.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-03 12:38:49 +09:00
Patrick Steinhardt 0f5762b043 refs: adapt `initial_transaction` flag to be unsigned
The `initial_transaction` flag is tracked as a signed integer, but we
typically pass around flags via unsigned integers. Adapt the type
accordingly.

Suggested-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26 08:39:38 +09:00
Bence Ferdinandy 9963746c84 refs: add create_only option to refs_update_symref_extended
Allow the caller to specify that it only wants to update the symref if
it does not already exist. Silently ignore the error from the
transaction API if the symref already exists.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:36 +09:00
Bence Ferdinandy ed2f6f8804 refs: add TRANSACTION_CREATE_EXISTS error
Currently there is only one special error for transaction, for when
there is a naming conflict, all other errors are dumped under a generic
error. Add a new special error case for when the caller requests the
reference to be updated only when it does not yet exist and the
reference actually does exist.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:36 +09:00
Bence Ferdinandy d842cd1301 refs: atomically record overwritten ref in update_symref
When updating a symref with update_symref it's currently not possible to
know for sure what was the previous value that was overwritten. Extend
refs_update_symref under a new function name, to record the value after
the ref has been locked if the caller of refs_update_symref_extended
requests it via a new variable in the function call. Make the return
value of the function notify the caller, if the previous value was
actually not a symbolic reference. Keep the original refs_update_symref
function with the same signature, but now as a wrapper around
refs_update_symref_extended.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:35 +09:00
Bence Ferdinandy 8102d10ff8 refs: standardize output of refs_read_symbolic_ref
When the symbolic reference we want to read with refs_read_symbolic_ref
is actually not a symbolic reference, the files and the reftable
backends return different values (1 and -1 respectively). Standardize
the returned values so that 0 is success, -1 is a generic error and -2
is that the reference was actually non-symbolic.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25 11:46:35 +09:00
shejialuo 7c78d819e6 ref: support multiple worktrees check for refs
We have already set up the infrastructure to check the consistency for
refs, but we do not support multiple worktrees. However, "git-fsck(1)"
will check the refs of worktrees. As we decide to get feature parity
with "git-fsck(1)", we need to set up support for multiple worktrees.

Because each worktree has its own specific refs, instead of just showing
the users "refs/worktree/foo", we need to display the full name such as
"worktrees/<id>/refs/worktree/foo". So we should know the id of the
worktree to get the full name. Add a new parameter "struct worktree *"
for "refs-internal.h::fsck_fn". Then change the related functions to
follow this new interface.

The "packed-refs" only exists in the main worktree, so we should only
check "packed-refs" in the main worktree. Use "is_main_worktree" method
to skip checking "packed-refs" in "packed_fsck" function.

Then, enhance the "files-backend.c::files_fsck_refs_dir" function to add
"worktree/<id>/" prefix when we are not in the main worktree.

Last, add a new test to check the refname when there are multiple
worktrees to exercise the code.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 08:21:32 +09:00
Patrick Steinhardt e4929cdf79 refs: skip collision checks in initial transactions
Reference transactions use `refs_verify_refname_available()` to check
for colliding references. This check consists of two parts:

  - Checks for whether multiple ref updates in the same transaction
    conflict with each other.

  - Checks for whether existing refs conflict with any refs part of the
    transaction.

While we generally cannot avoid the first check, the second check is
superfluous in cases where the transaction is an initial one in an
otherwise empty ref store. The check results in multiple ref reads as
well as the creation of a ref iterator for every ref we're checking,
which adds up quite fast when performing the check for many refs.

Introduce a new flag that allows us to skip this check and wire it up in
such that the backends pass it when running an initial transaction. This
leads to significant speedups when migrating ref storage backends. From
"files" to "reftable":

    Benchmark 1: migrate files:reftable (refcount = 100000, revision = HEAD~)
      Time (mean ± σ):     472.4 ms ±   6.7 ms    [User: 175.9 ms, System: 285.2 ms]
      Range (min … max):   463.5 ms … 483.2 ms    10 runs

    Benchmark 2: migrate files:reftable (refcount = 100000, revision = HEAD)
      Time (mean ± σ):      86.1 ms ±   1.9 ms    [User: 67.9 ms, System: 16.0 ms]
      Range (min … max):    82.9 ms …  90.9 ms    29 runs

    Summary
      migrate files:reftable (refcount = 100000, revision = HEAD) ran
        5.48 ± 0.15 times faster than migrate files:reftable (refcount = 100000, revision = HEAD~)

And from "reftable" to "files":

    Benchmark 1: migrate reftable:files (refcount = 100000, revision = HEAD~)
      Time (mean ± σ):     452.7 ms ±   3.4 ms    [User: 209.9 ms, System: 235.4 ms]
      Range (min … max):   445.9 ms … 457.5 ms    10 runs

    Benchmark 2: migrate reftable:files (refcount = 100000, revision = HEAD)
      Time (mean ± σ):      95.2 ms ±   2.2 ms    [User: 73.6 ms, System: 20.6 ms]
      Range (min … max):    91.7 ms … 100.8 ms    28 runs

    Summary
      migrate reftable:files (refcount = 100000, revision = HEAD) ran
        4.76 ± 0.11 times faster than migrate reftable:files (refcount = 100000, revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:16 +09:00
Patrick Steinhardt 1c299d03e5 refs: introduce "initial" transaction flag
There are two different ways to commit a transaction:

  - `ref_transaction_commit()` can be used to commit a regular
    transaction and is what almost every caller wants.

  - `initial_ref_transaction_commit()` can be used when it is known that
    the ref store that the transaction is committed for is empty and
    when there are no concurrent processes. This is used when cloning a
    new repository.

Implementing this via two separate functions has a couple of downsides.
First, every reference backend needs to implement a separate callback
even in the case where they don't special-case the initial transaction.
Second, backends are basically forced to reimplement the whole logic for
how to commit the transaction like the "files" backend does, even though
backends may wish to only tweak certain behaviour of a "normal" commit.
Third, it is awkward that callers must never prepare the transaction as
this is somewhat different than how a transaction typically works.

Refactor the code such that we instead mark initial transactions via a
separate flag when starting the transaction. This addresses all of the
mentioned painpoints, where the most important part is that it will
allow backends to have way more leeway in how exactly they want to
handle the initial transaction.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:15 +09:00
Patrick Steinhardt a0efef1446 refs: allow passing flags when setting up a transaction
Allow passing flags when setting up a transaction such that the
behaviour of the transaction itself can be altered. This functionality
will be used in a subsequent patch.

Adapt callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21 07:59:14 +09:00
Junio C Hamano 78ce6660bb Merge branch 'ak/typofix-2.46-maint'
Typofix.

* ak/typofix-2.46-maint:
  upload-pack: fix a typo
  sideband: fix a typo
  setup: fix a typo
  run-command: fix a typo
  revision: fix a typo
  refs: fix typos
  rebase: fix a typo
  read-cache-ll: fix a typo
  pretty: fix a typo
  object-file: fix a typo
  merge-ort: fix typos
  merge-ll: fix a typo
  http: fix a typo
  gpg-interface: fix a typo
  git-p4: fix typos
  git-instaweb: fix a typo
  fsmonitor-settings: fix a typo
  diffcore-rename: fix typos
  config.mak.dev: fix a typo
2024-09-25 10:37:12 -07:00
Junio C Hamano 52f57e94bd Merge branch 'ps/reftable-exclude'
The reftable backend learned to more efficiently handle exclude
patterns while enumerating the refs.

* ps/reftable-exclude:
  refs/reftable: wire up support for exclude patterns
  reftable/reader: make table iterator reseekable
  t/unit-tests: introduce reftable library
  Makefile: stop listing test library objects twice
  builtin/receive-pack: fix exclude patterns when announcing refs
  refs: properly apply exclude patterns to namespaced refs
2024-09-25 10:37:11 -07:00
Andrew Kreimer 619cbc01a3 refs: fix typos
Fix typos in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-19 13:46:12 -07:00
Patrick Steinhardt 155dc8447d refs: properly apply exclude patterns to namespaced refs
Reference namespaces allow commands like git-upload-pack(1) to serve
different sets of references to the client depending on which namespace
is enabled, which is for example useful in fork networks. Namespaced
refs are stored with a `refs/namespaces/$namespace` prefix, but all the
user will ultimately see is a stripped version where that prefix is
removed.

The way that this interacts with "transfer.hideRefs" is not immediately
obvious: the hidden refs can either apply to the stripped references, or
to the non-stripped ones that still have the namespace prefix. In fact,
the "transfer.hideRefs" machinery does the former and applies to the
stripped reference by default, but rules can have "^" prefixed to switch
this behaviour to instead match against the full reference name.

Namespaces are exclusively handled at the generic "refs" layer, the
respective backends have no clue that such a thing even exists. This
also has the consequence that they cannot handle hiding references as
soon as reference namespaces come into play because they neither know
whether a namespace is active, nor do they know how to strip references
if they are active.

Handling such exclude patterns in `refs_for_each_namespaced_ref()` and
`refs_for_each_fullref_in_prefixes()` is broken though, as both support
that the user passes both namespaces and exclude patterns. In the case
where both are set we will exclude references with unstripped names,
even though we really wanted to exclude references based on their
stripped names.

This only surfaces when:

  - A repository uses reference namespaces.

  - "transfer.hideRefs" is active.

  - The namespaced references are packed into the "packed-refs" file.

None of our tests exercise this scenario, and thus we haven't ever hit
it. While t5509 exercises both (1) and (2), it does not happen to hit
(3). It is trivial to demonstrate the bug though by explicitly packing
refs in the tests, and then we indeed surface the breakage.

Fix this bug by prefixing exclude patterns with the namespace in the
generic layer. The newly introduced function will be used outside of
"refs.c" in the next patch, so we add a declaration to "refs.h".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 13:57:18 -07:00
Patrick Steinhardt 9a20b889e8 refs: stop modifying global `log_all_ref_updates` variable
In refs-related code we modify the global `log_all_ref_updates`
variable, which is done because `should_autocreate_reflog()` does not
accept passing an `enum log_refs_config` but instead accesses the global
variable. Adapt its interface such that the value is provided by the
caller, which allows us to compute the proper value locally without
having to modify global state.

This change requires us to move the enum to "repo-settings.h", or
otherwise we get compilation errors due to include cycles. We're about
to fully move this setting into the repo-settings subsystem anyway, so
this is fine.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:43 -07:00
Junio C Hamano b3d175409d Merge branch 'sj/ref-fsck'
"git fsck" infrastructure has been taught to also check the sanity
of the ref database, in addition to the object database.

* sj/ref-fsck:
  fsck: add ref name check for files backend
  files-backend: add unified interface for refs scanning
  builtin/refs: add verify subcommand
  refs: set up ref consistency check infrastructure
  fsck: add refs report function
  fsck: add a unified interface for reporting fsck messages
  fsck: make "fsck_error" callback generic
  fsck: rename objects-related fsck error functions
  fsck: rename "skiplist" to "skip_oids"
2024-08-16 12:51:51 -07:00
Junio C Hamano e7f86cb69d Merge branch 'jc/refs-symref-referent'
The refs API has been taught to give symref target information to
the users of ref iterators, allowing for-each-ref and friends to
avoid an extra ref_resolve_* API call per a symbolic ref.

* jc/refs-symref-referent:
  ref-filter: populate symref from iterator
  refs: add referent to each_ref_fn
  refs: keep track of unresolved reference value in iterators
2024-08-15 13:22:15 -07:00
John Cai e8207717f1 refs: add referent to each_ref_fn
Add a parameter to each_ref_fn so that callers to the ref APIs
that use this function as a callback can have acess to the
unresolved value of a symbolic ref.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 08:47:34 -07:00
shejialuo ab6f79d8df refs: set up ref consistency check infrastructure
The "struct ref_store" is the base class which contains the "be" pointer
which provides backend-specific functions whose interfaces are defined
in the "ref_storage_be". We could reuse this polymorphism to define only
one interface. For every backend, we need to provide its own function
pointer.

The interfaces defined in the `ref_storage_be` are carefully structured
in semantic. It's organized as the five parts:

1. The name and the initialization interfaces.
2. The ref transaction interfaces.
3. The ref internal interfaces (pack, rename and copy).
4. The ref filesystem interfaces.
5. The reflog related interfaces.

To keep consistent with the git-fsck(1), add a new interface named
"fsck_refs_fn" to the end of "ref_storage_be". This semantic cannot be
grouped into any above five categories. Explicitly add blank line to
make it different from others.

Last, implement placeholder functions for each ref backends.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:53 -07:00