Commit Graph

78509 Commits (seen)

Author SHA1 Message Date
Junio C Hamano 4a3422b161 Merge branch 'jt/de-global-bulk-checkin' into jt/odb-transaction
* jt/de-global-bulk-checkin:
  bulk-checkin: use repository variable from transaction
  bulk-checkin: require transaction for index_blob_bulk_checkin()
  bulk-checkin: remove global transaction state
  bulk-checkin: introduce object database transaction structure
2025-09-09 14:46:00 -07:00
Kristoffer Haugsbakk ebeca259cb BreakingChanges: remove claim about whatchanged reports
This was written in e836757e14 (whatschanged: list it in
BreakingChanges document, 2025-05-12) which was on the same
topic that added the `--i-still-use-this` requirement.[1]

Maybe it was a work-in-progress comment/status.

[1]: jc/you-still-use-whatchanged

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 14:37:09 -07:00
Kristoffer Haugsbakk 4a9b8f9a12 whatchanged: remove not-even-shorter clause
The closest equivalent is `git log --raw --no-merges`.

Also change to “defaults” (implicit plural).

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 14:37:09 -07:00
Kristoffer Haugsbakk a9388b1cd1 whatchanged: tell users the git-log(1) equivalent
There have been quite a few `--i-still-use-this` user reports since Git
2.51.0 was released.[1][2]  And it doesn’t seem like they are reading
the man page about the git-log(1) equivalent.

Tell them what options to plug into git-log(1).  That template produces
almost the same output[3] and is arguably a plug-in replacement.
Concretely, add an optional `hint` argument so that we can use it right
after the initial error line.

Also mention the same concrete options in the documentation while we’re
at it.

[1]: E.g.,
    • https://lore.kernel.org/git/e1a69dea-bcb6-45fc-83d3-9e50d32c410b@5y5.one/https://lore.kernel.org/git/1011073f-9930-4360-a42f-71eb7421fe3f@chrispalmer.uk/#thttps://lore.kernel.org/git/9fcbfcc4-79f9-421f-b9a4-dc455f7db485@acm.org/#thttps://lore.kernel.org/git/83241BDE-1E0D-489A-9181-C608E9FCC17B@gmail.com/
[2]: The error message on 2.51.0 does tell them to report it, unconditionally
[3]: You only get different outputs if you happen to have empty
     commits (no changes)[4]
[4]: https://lore.kernel.org/git/20250825085428.GA367101@coredump.intra.peff.net/

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 14:37:09 -07:00
Kristoffer Haugsbakk a24d671fae you-still-use-that??: help the user help themselves
Give the user a list of suggestions for what to do when they run a
deprecated command.

The first order of action will be to check the breaking changes
document;[1] this short error message says nothing about why this
command is deprecated, and in any case going into any kind of detail
might overwhelm the user.

Then they can find out if this has been discussed on the mailing list.
Then users who e.g. are using git-whatchanged(1) can learn that this is
arguably a plug-in replacement:

    git log <opts> --raw --no-merges

Finally they are invited to send an email to the mailing list.

Also drop the “please add” part in favor of just using the “refusing”
die-message; these two would have been right after each other in this
new version.

Also drop “Thanks” since it now would require a new paragraph.

[1]: www.git-scm.com has a disclaimer for these internal documents that
    says that “This information is specific to the Git project”.  That’s
    misleading in this particular case.  But users are unlikely to get
    discouraged from reading about why they (or their programs) cannot run a
    command any more; it clearly concerns them.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 14:37:08 -07:00
Kristoffer Haugsbakk 8376cd12a8 t0014: test shadowing of aliases for a sample of builtins
The previous commit added a test for shadowing deprecated builtins.
Let’s make the test suite more complete by exercising a sample of
the builtins and in turn test the documentation for git-config(1):

    To avoid confusion and troubles with script usage, aliases that hide
    existing Git commands are ignored except for deprecated commands.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 14:37:08 -07:00
Kristoffer Haugsbakk 5db2eb522a git: allow alias-shadowing deprecated builtins
git-whatchanged(1) is deprecated and you need to pass
`--i-still-use-this` in order to force it to work as before.
There are two affected users, or usages:

1. people who use the command in scripts; and
2. people who are used to using it interactively.

For (1) the replacement is straightforward.[1]  But people in (2) might
like the name or be really used to typing it.[3]

An obvious first thought is to suggest aliasing `whatchanged` to the
git-log(1) equivalent.[1]  But this doesn’t work and is awkward since you
cannot shadow builtins via aliases.

Now you are left in an uncomfortable limbo; your alias won’t work until
the command is removed for good.

Let’s lift this limitation by allowing *deprecated* builtins to be
shadowed by aliases.

The only observed demand for aliasing has been for git-whatchanged(1),
not for git-pack-redundant(1).  But let’s be consistent and treat all
deprecated commands the same.

[1]:

        git log --raw --no-merges

     With a minor caveat: you get different outputs if you happen to
     have empty commits (no changes)[2]
[2]: https://lore.kernel.org/git/20250825085428.GA367101@coredump.intra.peff.net/
[3]: https://lore.kernel.org/git/BL3P221MB0449288C8B0FA448A227FD48833AA@BL3P221MB0449.NAMP221.PROD.OUTLOOK.COM/

Based-on-patch-by: Jeff King <peff@peff.net>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 14:37:08 -07:00
Kristoffer Haugsbakk 0e0b3f8e3d git: add `deprecated` category to --list-cmds
With 145 builtin commands (according to `git --list-cmds=builtins`),
users are probably not keeping on top of which ones (if any) are
deprecated.

Let’s expand the experimental `--list-cmds`[1] to allow users and
programs to query for this information.  We will also use this in an
upcoming commit to implement `is_deprecated_command`.

[1]: Using something which is experimental to query for deprecations is
    perhaps not the most ideal approach, but it is simple to implement
    and better than having to scan the documentation

Acked-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 14:37:07 -07:00
Patrick Steinhardt 32361b7bcd packfile: refactor `get_packed_git_mru()` to work on packfile store
The `get_packed_git_mru()` function prepares the packfile store and then
returns its packfiles in most-recently-used order. Refactor it to accept
a packfile store instead of a repository to clarify its scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:57 -07:00
Patrick Steinhardt b368baad84 packfile: refactor `get_all_packs()` to work on packfile store
The `get_all_packs()` function prepares the packfile store and then
returns its packfiles. Refactor it to accept a packfile store instead of
a repository to clarify its scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:57 -07:00
Patrick Steinhardt 387b9369e0 packfile: remove `get_packed_git()`
We have two different functions to retrieve packfiles for a packfile
store:

  - `get_packed_git()` returns the list of packfiles after having called
    `prepare_packed_git()`.

  - `get_all_packs()` calls `prepare_packed_git()`, as well, but also
    calls `prepare_midx_pack()` for each pack.

Based on the naming alone one might think that `get_all_packs()` would
return more packs than `get_packed_git()`. But that's not the case: both
functions end up returning the exact same list of packfiles. The real
difference between those functions is that `get_all_packs()` also loads
the info of whether or not a packfile is part of a multi-pack index.

Preparing this extra information also shouldn't be significantly more
expensive:

  - We have already loaded all packfiles via `prepare_packed_git_one()`.
    So given that multi-pack indices may only refer to packfiles in the
    same object directory we know that we already loaded each packfile.

  - The multi-pack index was prepared via `packfile_store_prepare()`
    already, which calls `prepare_multi_pack_index_one()`.

  - So all that remains to be done is to look up the index of the pack
    in its multi-pack index so that we can store that info in both the
    pack itself and the MIDX.

So it is somewhat confusing to readers that one of these two functions
claims to load "all" packfiles while the other one doesn't, even though
the ultimate difference is way more nuanced.

Convert all of these sites to use `get_all_packs()` instead and remove
`get_packed_git()`. There doesn't seem to be a good reason to discern
these two functions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:57 -07:00
Patrick Steinhardt 426ee6c1a0 packfile: move `get_multi_pack_index()` into "midx.c"
The `get_multi_pack_index()` function is declared and implemented in the
packfile subsystem, even though it really belongs into the multi-pack
index subsystem. The reason for this is likely that it needs to call
`packfile_store_prepare()`, which is not exposed by the packfile system.
In a subsequent commit we're about to add another caller outside of the
packfile system though, so we'll have to expose the function anyway.

Do so now already and move `get_multi_pack_index()` into the MIDX
subsystem.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:57 -07:00
Patrick Steinhardt b53f2f0f33 packfile: introduce function to load and add packfiles
We have a recurring pattern where we essentially perform an upsert of a
packfile in case it isn't yet known by the packfile store. The logic to
do so is non-trivial as we have to reconstruct the packfile's key, check
the map of packfiles, then create the new packfile and finally add it to
the store.

Introduce a new function that does this dance for us. Refactor callsites
to use it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:56 -07:00
Patrick Steinhardt 58857734eb packfile: refactor `install_packed_git()` to work on packfile store
The `install_packed_git()` functions adds a packfile to a specific
object store. Refactor it to accept a packfile store instead of a
repository to clarify its scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:56 -07:00
Patrick Steinhardt 3ea8ee17ef packfile: split up responsibilities of `reprepare_packed_git()`
In `reprepare_packed_git()` we perform a couple of operations:

  - We reload alternate object directories.

  - We clear the loose object cache.

  - We reprepare packfiles.

While the logic is hosted in "packfile.c", it clearly reaches into other
subsystems that aren't related to packfiles.

Split up the responsibility and introduce `odb_reprepare()` which now
becomes responsible for repreparing the whole object database. The
existing `reprepare_packed_git()` function is refactored accordingly and
only cares about reloading the packfile store now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:56 -07:00
Patrick Steinhardt d7f5c0f0c2 packfile: refactor `prepare_packed_git()` to work on packfile store
The `prepare_packed_git()` function and its friends are responsible for
loading packfiles as well as the multi-pack index for a given object
database. Refactor these functions to accept a packfile store instead of
a repository to clarify their scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:56 -07:00
Patrick Steinhardt e074290cb0 packfile: reorder functions to avoid function declaration
Reorder functions so that we can avoid a forward declaration of
`prepare_packed_git()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:56 -07:00
Patrick Steinhardt a2505c3da9 odb: move kept cache into `struct packfile_store`
The object database tracks a cache of "kept" packfiles, which is used by
git-pack-objects(1) to handle cruft objects. With the introduction of
the `struct packfile_store` we have a better place to host this cache
though.

Move the cache accordingly.

This moves the last bit of packfile-related state from the object
database into the packfile store. Adapt the comment for the `packfiles`
pointer in `struct object_database` to reflect this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:55 -07:00
Patrick Steinhardt e7c9f152ed odb: move MRU list of packfiles into `struct packfile_store`
The object database tracks the list of packfiles in most-recently-used
order, which is mostly used to favor reading from packfiles that contain
most of the objects that we're currently accessing. With the
introduction of the `struct packfile_store` we have a better place to
host this list though.

Move the list accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:55 -07:00
Patrick Steinhardt 7ca74cfd2b odb: move packfile map into `struct packfile_store`
The object database tracks a map of packfiles by their respective paths,
which is used to figure out whether a given packfile has already been
loaded. With the introduction of the `struct packfile_store` we have a
better place to host this list though.

Move the map accordingly.

`pack_map_entry_cmp()` isn't used anywhere but in "packfile.c" anymore
after this change, so we convert it to a static function, as well. Note
that we also drop the `inline` hint: the function is used as a callback
function exclusively, and callbacks cannot be inlined.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:55 -07:00
Patrick Steinhardt 770c819596 odb: move initialization bit into `struct packfile_store`
The object database knows to skip re-initializing the list of packfiles
in case it's already been initialized. Whether or not that is the case
is tracked via a separate `initialized` bit that is stored in the object
database. With the introduction of the `struct packfile_store` we have a
better place to host this bit though.

Move it accordingly. While at it, convert the field into a boolean now
that we're allowed to use them in our code base.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:55 -07:00
Patrick Steinhardt bac8d3817d odb: move list of packfiles into `struct packfile_store`
The object database tracks the list of packfiles it currently knows
about. With the introduction of the `struct packfile_store` we have a
better place to host this list though.

Move the list accordingly. Extract the logic from `odb_clear()` that
knows to close all such packfiles and move it into the new subsystem, as
well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:55 -07:00
Patrick Steinhardt cc8c9c54e1 packfile: introduce a new `struct packfile_store`
Information about a object database's packfiles is currently distributed
across two different structures:

  - `struct packed_git` contains the `next` pointer as well as the
    `mru_head`, both of which serve to store the list of packfiles.

  - `struct object_database` contains several fields that relate to the
    packfiles.

So we don't really have a central data structure that tracks our
packfiles, and consequently responsibilities aren't always clear cut.
A consequence for the upcoming pluggable object databases is that this
makes it very hard to move management of packfiles from the object
database level down into the object database source.

Introduce a new `struct packfile_store` which is about to become the
single source of truth for managing packfiles. Right now this data
structure doesn't yet contain anything, but in subsequent patches we
will move all data structures that relate to packfiles and that are
currently contained in `struct object_database` into this new home.

Note that this is only a first step: most importantly, we won't (yet)
move the `struct packed_git::next` pointer around. This will happen in a
subsequent patch series though so that `struct packed_git` will really
only host information about the specific packfile it represents.

Further note that the new structure still sits at the wrong level at the
end of this patch series: as mentioned, it should eventually sit at the
level of the object database source, not at the object database level.
But introducing the packfile store now already makes it way easier to
eventually push down the now-selfcontained data structure by one level.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-09 13:44:54 -07:00
Adrian Ratiu e55ab8e896 t7425: add gitdir encoding tests
Add some tests to further exercise the gitdir encoding functionality
alongside the existing mixed directory and nested gitdir tests.

Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 17:00:52 -07:00
Adrian Ratiu 610dc3cf5b t7450: move nested gitdir tests to t7425
Now that we are encoding gitdir paths, these tests are not handling
pathological cases anymore, because nested git dirs shouldn't cause
conflicts, so move them from t7450-bad-git-dotfiles.sh to a more
appropriate location where we test mixed gitdir path & encoding use.

Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 17:00:52 -07:00
Adrian Ratiu cda79ee498 submodule: remove validate_submodule_git_dir()
The validate_submodule_git_dir test is not very useful anymore, after
submodule names are encoded to resolve gitdir path conflicts.

In other words, the purpouse of gitdir path encoding is precisely to
avoid such conflicts as this function tries to also prevent.

The first test from the function can be kept though, because it just
verifies invariants which should always be true and raise a BUG if:

  - no "/" separator is between dirs/names.
  - len(full_gitdir) < len(name).
  - name does not match the gitdir path suffix.

Thus we move the invariant checks to submodule_name_to_gitdir() and
clean up the rest of validate_submodule_git_dir() and its uses.

Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 17:00:51 -07:00
Adrian Ratiu 21345e0d17 submodule: error out if gitdir name is too long
Encoding submodule names increases their name size, so there is an
increased risk to hit the max filename length in the gitdir path.
(the likelihood is still rather small, so it's an acceptable risk)

This gitdir file-name-too-long corner case can be be addressed in
multiple ways, including sharding or trimming, however for now, just
add the portable logic (suggested by Peff) to detect the corner case
then error out to avoid comitting to a specific policy (or policies).

In the future, instead of throwing an error (which we do now anyway
without submodule encoding), we could maybe let the user specify via
configs how to address this case, eg pick trimming or sharding.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 17:00:51 -07:00
Adrian Ratiu 0e8e2fc86e submodule: encode gitdir paths to avoid conflicts
Based on previous work by Brandon & all [1].

This encodes submodule gitdir names to avoid colisions like nested gitdirs
due to names like "foo" and "foo/bar".

A custom encoding can become unnecesarily complex, while url-encoding is
relatively well-known, however it needs some extending to support case
insensitive filesystems and quirks like Windows reserving "COM1" names.
Hence why I opted to encode A as _a, B as _b and so on.

Unfortunately encoding A -> _a (...) is not enough to fix the reserved
Windows file names (eg COM1) because workdirs still use the name COM1
even though gitdirs paths encoded, so future work will be needed to
fully address that case (or just use a different name).

This affected tests are fixed and a TODO is added to cleanup a hack /
short-circuit in validate_submodule_git_dir().

A further commit will add more tests to exercise these codepaths.

Link: https://lore.kernel.org/git/20180807230637.247200-1-bmwill@google.com/ [1]
Based-on-patch-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 17:00:51 -07:00
Adrian Ratiu 66dd36ae03 strbuf: bring back is_rfc3986_unreserved
is_rfc3986_unreserved() was moved to credential-store.c and was made
static by f89854362c (credential-store: move related functions to
credential-store file, 2023-06-06) under a correct assumption, at the
time, that it's the only place used.

However now we need it to apply url encoding to submodule names when
constructing gitdir paths, to avoid conflicts, so bring it back.

Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 17:00:51 -07:00
Adrian Ratiu 59119995c7 t7425: add basic mixed submodule gitdir path tests
Add some basic submodule tests for mixed gitdir path handling of
legacy (.git/modules) and new-style (.git/submodule) paths.

For now these just test the coexistence, creation and push/pull of
submodules using mixed paths.

More tests will be added later, especially for new-style encoding.

Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 17:00:51 -07:00
Adrian Ratiu 9cc52c5728 submodule: add gitdir path config override
This adds an ability to override gitdir paths via config files
(not .gitmodules), such that any encoding scheme can be changed
and JGit & co don't need to exactly match the default encoding.

A new test and a helper are added. The helper will be used by
further tests exercising gitdir paths & encodings.

Based-on-patch-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 17:00:50 -07:00
Adrian Ratiu 2ea4de8418 submodule: create new gitdirs under submodules path
This is in preparation for encoding the submodule names to avoid conflicts
like submodules named foo and foo/bar together with case-insensitive file-
system handling and other corner cases like reserved filenames on Windows.

Backward compatibility is kept with plain-name modules already existing at
paths like .git/modules/<name>, however a clear separation between legacy
(plain) and new (encoded) namespaces is desirable, to avoid situations like
an existing plain-name module containing the encoding escape character/

Thus we split the new-style (encoded) gitdir name paths to .git/submodules,
while legacy-style paths remain under .git/modules.

This is just a default directory change with the accompanying test updates,
in preparation for the actual encoding additions in future commits.

Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 17:00:50 -07:00
Adrian Ratiu 878016e008 submodule--helper: use submodule_name_to_gitdir in add_submodule
While testing submodule gitdir path encoding, I noticed submodule--helper
is still using a hardcoded name-based path leading to test failures, so
convert it to the common helper function introduced by commit ce125d431a
(submodule: extract path to submodule gitdir func, 2021-09-15)  and used
in other locations accross the source tree.

Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 17:00:50 -07:00
Karthik Nayak 34010a809f refs/files: handle D/F conflicts during locking
The previous commit, added the necessary validation and checks for F/D
conflicts in the files backend when working on case insensitive systems.

There is still a possibility for D/F conflicts. This is a different from
the F/D since for F/D conflicts, there would not be a conflict during
the lock creation phase:

    refs/heads/foo.lock
    refs/heads/foo/bar.lock

However there would be a conflict when the locks are committed, since we
cannot have 'refs/heads/foo/bar' and 'refs/heads/foo'. These kinds of
conflicts are checked and resolved in
`refs_verify_refnames_available()`, so the previous commit ensured that
for case-insensitive filesystems, we would lowercase the inputs to that
function.

For D/F conflicts, there is a conflict during the lock creation phase
itself:

    refs/heads/foo/bar.lock
    refs/heads/foo.lock

As in `lock_raw_ref()` after creating the lock, we also check for D/F
conflicts. To fix this, simply categorize the error as a name conflict.
Also remove this reference from the list of valid refnames for
availability checks.

By categorizing the error and removing it from the list of valid
references, batched updates now knows to reject such reference updates
and apply the other reference updates.

Fix a small typo in `ref_transaction_maybe_set_rejected()` while here.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 15:11:58 -07:00
Karthik Nayak 583e205a1b refs/files: handle F/D conflicts in case-insensitive FS
When using the files-backend on case-insensitive filesystems, there is
possibility of hitting F/D conflicts when creating references within a
single transaction, such as:

  - 'refs/heads/foo'
  - 'refs/heads/Foo/bar'

Ideally such conflicts are caught in `refs_verify_refnames_available()`
which is responsible for checking F/D conflicts within a given
transaction. This utility function is shared across the reference
backends. As such, it doesn't consider the issues of using a
case-insensitive file system, which only affects the files-backend.

While one solution would be to make the function aware of such issues,
this feels like leaking implementation details of file-backend specific
issues into the utility function. So opt for the more simpler option, of
lowercasing all references sent to this function when on a
case-insensitive filesystem and operating on the files-backend.

To do this, simply use a `struct strbuf` to convert the refname to a
lower case and append it to the list of refnames to be checked. Since we
use a `struct strbuf` and the memory is cleared right after, make sure
that the string list duplicates all provided string.

Without this change, the user would simply be left with a repository
with '.lock' files which were created in the 'prepare' phase of the
transaction, as the 'commit' phase would simply abort and not do the
necessary cleanup.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 15:11:58 -07:00
Karthik Nayak d795a2dda2 refs/files: use correct error type when lock exists
When fetching references into a repository, if a lock for a particular
reference exists, then `lock_raw_ref()` throws the generic error
'REF_TRANSACTION_ERROR_GENERIC'. This causes the entire set of batched
updates to fail.

Instead, return a 'REF_TRANSACTION_ERROR_CREATE_EXISTS' error. This
allows batched updates to reject the individual update which conflicts
with the existing file, while updating the rest of the references.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 15:11:58 -07:00
Karthik Nayak ebeb77b0af refs/files: catch conflicts on case-insensitive file-systems
During the 'prepare' phase of reference transaction in the files
backend, we create the lock files for references to be created. When
using batched updates on case-insensitive filesystems, the entire
batched updates would be aborted if there are conflicting names such as:

  refs/heads/Foo
  refs/heads/foo

This affects all commands which were migrated to use batched updates in
Git 2.51, including 'git-fetch(1)' and 'git-receive-pack(1)'. Before
that, reference updates would be applied serially with one transaction
used per update. When users fetched multiple references on
case-insensitive systems, subsequent references would simply overwrite
any earlier references. So when fetching:

  refs/heads/foo: 5f34ec0bfeac225b1c854340257a65b106f70ea6
  refs/heads/Foo: ec3053b0977e83d9b67fc32c4527a117953994f3
  refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56

The user would simply end up with:

  refs/heads/foo: ec3053b0977e83d9b67fc32c4527a117953994f3
  refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56

This is buggy behavior since the user is never informed about the
overrides performed and missing references. Nevertheless, the user is
left with a working repository with a subset of the references. Since
Git 2.51, in such situations fetches would simply fail without updating
any references. Which is also buggy behavior and worse off since the
user is left without any references.

The error is triggered in `lock_raw_ref()` where the files backend
attempts to create a lock file. When a lock file already exists the
function returns a 'REF_TRANSACTION_ERROR_GENERIC'. When this happens,
the entire batched updates, not individual operation, is aborted as if
it were in a transaction.

Change this to return 'REF_TRANSACTION_ERROR_CASE_CONFLICT' instead to
aid the batched update mechanism to simply reject such errors. The
change only affects batched updates since batched updates will reject
individual updates with non-generic errors. So specifically this would
only affect:

    1. git fetch
    2. git receive-pack
    3. git update-ref --batch-updates

This bubbles the error type up to `files_transaction_prepare()` which
tries to lock each reference update. So if the locking fails, we check
if the rejection type can be ignored, which is done by calling
`ref_transaction_maybe_set_rejected()`.

As the error type is now 'REF_TRANSACTION_ERROR_CASE_CONFLICT',
the specific reference update would simply be rejected, while other
updates in the transaction would continue to be applied. This allows
partial application of references in case-insensitive filesystems when
fetching colliding references.

While the earlier implementation allowed the last reference to be
applied overriding the initial references, this change would allow the
first reference to be applied while rejecting consequent collisions.
This should be an okay compromise since with the files backend, there is
no scenario possible where we would retain all colliding references.

Let's also be more pro-active and notify users on case-insensitive
filesystems about such problems by providing a brief about the issue
while also recommending using the reftable backend, which doesn't have
the same issue.

Reported-by: Joe Drew <joe.drew@indexexchange.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 15:11:57 -07:00
Junio C Hamano 4975ec3473 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 14:54:36 -07:00
Junio C Hamano 95a8428323 Merge branch 'tc/last-modified'
A new command "git last-modified" has been added to show the closest
ancestor commit that touched each path.

* tc/last-modified:
  last-modified: use Bloom filters when available
  t/perf: add last-modified perf script
  last-modified: new subcommand to show when files were last modified
2025-09-08 14:54:35 -07:00
Junio C Hamano 576e0b6eb3 Merge branch 'ds/ls-files-lazy-unsparse'
"git ls-files <pathspec>..." should not necessarily have to expand
the index fully if a sparsified directory is excluded by the
pathspec; the code is taught to expand the index on demand to avoid
this.

* ds/ls-files-lazy-unsparse:
  ls-files: conditionally leave index sparse
2025-09-08 14:54:35 -07:00
Junio C Hamano 4a7ebb9138 Merge branch 'ds/path-walk-repack-fix'
"git repack --path-walk" lost objects in some corner cases, which
has been corrected.

* ds/path-walk-repack-fix:
  path-walk: create initializer for path lists
  path-walk: fix setup of pending objects
2025-09-08 14:54:34 -07:00
Junio C Hamano 9e3d0bd1e1 Merge branch 'am/xdiff-hash-tweak'
Inspired by Ezekiel's recent effort to showcase Rust interface, the
hash function implementation used to hash lines have been updated
to the one used for ELF symbol lookup by Glibc.

* am/xdiff-hash-tweak:
  xdiff: optimize xdl_hash_record_verbatim
  xdiff: refactor xdl_hash_record()
2025-09-08 14:54:34 -07:00
Junio C Hamano 8d5e4290a7 Merge branch 'da/cargo-serialize'
Makefile tried to run multiple "cargo build" which would not work
very well; serialize their execution to work it around.

* da/cargo-serialize:
  Makefile: build libgit-rs and libgit-sys serially
2025-09-08 14:54:34 -07:00
Jeff King 1092cd6435 contrib/diff-highlight: mention interactive.diffFilter
When the README for diff-highlight was written, there was no way to
trigger it for the `add -p` interactive patch mode. We've since grown a
feature to support that, but it was documented only on the Git side.
Let's also let people coming the other direction, from diff-highlight,
know that it's an option.

Suggested-by: Isaac Oscar Gariano <IsaacOscar@live.com.au>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 14:00:33 -07:00
Jeff King 776d6fbd45 add-interactive: manually fall back color config to color.ui
Color options like color.interactive and color.diff should fall back to
the value of color.ui if they aren't set. In add-interactive, we check
the specific options (e.g., color.diff) via repo_config_get_value(),
which does not depend on the main command having loaded any color config
via the git_config() callback mechanism.

But then we call want_color() on the result; if our specific config is
unset then that function uses the value of git_use_color_default. That
variable is typically set from color.ui by the git_color_config()
callback, which is called by the main command in its own git_config()
callback function.

This works fine for "add -p", whose add_config() callback calls into
git_color_config(). But it doesn't work for other commands like
"checkout -p", which is otherwise unaware of color at all. People tend
not to notice because the default is "auto", and that's what they'd set
color.ui to as well. But something like:

  git -c color.ui=false checkout -p

should disable color, and it doesn't.

This regression goes back to 0527ccb1b5 (add -i: default to the built-in
implementation, 2021-11-30). In the perl version we got the color config
from "git config --get-colorbool", which did the full lookup for us.

The obvious fix is for git-checkout to add a call to git_color_config()
to its own config callback. But we'd have to do so for every command
with this problem, which is error-prone. Let's see if we can fix it more
centrally.

It is tempting to teach want_color() to look up the value of
repo_config_get_value("color.ui") itself. But I think that would have
disastrous consequences. Plumbing commands, especially older ones, avoid
porcelain config like "color.*" by simply not parsing it in their config
callbacks. Looking up the value of color.ui under the hood would
undermine that.

Instead, let's do that lookup in the add-interactive setup code. We're
already demand-loading other color config there, which is probably fine
(even in a plumbing command like "git reset", the interactive mode is
inherently porcelain-ish). That catches all commands that use the
interactive code, whether they were calling git_color_config()
themselves or not.

Reported-by: Isaac Oscar Gariano <isaacoscar@live.com.au>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 14:00:32 -07:00
Jeff King 8c78b5c8bc add-interactive: respect color.diff for diff coloring
The old perl git-add--interactive.perl script used the color.diff config
option to decide whether to color diffs (and if not set, it fell back to
the value of color.ui via git-config's --get-colorbool option). When we
switched to the builtin version, this was lost: we respect only
color.ui. So for example:

  git -c color.diff=false add -p

would color the diff, even when it should not.

The culprit is this line in add-interactive.c's parse_diff():

  if (want_color_fd(1, -1))

That "-1" means "no config has been set", which causes it to fall back
to the color.ui setting. We should instead be passing the value of
color.diff. But the problem is that we never even parse that config
option!

Instead the builtin interactive code parses only the value of
color.interactive, which is used for prompts and other messages. One
could perhaps argue that this should cover interactive diff coloring,
too, but historically it did not. The perl script treated
color.interactive and color.diff separately. So we should grab the
values for both, keeping separate fields in our add_i_state variable,
rather than a single use_color field.

We also load individual color slots (e.g., color.interactive.prompt),
leaving them as the empty string when color is disabled. This happens
via the init_color() helper in add-interactive, which checks that
use_color field. Now that there are two such fields, we need to pass the
appropriate one for each color.

The colors are mostly easy to divide up; color.interactive.* follows
color.interactive, and color.diff.* follows color.diff. But the "reset"
color is tricky. It is used for both types of coloring, but the two can
be configured independently. So we introduce two separate reset colors,
and use each in the appropriate spot.

There are two new tests. The first enables interactive prompt colors but
disables color.diff. We should see a colored prompt but not a colored
diff, showing that we are now respecting color.diff (and not
color.interactive or color.ui).

The second does the opposite. We disable color.interactive but turn on
color.diff with a custom fragment color. When we split a hunk, the
interactive code has to re-color the hunk header, which lets us check
that we correctly loaded the color.diff.frag config based on color.diff,
not color.interactive.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 14:00:32 -07:00
Jeff King 89b4183efe stash: pass --no-color to diff plumbing child processes
After a partial stash, we may clear out the working tree by capturing
the output of diff-tree and piping it into git-apply (and likewise we
may use diff-index to restore the index). So we most definitely do not
want color diff output from that diff-tree process.  And it normally
would not produce any, since its stdout is not going to a tty, and the
default value of color.ui is "auto".

However, if GIT_PAGER_IN_USE is set in the environment, that overrides
the tty check, and we'll produce a colorized diff that chokes git-apply:

  $ echo y | GIT_PAGER_IN_USE=1 git stash -p
  [...]
  Saved working directory and index state WIP on main: 4f2e2bb foo
  error: No valid patches in input (allow with "--allow-empty")
  Cannot remove worktree changes

Setting this variable is a relatively silly thing to do, and not
something most users would run into. But we sometimes do it in our tests
to stimulate color. And it is a user-visible bug, so let's fix it rather
than work around it in the tests.

The root issue here is that diff-tree (and other diff plumbing) should
probably not ever produce color by default. It does so not by parsing
color.ui, but because of the baked-in "auto" default from 4c7f1819b3
(make color.ui default to 'auto', 2013-06-10). But changing that is
risky; we've had discussions back and forth on the topic over the years.
E.g.:

  https://lore.kernel.org/git/86D0A377-8AFD-460D-A90E-6327C6934DFC@gmail.com/.

So let's accept that as the status quo for now and protect ourselves by
passing --no-color to the child processes. This is the same thing we did
for add-interactive itself in 1c6ffb546b (add--interactive.perl: specify
--no-color explicitly, 2020-09-07).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 14:00:32 -07:00
Christian Couder 68a746e9a8 promisor-remote: use string_list_split() in mark_remotes_as_accepted()
Previous commits replaced some strbuf_split*() calls with calls to
string_list_split*() in "promisor-remote.c".

For consistency, let's also replace the strbuf_split_str() call in
mark_remotes_as_accepted() with a call to string_list_split(), as we
don't need the splitted strings to be managed by a `struct strbuf`.
Using the lighter-weight `string_list` API is enough for our needs.

While at it let's remove a useless call to `strbuf_strip_suffix()`.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 10:30:56 -07:00
Christian Couder c213820c51 promisor-remote: allow a client to check fields
A previous commit allowed a server to pass additional fields through
the "promisor-remote" protocol capability after the "name" and "url"
fields, specifically the "partialCloneFilter" and "token" fields.

Let's make it possible for a client to check if these fields match
what it expects before accepting a promisor remote.

We allow this by introducing a new "promisor.checkFields"
configuration variable. It should contain a comma or space separated
list of fields that will be checked.

By limiting the protocol to specific well-defined fields, we ensure
both server and client have a shared understanding of field
semantics and usage.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 10:30:55 -07:00
Christian Couder bcb08c8375 promisor-remote: use string_list_split() in filter_promisor_remote()
A previous commit introduced a new parse_one_advertised_remote()
function that takes a `const char *` argument. This function is called
from filter_promisor_remote() and parses all the fields for one remote.

This means that in filter_promisor_remote() we no longer need to split
the remote information that will be passed to
parse_one_advertised_remote() into an array of relatively heavy and
complex `struct strbuf`.

To use something lighter, let's then replace strbuf_split_str() with
string_list_split() in filter_promisor_remote() to parse the remote
information that is passed to parse_one_advertised_remote().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 10:30:55 -07:00