Commit Graph

25 Commits (next)

Author SHA1 Message Date
Junio C Hamano f544afcab3 Merge branch 'jt/de-global-bulk-checkin' into next
The bulk-checkin code used to depend on a file-scope static
singleton variable, which has been updated to pass an instance
throughout the callchain.

* jt/de-global-bulk-checkin:
  bulk-checkin: use repository variable from transaction
  bulk-checkin: require transaction for index_blob_bulk_checkin()
  bulk-checkin: remove global transaction state
  bulk-checkin: introduce object database transaction structure
2025-09-08 15:24:41 -07:00
Junio C Hamano 4b12427226 Merge branch 'ps/object-store-midx-dedup-info' into next
Further code clean-up for multi-pack-index code paths.

* ps/object-store-midx-dedup-info:
  midx: compute paths via their source
  midx: stop duplicating info redundant with its owning source
  midx: write multi-pack indices via their source
  midx: load multi-pack indices via their source
  midx: drop redundant `struct repository` parameter
  odb: simplify calling `link_alt_odb_entry()`
  odb: return newly created in-memory sources
  odb: consistently use "dir" to refer to alternate's directory
  odb: allow `odb_find_source()` to fail
  odb: store locality in object database sources
2025-09-03 14:54:37 -07:00
Justin Tobler b336144725 bulk-checkin: remove global transaction state
Object database transactions in the bulk-checkin subsystem rely on
global state to track transaction status. Stop relying on global state
and instead store the transaction in the `struct object_database`.
Functions that operate on transactions are updated to now wire
transaction state.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-25 09:48:13 -07:00
Patrick Steinhardt a59d44ff3f odb: return newly created in-memory sources
Callers have no trivial way to obtain the newly created object database
source when adding it to the in-memory list of alternates. While not yet
needed anywhere, a subsequent commit will want to obtain that pointer.

Refactor the function to return the source to make it easily accessible.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11 09:22:21 -07:00
Patrick Steinhardt 0d61933b8f odb: allow `odb_find_source()` to fail
When trying to locate a source for an unknown object directory we will
die right away. In subsequent patches we will add new callsites though
that want to handle this situation gracefully instead.

Refactor the function to return a `NULL` pointer if the source could not
be found and adapt the callsites to die instead. Introduce a new wrapper
`odb_find_source_or_die()` that continues to die in case the source
could not be found.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11 09:22:21 -07:00
Patrick Steinhardt 595bef7180 odb: store locality in object database sources
Object database sources are classified either as:

  - Local, which means that the source is the repository's primary
    source. This is typically ".git/objects".

  - Non-local, which is everything else. Most importantly this includes
    alternates and quarantine directories.

This locality is often computed ad-hoc by checking whether a given
object source is the first one. This works, but it is quite roundabout.

Refactor the code so that we store locality when creating the sources in
the first place. This makes it both more accessible and robust.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11 09:22:21 -07:00
Junio C Hamano 4ce0caa7cc Merge branch 'ps/object-file-wo-the-repository'
Reduce implicit assumption and dependence on the_repository in the
object-file subsystem.

* ps/object-file-wo-the-repository:
  object-file: get rid of `the_repository` in index-related functions
  object-file: get rid of `the_repository` in `force_object_loose()`
  object-file: get rid of `the_repository` in `read_loose_object()`
  object-file: get rid of `the_repository` in loose object iterators
  object-file: remove declaration for `for_each_file_in_obj_subdir()`
  object-file: inline `for_each_loose_file_in_objdir_buf()`
  object-file: get rid of `the_repository` when writing objects
  odb: introduce `odb_write_object()`
  loose: write loose objects map via their source
  object-file: get rid of `the_repository` in `finalize_object_file()`
  object-file: get rid of `the_repository` in `loose_object_info()`
  object-file: get rid of `the_repository` when freshening objects
  object-file: inline `check_and_freshen()` functions
  object-file: get rid of `the_repository` in `has_loose_object()`
  object-file: stop using `the_hash_algo`
  object-file: fix -Wsign-compare warnings
2025-08-05 11:53:55 -07:00
Junio C Hamano 0c1253014e Merge branch 'ps/object-file-wo-the-repository' into next
Reduce implicit assumption and dependence on the_repository in the
object-file subsystem.

* ps/object-file-wo-the-repository:
  object-file: get rid of `the_repository` in index-related functions
  object-file: get rid of `the_repository` in `force_object_loose()`
  object-file: get rid of `the_repository` in `read_loose_object()`
  object-file: get rid of `the_repository` in loose object iterators
  object-file: remove declaration for `for_each_file_in_obj_subdir()`
  object-file: inline `for_each_loose_file_in_objdir_buf()`
  object-file: get rid of `the_repository` when writing objects
  odb: introduce `odb_write_object()`
  loose: write loose objects map via their source
  object-file: get rid of `the_repository` in `finalize_object_file()`
  object-file: get rid of `the_repository` in `loose_object_info()`
  object-file: get rid of `the_repository` when freshening objects
  object-file: inline `check_and_freshen()` functions
  object-file: get rid of `the_repository` in `has_loose_object()`
  object-file: stop using `the_hash_algo`
  object-file: fix -Wsign-compare warnings
2025-08-01 11:31:29 -07:00
Patrick Steinhardt ab1c6e1d12 odb: introduce `odb_write_object()`
We do not have a backend-agnostic way to write objects into an object
database. While there is `write_object_file()`, this function is rather
specific to the loose object format.

Introduce `odb_write_object()` to plug this gap. For now, this function
is a simple wrapper around `write_object_file()` and doesn't even use
the passed-in object database yet. This will change in subsequent
commits, where `write_object_file()` is converted so that it works on
top of an `odb_source`. `odb_write_object()` will then become
responsible for deciding which source an object shall be written to.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16 22:16:15 -07:00
Patrick Steinhardt ec865d94d4 midx: remove now-unused linked list of multi-pack indices
In the preceding commits we have migrated all users of the linked list
of multi-pack indices to instead use those stored in the object database
sources. Remove those now-unused pointers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-15 12:07:30 -07:00
Patrick Steinhardt 4d8be89d97 midx: start tracking per object database source
Multi-pack indices are tracked via `struct multi_pack_index`. This data
structure is stored as a linked list inside `struct object_database`,
which is the global database that spans across all of the object
sources.

This layout causes two problems:

  - Object databases consist of multiple object sources (e.g. one source
    per alternate object directory), where each multi-pack index is
    specific to one of those sources. Regardless of that though, the
    MIDX is not tracked per source, but tracked globally for the whole
    object database. This creates a mismatch between the on-disk layout
    and how things are organized in the object database subsystems and
    makes some parts, like figuring out whether a source has an MIDX,
    quite awkward.

  - Multi-pack indices are an implementation detail of how efficient
    access for packfiles work. As such, they are neither relevant in the
    context of loose objects, nor in a potential future where we have
    pluggable backends.

Refactor `prepare_multi_pack_index_one()` so that it works on a specific
source, which allows us to easily store a pointer to the multi-pack
index inside of it. For now, this pointer exists next to the existing
linked list we have in the object database. Users will be adjusted in
subsequent patches to instead use the per-source pointers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-15 12:07:28 -07:00
Patrick Steinhardt 841a03b404 odb: rename `read_object_with_reference()`
Rename `read_object_with_reference()` to `odb_read_object_peeled()` to
match other functions related to the object database and our modern
coding guidelines. Furthermore though, the old name didn't really
describe very well what this function actually does, which is to walk
down any commit and tag objects until an object of the required type has
been found. This is generally referred to as "peeling", so the new name
should be way more descriptive.

No compatibility wrapper is introduced as the function is not used a lot
throughout our codebase.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:39 -07:00
Patrick Steinhardt 08218b8cd4 odb: rename `pretend_object_file()`
Rename `pretend_object_file()` to `odb_pretend_object()` to match other
functions related to the object database and our modern coding
guidelines.

No compatibility wrapper is introduced as the function is not used a lot
throughout our codebase.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:38 -07:00
Patrick Steinhardt fcf8e3e111 odb: rename `has_object()`
Rename `has_object()` to `odb_has_object()` to match other functions
related to the object database and our modern coding guidelines.

Introduce a compatibility wrapper so that any in-flight topics will
continue to compile.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:38 -07:00
Patrick Steinhardt d4ff88aee3 odb: rename `repo_read_object_file()`
Rename `repo_read_object_file()` to `odb_read_object()` to match other
functions related to the object database and our modern coding
guidelines.

Introduce a compatibility wrapper so that any in-flight topics will
continue to compile.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:38 -07:00
Patrick Steinhardt e989dd96b8 odb: rename `oid_object_info()`
Rename `oid_object_info()` to `odb_read_object_info()` as well as their
`_extended()` variant to match other functions related to the object
database and our modern coding guidelines.

Introduce compatibility wrappers so that any in-flight topics will
continue to compile.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:37 -07:00
Patrick Steinhardt fc28a8a856 odb: get rid of `the_repository` when handling submodule sources
The "--recursive" flag for git-grep(1) allows users to grep for a string
across submodule boundaries. To make this work we add each submodule's
object sources to our own object database so that the objects can be
accessed directly.

The infrastructure for this depends on a global string list of submodule
paths. The caller is expected to call `add_submodule_odb_by_path()` for
each source and the object database will then eventually register all
submodule sources via `do_oid_object_info_extended()` in case it isn't
able to look up a specific object.

This reliance on global state is of course suboptimal with regards to
our libification efforts.

Refactor the logic so that the list of submodule sources is instead
tracked in the object database itself. This allows us to lose the
condition of `r == the_repository` before registering submodule sources
as we only ever add submodule sources to `the_repository` anyway. As
such, behaviour before and after this refactoring should always be the
same.

Rename the functions accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:37 -07:00
Patrick Steinhardt 7eafd4472d odb: get rid of `the_repository` when handling the primary source
The functions `set_temporary_primary_odb()` and `restore_primary_odb()`
are responsible for managing a temporary primary source for the
database. Both of these functions implicitly rely on `the_repository`.

Refactor them to instead take an explicit object database parameter as
argument and adjust callers. Rename the functions accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:36 -07:00
Patrick Steinhardt 798c661ce3 odb: get rid of `the_repository` in `for_each()` functions
There are a couple of iterator-style functions that execute a callback
for each instance of a given set, all of which currently depend on
`the_repository`. Refactor them to instead take an object database as
parameter so that we can get rid of this dependency.

Rename the functions accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:36 -07:00
Patrick Steinhardt c44185f6c1 odb: get rid of `the_repository` when handling alternates
The functions to manage alternates all depend on `the_repository`.
Refactor them to accept an object database as a parameter and adjust all
callers. The functions are renamed accordingly.

Note that right now the situation is still somewhat weird because we end
up using the object store path provided by the object store's repository
anyway. Consequently, we could have instead passed in a pointer to the
repository instead of passing in the pointer to the object store. This
will be addressed in subsequent commits though, where we will start to
use the path owned by the object store itself.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:36 -07:00
Patrick Steinhardt 1b1679c688 odb: get rid of `the_repository` in `odb_mkstemp()`
Get rid of our dependency on `the_repository` in `odb_mkstemp()` by
passing in the object database as a parameter and adjusting all callers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:35 -07:00
Patrick Steinhardt 961038856b odb: get rid of `the_repository` in `assert_oid_type()`
Get rid of our dependency on `the_repository` in `assert_oid_type()` by
passing in the object database as a parameter and adjusting all callers.

Rename the function to `odb_assert_oid_type()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:35 -07:00
Patrick Steinhardt bd52ea343d odb: get rid of `the_repository` in `find_odb()`
Get rid of our dependency on `the_repository` in `find_odb()` by passing
in the object database in which we want to search for the source and
adjusting all callers.

Rename the function to `odb_find_source()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:35 -07:00
Patrick Steinhardt 2f5181fce6 odb: introduce parent pointers
In subsequent commits we'll get rid of our use of `the_repository` in
"odb.c" in favor of explicitly passing in a `struct object_database` or
a `struct odb_source`. In some cases though we'll need access to the
repository, for example to read a config value from it, but we don't
have a way to access the repository owning a specific object database.

Introduce parent pointers for `struct object_database` to its owning
repository as well as for `struct odb_source` to its owning object
database, which will allow us to adapt those use cases.

Note that this change requires us to pass through the object database to
`link_alt_odb_entry()` so that we can set up the parent pointers for any
source there. The callchain is adapted to pass through the object
database accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:34 -07:00
Patrick Steinhardt 8f49151763 object-store: rename files to "odb.{c,h}"
In the preceding commits we have renamed the structures contained in
"object-store.h" to `struct object_database` and `struct odb_backend`.
As such, the code files "object-store.{c,h}" are confusingly named now.
Rename them to "odb.{c,h}" accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:34 -07:00