kernel/git - git - PowerEL Git System

Commit Graph

Author	SHA1	Message	Date
Junio C Hamano	8c13c31404	Merge branch 'ps/packfile-store' Code clean-up around the in-core list of all the pack files and object database(s). * ps/packfile-store: packfile: refactor `get_packed_git_mru()` to work on packfile store packfile: refactor `get_all_packs()` to work on packfile store packfile: refactor `get_packed_git()` to work on packfile store packfile: move `get_multi_pack_index()` into "midx.c" packfile: introduce function to load and add packfiles packfile: refactor `install_packed_git()` to work on packfile store packfile: split up responsibilities of `reprepare_packed_git()` packfile: refactor `prepare_packed_git()` to work on packfile store packfile: reorder functions to avoid function declaration odb: move kept cache into `struct packfile_store` odb: move MRU list of packfiles into `struct packfile_store` odb: move packfile map into `struct packfile_store` odb: move initialization bit into `struct packfile_store` odb: move list of packfiles into `struct packfile_store` packfile: introduce a new `struct packfile_store`	2025-10-07 12:25:27 -07:00
Patrick Steinhardt	751808b2a1	packfile: refactor `get_packed_git()` to work on packfile store The `get_packed_git()` function prepares the packfile store and then returns its packfiles. Refactor it to accept a packfile store instead of a repository to clarify its scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-24 11:53:51 -07:00
Patrick Steinhardt	78237ea53d	packfile: split up responsibilities of `reprepare_packed_git()` In `reprepare_packed_git()` we perform a couple of operations: - We reload alternate object directories. - We clear the loose object cache. - We reprepare packfiles. While the logic is hosted in "packfile.c", it clearly reaches into other subsystems that aren't related to packfiles. Split up the responsibility and introduce `odb_reprepare()` which now becomes responsible for repreparing the whole object database. The existing `reprepare_packed_git()` function is refactored accordingly and only cares about reloading the packfile store now. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-24 11:53:50 -07:00
Junio C Hamano	7c15d990cc	Merge branch 'rs/get-oid-with-flags-cleanup' Code clean-up. * rs/get-oid-with-flags-cleanup: use repo_get_oid_with_flags()	2025-09-23 11:53:40 -07:00
Junio C Hamano	d29cbbbf36	Merge branch 'jc/longer-disambiguation-fix' "git rev-parse --short" and friends failed to disambiguate two objects with object names that share common prefix longer than 32 characters, which has been fixed. * jc/longer-disambiguation-fix: abbrev: allow extending beyond 32 chars to disambiguate	2025-09-18 10:07:01 -07:00
Junio C Hamano	5fe4f6304f	Merge branch 'rs/object-name-extend-abbrev-len-update' Code clean-up. * rs/object-name-extend-abbrev-len-update: object-name: declare pointer type of extend_abbrev_len()'s 2nd parameter	2025-09-15 08:52:07 -07:00
René Scharfe	a66fc22bf9	use repo_get_oid_with_flags() get_oid_with_context() allows specifying flags and reports object details via a passed-in struct object_context. Some callers just want to specify flags, but don't need any details back. Convert them to repo_get_oid_with_flags(), which provides just that and frees them from dealing with the context structure. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-10 14:29:49 -07:00
René Scharfe	069c15d256	object-name: declare pointer type of extend_abbrev_len()'s 2nd parameter Expose the expected type of the second parameter of extend_abbrev_len() instead of casting a void pointer internally. Just a single caller passes in a void pointer, the rest pass the correct type. Let the compiler help keeping it that way. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-04 13:25:46 -07:00
Junio C Hamano	9a85fa8406	Merge branch 'ps/remote-rename-fix' "git remote rename origin upstream" failed to move origin/HEAD to upstream/HEAD when origin/HEAD is unborn and performed other renames extremely inefficiently, which has been corrected. * ps/remote-rename-fix: builtin/remote: only iterate through refs that are to be renamed builtin/remote: rework how remote refs get renamed builtin/remote: determine whether refs need renaming early on builtin/remote: fix sign comparison warnings refs: simplify logic when migrating reflog entries refs: pass refname when invoking reflog entry callback	2025-08-21 13:46:58 -07:00
Junio C Hamano	8655908b9e	abbrev: allow extending beyond 32 chars to disambiguate When you have two or more objects with object names that share more than 32 letters in an SHA-1 repository, find_unique_abbrev() fails to show disambiguation. To see how many leading letters of a given full object name is sufficiently unambiguous, the algorithm starts from a initial length, guessed based on the estimated number of objects in the repository, and see if another object that shares the prefix, and keeps extending the abbreviation. The loop stops at GIT_MAX_RAWSZ, which is counted as the number of bytes, since `5b20ace6` (sha1_name: unroll len loop in find_unique_abbrev_r(), 2017-10-08); before that change, it extended up to GIT_SHA1_HEXSZ, which meant to stop at the end of hexadecimal SHA-1 object name. Because the hexadecimal object name passed to the function is NUL-terminated, and this fact is used to correctly terminate the loop that scans for the first difference earlier in the function, use it to make sure we do not increment the .cur_len member beyond the end of the string. Noticed-by: Jon Forrest <nobozo@gmail.com> Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-08-14 08:17:21 -07:00
Patrick Steinhardt	b9fd73a234	refs: pass refname when invoking reflog entry callback With `refs_for_each_reflog_ent()` callers can iterate through all the reflog entries for a given reference. The callback that is being invoked for each such entry does not receive the name of the reference that we are currently iterating through. This isn't really a limiting factor, as callers can simply pass the name via the callback data. But this layout sometimes does make for a bit of an awkward calling pattern. One example: when iterating through all reflogs, and for each reflog we iterate through all refnames, we have to do some extra book keeping to track which reference name we are currently yielding reflog entries for. Change the signature of the callback function so that the reference name of the reflog gets passed through to it. Adapt callers accordingly and start using the new parameter in trivial cases. The next commit will refactor the reference migration logic to make use of this parameter so that we can simplify its logic a bit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-08-06 14:19:30 -07:00
Junio C Hamano	733b640d50	Merge branch 'ps/object-store-midx' Redefine where the multi-pack-index sits in the object subsystem, which recently was restructured to allow multiple backends that support a single object source that belongs to one repository. A midx does span mulitple "object sources". * ps/object-store-midx: midx: remove now-unused linked list of multi-pack indices packfile: stop using linked MIDX list in `get_all_packs()` packfile: stop using linked MIDX list in `find_pack_entry()` packfile: refactor `get_multi_pack_index()` to work on sources midx: stop using linked list when closing MIDX packfile: refactor `prepare_packed_git_one()` to work on sources midx: start tracking per object database source	2025-08-03 18:44:27 -07:00
Junio C Hamano	0f6e5037d4	Merge branch 'rs/pop-recent-commit-with-prio-queue' The pop_most_recent_commit() function can have quite expensive worst case performance characteristics, which has been optimized by using prio-queue data structure. * rs/pop-recent-commit-with-prio-queue: commit: use prio_queue_replace() in pop_most_recent_commit() prio-queue: add prio_queue_replace() commit: convert pop_most_recent_commit() to prio_queue	2025-07-28 12:02:34 -07:00
René Scharfe	d6ec08788e	commit: convert pop_most_recent_commit() to prio_queue pop_most_recent_commit() calls commit_list_insert_by_date() for parent commits, which is itself called in a loop. This can lead to quadratic complexity if there are many merges. Replace the commit_list with a prio_queue to ensure logarithmic worst case complexity and convert all three users. Add a performance test that exercises one of them using a pathological history that consists of 50% merges and 50% root commits to demonstrate the speedup: Test v2.50.1 HEAD ---------------------------------------------------------------------- 1501.2: rev-parse ':/65535' 2.48(2.47+0.00) 0.20(0.19+0.00) -91.9% Alas, sane histories don't benefit from the conversion much, and traversing Git's own history takes a 1% performance hit on my machine: $ hyperfine -w3 -L git ./git_2.50.1,./git '{git} rev-parse :/^Initial.revision' Benchmark 1: ./git_2.50.1 rev-parse :/^Initial.revision Time (mean ± σ): 1.071 s ± 0.004 s [User: 1.052 s, System: 0.017 s] Range (min … max): 1.067 s … 1.078 s 10 runs Benchmark 2: ./git rev-parse :/^Initial.revision Time (mean ± σ): 1.079 s ± 0.003 s [User: 1.060 s, System: 0.017 s] Range (min … max): 1.074 s … 1.083 s 10 runs Summary ./git_2.50.1 rev-parse :/^Initial.revision ran 1.01 ± 0.00 times faster than ./git rev-parse :/^Initial.revision Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-07-22 07:28:23 -07:00
Junio C Hamano	51b50c55a9	Merge branch 'ps/object-store' Code clean-up around object access API. * ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`	2025-07-15 15:18:18 -07:00
Patrick Steinhardt	736bb725eb	packfile: refactor `get_multi_pack_index()` to work on sources The function `get_multi_pack_index()` loads multi-pack indices via `prepare_packed_git()` and then returns the linked list of multi-pack indices that is stored in `struct object_database`. That list is in the process of being removed though in favor of storing the MIDX as part of the object database source it belongs to. Refactor `get_multi_pack_index()` so that it returns the multi-pack index for a single object source. Callers are now expected to call this function for each source they are interested in. This requires them to iterate through alternates, so we have to prepare alternate object sources before doing so. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-07-15 12:07:29 -07:00
Junio C Hamano	db0583b3fd	Merge branch 'ps/object-store' into ps/object-store-midx * ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`	2025-07-09 08:29:08 -07:00
Patrick Steinhardt	e989dd96b8	odb: rename `oid_object_info()` Rename `oid_object_info()` to `odb_read_object_info()` as well as their `_extended()` variant to match other functions related to the object database and our modern coding guidelines. Introduce compatibility wrappers so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-07-01 14:46:37 -07:00
Patrick Steinhardt	c44185f6c1	odb: get rid of `the_repository` when handling alternates The functions to manage alternates all depend on `the_repository`. Refactor them to accept an object database as a parameter and adjust all callers. The functions are renamed accordingly. Note that right now the situation is still somewhat weird because we end up using the object store path provided by the object store's repository anyway. Consequently, we could have instead passed in a pointer to the repository instead of passing in the pointer to the object store. This will be addressed in subsequent commits though, where we will start to use the path owned by the object store itself. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-07-01 14:46:36 -07:00
Patrick Steinhardt	a1e2581a1e	object-store: rename `object_directory` to `odb_source` The `object_directory` structure is used as an access point for a single object directory like ".git/objects". While the structure isn't yet fully self-contained, the intent is for it to eventually contain all information required to access objects in one specific location. While the name "object directory" is a good fit for now, this will change over time as we continue with the agenda to make pluggable object databases a thing. Eventually, objects may not be accessed via any kind of directory at all anymore, but they could instead be backed by any kind of durable storage mechanism. While it seems quite far-fetched for now, it is thinkable that eventually this might even be some form of a database, for example. As such, the current name of this structure will become worse over time as we evolve into the direction of pluggable ODBs. Immediate next steps will start to carve out proper self-contained object directories, which requires us to pass in these object directories as parameters. Based on our modern naming schema this means that those functions should then be named after their subsystem, which means that we would start to bake the current name into the codebase more and more. Let's preempt this by renaming the structure. There have been a couple alternatives that were discussed: - `odb_backend` was discarded because it led to the association that one object database has a single backend, but the model is that one alternate has one backend. Furthermore, "backend" is more about the actual backing implementation and less about the high-level concept. - `odb_alternate` was discarded because it is a bit of a stretch to also call the main object directory an "alternate". Instead, pick `odb_source` as the new name. It makes it sufficiently clear that there can be multiple sources and does not cause confusion when mixed with the already-existing "alternate" terminology. In the future, this change allows us to easily introduce for example a `odb_files_source` and other format-specific implementations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-07-01 14:46:34 -07:00
Junio C Hamano	91f10d7ca2	Merge branch 'bc/stash-export-import' An interchange format for stash entries is defined, and subcommand of "git stash" to import/export has been added. * bc/stash-export-import: builtin/stash: provide a way to import stashes from a ref builtin/stash: provide a way to export stashes to a ref builtin/stash: factor out revision parsing into a function object-name: make get_oid quietly return an error	2025-06-30 14:30:31 -07:00
brian m. carlson	393bbb21c9	object-name: make get_oid quietly return an error A reasonable person looking at the signature and usage of get_oid and friends might conclude that in the event of an error, it always returns -1. However, this is not the case. Instead, get_oid_basic dies if we go too far back into the history of a reflog (or, when quiet, simply exits). This is not especially useful, since in many cases, we might want to handle this error differently. Let's add a flag here to make it just return -1 like elsewhere in these code paths. Note that we cannot make this behavior the default, since we have many other codepaths that rely on the existing behavior, including in tests. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-06-12 13:32:17 -07:00
Patrick Steinhardt	1a793261c5	object-store: move function declarations to their respective subsystems We carry declarations for a couple of functions in "object-store.h" that are not defined in "object-store.c", but in a different subsystem. Move these declarations to the respective headers whose matching code files carry the corresponding definition. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-29 10:08:12 -07:00
Patrick Steinhardt	68cd492a3e	object-store: merge "object-store-ll.h" and "object-store.h" The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-15 08:24:37 -07:00
Junio C Hamano	8d6413a1be	Merge branch 'ps/refname-avail-check-optim' The code paths to check whether a refname X is available (by seeing if another ref X/Y exists, etc.) have been optimized. * ps/refname-avail-check-optim: refs: reuse iterators when determining refname availability refs/iterator: implement seeking for files iterators refs/iterator: implement seeking for packed-ref iterators refs/iterator: implement seeking for ref-cache iterators refs/iterator: implement seeking for reftable iterators refs/iterator: implement seeking for merged iterators refs/iterator: provide infrastructure to re-seek iterators refs/iterator: separate lifecycle from iteration refs: stop re-verifying common prefixes for availability refs/files: batch refname availability checks for initial transactions refs/files: batch refname availability checks for normal transactions refs/reftable: batch refname availability checks refs: introduce function to batch refname availability checks builtin/update-ref: skip ambiguity checks when parsing object IDs object-name: allow skipping ambiguity checks in `get_oid()` family object-name: introduce `repo_get_oid_with_flags()`	2025-03-29 16:39:07 +09:00
Patrick Steinhardt	46a2b52240	object-name: allow skipping ambiguity checks in `get_oid()` family When reading an object ID via `get_oid_basic()` or any of its related functions we perform a check whether the object ID is ambiguous, which can be the case when a reference with the same name exists. While the check is generally helpful, there are cases where it only adds to the runtime overhead without providing much of a benefit. Add a new flag that allows us to disable the check. The flag will be used in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-03-12 11:31:16 -07:00
Patrick Steinhardt	37e7546b91	object-name: introduce `repo_get_oid_with_flags()` Introduce a new function `repo_get_oid_with_flags()`. This function behaves the same as `repo_get_oid()`, except that it takes an extra `flags` parameter that it ends up passing to `get_oid_with_context()`. This function will be used in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-03-12 11:31:16 -07:00
Todd Zullinger	550fac1d13	object-name.c: .txt -> .adoc fixes Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-03-03 13:49:24 -08:00
Elijah Newren	191f0c8db2	object-name: be more strict in parsing describe-like output From Documentation/revisions.txt: '<describeOutput>', e.g. 'v1.7.4.2-679-g3bee7fb':: Output from `git describe`; i.e. a closest tag, optionally followed by a dash and a number of commits, followed by a dash, a 'g', and an abbreviated object name. which means that output of the format ${REFNAME}-${INTEGER}-g${HASH} should parse to fully expanded ${HASH}. This is fine. However, we currently don't validate any of ${REFNAME}-${INTEGER}, we only parse -g${HASH} and assume the rest is valid. That is problematic, since it breaks things like git cat-file -p branchname:path/to/file/named/i-gaffed which, when commit (or tree or blob) affed exists, will not return us information about the file we are looking for but will instead erroneously tell us about object affed. A few additional notes: - This is a slight backward incompatibility break, because we used to allow ${GARBAGE}-g${HASH} as a way to spell ${HASH}. However, a backward incompatible break is necessary, because there is no other way for someone to be more specific and disambiguate that they want the blob master:path/to/who-gabbed instead of the object abbed. - There is a possibility that check_refname_format() rules change in the future. However, we can only realistically loosen the rules for what that function accepts rather than tighten. If we were to tighten the rules, some real world repositories may already have refnames that suddenly become unacceptable and we break those repositories. As such, any describe-like syntax of the form ${VALID_FOR_A_REFNAME}-${INTEGER}-g${HASH} that is valid with the changes in this commit will remain valid in the future. - The fact that check_refname_format() rules could loosen in the future is probably also an important reason to make this change. If the rules loosen, there might be additional cases within ${GARBAGE}-g${HASH} that become ambiguous in the future. While abbreviated hashes can be disambiguated by abbreviating less, it may well be that these alternative object names have no way of being disambiguated (much like pathnames cannot be). Accepting all random ${GARBAGE} thus makes it difficult for us to allow future extensions to object naming. So, tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are present in the string, and would be considered a valid ref and non-negative integer. Also, add a few tests for git describe using object names of the form ${REVISION_NAME}${MODIFIERS} since an early version of this patch failed on constructs like git describe v2.48.0-rc2-161-g6c2274cdbc^0 Reported-by: Gabriel Amaral <gabriel-amaral@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-13 11:48:43 -08:00
Elijah Newren	71e19a0031	object-name: fix resolution of object names containing curly braces Given a branch name of 'foo{bar', commands like git cat-file -p foo{bar:README.md should succeed (assuming that branch had a README.md file, of course). However, the change in `cce91a2cae` (Change 'master@noon' syntax to 'master@{noon}'., 2006-05-19) presumed that curly braces would always come after an '@' or '^' and be paired, causing e.g. 'foo{bar:README.md' to entirely miss the ':' and assume there's no object being referenced. In short, git would report: fatal: Not a valid object name foo{bar:README.md Change the parsing to only make the assumption of paired curly braces immediately after either a '@' or '^' character appears. Add tests for this, as well as for a few other test cases that initial versions of this patch broke: * 'foo@@{...}' * 'foo^{/${SEARCH_TEXT_WITH_COLON}}:${PATH}' Note that we'd prefer not duplicating the special logic for "@^" characters here, because if get_oid_basic() or interpret_nth_prior_checkout() or get_oid_basic() or similar gain extra methods of using curly braces, then the logic in get_oid_with_context_1() would need to be updated as well. But it's not clear how to refactor all of these to have a simple common callpoint with the specialized logic. Reported-by: Gabriel Amaral <gabriel-amaral@github.com> Helped-by: Michael Haggerty <mhagger@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-13 11:48:28 -08:00
Junio C Hamano	4156b6a741	Merge branch 'ps/build-sign-compare' Start working to make the codebase buildable with -Wsign-compare. * ps/build-sign-compare: t/helper: don't depend on implicit wraparound scalar: address -Wsign-compare warnings builtin/patch-id: fix type of `get_one_patchid()` builtin/blame: fix type of `length` variable when emitting object ID gpg-interface: address -Wsign-comparison warnings daemon: fix type of `max_connections` daemon: fix loops that have mismatching integer types global: trivial conversions to fix `-Wsign-compare` warnings pkt-line: fix -Wsign-compare warning on 32 bit platform csum-file: fix -Wsign-compare warning on 32-bit platform diff.h: fix index used to loop through unsigned integer config.mak.dev: drop `-Wno-sign-compare` global: mark code units that generate warnings with `-Wsign-compare` compat/win32: fix -Wsign-compare warning in "wWinMain()" compat/regex: explicitly ignore "-Wsign-compare" warnings git-compat-util: introduce macros to disable "-Wsign-compare" warnings	2024-12-23 09:32:11 -08:00
Junio C Hamano	4007617fda	Merge branch 'ps/commit-with-message-syntax-fix' The syntax ":/<text>" to name the latest commit with the matching text was broken with a recent change, which has been corrected. * ps/commit-with-message-syntax-fix: object-name: fix reversed ordering with ":/<text>" revisions	2024-12-15 17:54:30 -08:00
Junio C Hamano	e6663b9ac5	Merge branch 'bf/explicit-config-set-in-advice-messages' The advice messages now tell the newer 'git config set' command to set the advice.token configuration variable to squelch a message. * bf/explicit-config-set-in-advice-messages: advice: suggest using subcommand "git config set"	2024-12-15 17:54:28 -08:00
Patrick Steinhardt	0ff919e87a	object-name: fix reversed ordering with ":/<text>" revisions Recently it was reported [1] that "look for the youngest commit reachable from any ref with log message that match the given pattern" syntax (i.e. ':/<text>') started to return results in reverse recency order. This regression was introduced in Git v2.47.0 and is caused by a memory leak fix done in `57fb139b5e` (object-name: fix leaking commit list items, 2024-08-01). The intent of the identified commit is to stop modifying the commit list provided by the caller such that the caller can properly free all commit list items, including those that the called function might potentially remove from the list. This was done by creating a copy of the passed-in commit list and modifying this copy instead of the caller-provided list. We already knew to create such a copy beforehand with the `backup` list, which was used to clear the `ONELINE_SEEN` commit mark after we were done. So the refactoring simply renamed that list to `copy` and started to operate on that list instead. There is a gotcha though: the backup list, and thus now also the copied list, is always being prepended to, so the resulting list is in reverse order! The end result is that we pop commits from the wrong end of the commit list, returning commits in reverse recency order. Fix the bug by appending to the list instead. [1]: <CAKOEJdcPYn3O01p29rVa+xv=Qr504FQyKJeSB-Moze04ViCGGg@mail.gmail.com> Reported-by: Aarni Koskela <aarni@valohai.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-08 08:23:14 +09:00
Patrick Steinhardt	41f43b8243	global: mark code units that generate warnings with `-Wsign-compare` Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-06 20:20:02 +09:00
Bence Ferdinandy	6c397d0104	advice: suggest using subcommand "git config set" The advice message currently suggests using "git config advice..." to disable advice messages, but since `00bbdde141` (builtin/config: introduce "set" subcommand, 2024-05-06) we have the "set" subcommand for config. Since using the subcommand is more in-line with the modern interface, any advice should be promoting its usage. Change the disable advice message to use the subcommand instead. Change all uses of "git config advice" in the tests to use the subcommand. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-06 11:24:21 +09:00
Junio C Hamano	5bcbde9e49	refs: move ref name helpers around strbuf_branchname(), strbuf_check_{branch,tag}_ref() are helper functions to deal with branch and tag names, and the fact that they happen to use strbuf to hold the name of a branch or a tag is not essential. These functions fit better in the refs API than strbuf API, the latter of which is about string manipulations. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-03 12:38:49 +09:00
Patrick Steinhardt	11dbb4ace3	environment: stop storing "core.warnAmbiguousRefs" globally Same as the preceding commits, storing the "core.warnAmbiguousRefs" value globally is misdesigned as this setting may be set per repository. Move the logic into the repo-settings subsystem. The usual pattern here is that users are expected to call `prepare_repo_settings()` before they access the settings themselves. This seems somewhat fragile though, as it is easy to miss and leads to somewhat ugly code patterns at the call sites. Instead, introduce a new function that encapsulates this logic for us. This also allows us to change how exactly the lazy initialization works in the future, e.g. by only partially initializing values as requested by the caller. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-09-12 10:15:44 -07:00
Junio C Hamano	1b6b2bfae5	Merge branch 'ps/leakfixes-part-4' More leak fixes. * ps/leakfixes-part-4: (22 commits) builtin/diff: free symmetric diff members diff: free state populated via options builtin/log: fix leak when showing converted blob contents userdiff: fix leaking memory for configured diff drivers builtin/format-patch: fix various trivial memory leaks diff: fix leak when parsing invalid ignore regex option unpack-trees: clear index when not propagating it sequencer: release todo list on error paths merge-ort: unconditionally release attributes index builtin/fast-export: plug leaking tag names builtin/fast-export: fix leaking diff options builtin/fast-import: plug trivial memory leaks builtin/notes: fix leaking `struct notes_tree` when merging notes builtin/rebase: fix leaking `commit.gpgsign` value config: fix leaking comment character config submodule-config: fix leaking name entry when traversing submodules read-cache: fix leaking hashfile when writing index fails bulk-checkin: fix leaking state TODO object-name: fix leaking symlink paths in object context object-file: fix memory leak when reading corrupted headers ...	2024-08-23 09:02:33 -07:00
Junio C Hamano	b9497848df	Merge branch 'tb/incremental-midx-part-1' Incremental updates of multi-pack index files. * tb/incremental-midx-part-1: midx: implement support for writing incremental MIDX chains t/t5313-pack-bounds-checks.sh: prepare for sub-directories t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' midx: implement verification support for incremental MIDXs midx: support reading incremental MIDX chains midx: teach `midx_fanout_add_midx_fanout()` about incremental MIDXs midx: teach `midx_preferred_pack()` about incremental MIDXs midx: teach `midx_contains_pack()` about incremental MIDXs midx: remove unused `midx_locate_pack()` midx: teach `fill_midx_entry()` about incremental MIDXs midx: teach `nth_midxed_offset()` about incremental MIDXs midx: teach `bsearch_midx()` about incremental MIDXs midx: introduce `bsearch_one_midx()` midx: teach `nth_bitmapped_pack()` about incremental MIDXs midx: teach `nth_midxed_object_oid()` about incremental MIDXs midx: teach `prepare_midx_pack()` about incremental MIDXs midx: teach `nth_midxed_pack_int_id()` about incremental MIDXs midx: add new fields for incremental MIDX chains Documentation: describe incremental MIDX format	2024-08-19 11:07:37 -07:00
Junio C Hamano	e7f86cb69d	Merge branch 'jc/refs-symref-referent' The refs API has been taught to give symref target information to the users of ref iterators, allowing for-each-ref and friends to avoid an extra ref_resolve_* API call per a symbolic ref. * jc/refs-symref-referent: ref-filter: populate symref from iterator refs: add referent to each_ref_fn refs: keep track of unresolved reference value in iterators	2024-08-15 13:22:15 -07:00
Patrick Steinhardt	9ddd5f755d	object-name: fix leaking symlink paths in object context The object context may be populated with symlink contents when reading a symlink, but the associated strbuf doesn't ever get released when releasing the object context, causing a memory leak. Plug it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-14 10:07:57 -07:00
John Cai	e8207717f1	refs: add referent to each_ref_fn Add a parameter to each_ref_fn so that callers to the ref APIs that use this function as a callback can have acess to the unresolved value of a symbolic ref. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-09 08:47:34 -07:00
Taylor Blau	3f5f1cff92	midx: introduce `bsearch_one_midx()` The `bsearch_midx()` function will be extended in a following commit to search for the location of a given object ID across all MIDXs in a chain (or the single non-chain MIDX if no chain is available). While most callers will naturally want to use the updated `bsearch_midx()` function, there are a handful of special cases that will want finer control and will only want to search through a single MIDX. For instance, the object abbreviation code, which cares about object IDs near to where we'd expect to find a match in a MIDX. In that case, we want to look at the nearby matches in each layer of the MIDX chain, not just a single one). Split the more fine-grained control out into a separate function called `bsearch_one_midx()` which searches only a single MIDX. At present both `bsearch_midx()` and `bsearch_one_midx()` have identical behavior, but the following commit will rewrite the former to be aware of incremental MIDXs for the remaining non-special case callers. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:36 -07:00
Patrick Steinhardt	57fb139b5e	object-name: fix leaking commit list items When calling `get_oid_oneline()`, we pass in a `struct commit_list` that gets modified by the function. This creates a weird situation where the commit list may sometimes be empty after returning, but sometimes it will continue to carry additional commits. In those cases the remainder of the list leaks. Ultimately, the design where we only pass partial ownership to `get_oid_oneline()` feels shoddy. Refactor the code such that we only pass a constant pointer to the list, creating a local copy as needed. Callers are thus always responsible for freeing the commit list, which then allows us to plug a bunch of memory leaks. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-01 08:47:37 -07:00
Junio C Hamano	3997614c24	Merge branch 'ps/leakfixes-more' More memory leaks have been plugged. * ps/leakfixes-more: (29 commits) builtin/blame: fix leaking ignore revs files builtin/blame: fix leaking prefixed paths blame: fix leaking data for blame scoreboards line-range: plug leaking find functions merge: fix leaking merge bases builtin/merge: fix leaking `struct cmdnames` in `get_strategy()` sequencer: fix memory leaks in `make_script_with_merges()` builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()` apply: fix leaking string in `match_fragment()` sequencer: fix leaking string buffer in `commit_staged_changes()` commit: fix leaking parents when calling `commit_tree_extended()` config: fix leaking "core.notesref" variable rerere: fix various trivial leaks builtin/stash: fix leak in `show_stash()` revision: free diff options builtin/log: fix leaking commit list in git-cherry(1) merge-recursive: fix memory leak when finalizing merge builtin/merge-recursive: fix leaking object ID bases builtin/difftool: plug memory leaks in `run_dir_diff()` object-name: free leaking object contexts ...	2024-07-08 14:53:10 -07:00
Junio C Hamano	7b472da915	Merge branch 'ps/use-the-repository' A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help transition the codebase to rely less on the availability of the singleton the_repository instance. * ps/use-the-repository: hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE` t/helper: remove dependency on `the_repository` in "proc-receive" t/helper: fix segfault in "oid-array" command without repository t/helper: use correct object hash in partial-clone helper compat/fsmonitor: fix socket path in networked SHA256 repos replace-object: use hash algorithm from passed-in repository protocol-caps: use hash algorithm from passed-in repository oidset: pass hash algorithm when parsing file http-fetch: don't crash when parsing packfile without a repo hash-ll: merge with "hash.h" refs: avoid include cycle with "repository.h" global: introduce `USE_THE_REPOSITORY_VARIABLE` macro hash: require hash algorithm in `empty_tree_oid_hex()` hash: require hash algorithm in `is_empty_{blob,tree}_oid()` hash: make `is_null_oid()` independent of `the_repository` hash: convert `oidcmp()` and `oideq()` to compare whole hash global: ensure that object IDs are always padded hash: require hash algorithm in `oidread()` and `oidclr()` hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()` hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions	2024-07-02 09:59:00 -07:00
Patrick Steinhardt	e7da938570	global: introduce `USE_THE_REPOSITORY_VARIABLE` macro Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "biultin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:33 -07:00
Patrick Steinhardt	037df60013	object-name: don't try to abbreviate to lengths greater than hexsz When given a length that equals the current hash algorithm's hex size, then `repo_find_unique_abbrev_r()` exits early without trying to find an abbreviation. This is only sensible because there is nothing to abbreviate in the first place, so searching through objects to find a unique prefix would be a waste of compute. What we don't handle though is the case where the user passes a length greater than the hash length. This is fine in practice as we still compute the correct result. But at the very least, this is a waste of resources as we try to abbreviate a value that cannot be abbreviated, which causes us to hit the object database. Start to explicitly handle values larger than hexsz to avoid this performance penalty, which leads to a measureable speedup. The following benchmark has been executed in linux.git: Benchmark 1: git -c core.abbrev=9000 log --abbrev-commit (revision = HEAD~) Time (mean ± σ): 12.812 s ± 0.040 s [User: 12.225 s, System: 0.554 s] Range (min … max): 12.723 s … 12.857 s 10 runs Benchmark 2: git -c core.abbrev=9000 log --abbrev-commit (revision = HEAD) Time (mean ± σ): 11.095 s ± 0.029 s [User: 10.546 s, System: 0.521 s] Range (min … max): 11.037 s … 11.122 s 10 runs Summary git -c core.abbrev=9000 log --abbrev-commit HEAD (revision = HEAD) ran 1.15 ± 0.00 times faster than git -c core.abbrev=9000 log --abbrev-commit HEAD (revision = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-12 12:57:18 -07:00
Patrick Steinhardt	f87c55c264	object-name: free leaking object contexts While it is documented in `struct object_context::path` that this variable needs to be released by the caller, this fact is rather easy to miss given that we do not ever provide a function to release the object context. And of course, while some callers dutifully release the path, many others don't. Introduce a new `object_context_release()` function that releases the path. Convert callsites that used to free the path to use that new function and add missing calls to callsites that were leaking memory. Refactor those callsites as required to have a single return path, only. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-11 13:15:05 -07:00

1 2 3

101 Commits (c96add7552f9270e24dab90d005f7fad62dbb3e8)