kernel/git - git - PowerEL Git System

Commit Graph

Author	SHA1	Message	Date
Junio C Hamano	d61ff9c237	Merge branch 'ps/object-file-cleanup' into ps/object-store-cleanup * ps/object-file-cleanup: object-store: merge "object-store-ll.h" and "object-store.h" object-store: remove global array of cached objects object: split out functions relating to object store subsystem object-file: drop `index_blob_stream()` object-file: split up concerns of `HASH_*` flags object-file: split out functions relating to object store subsystem object-file: move `xmmap()` into "wrapper.c" object-file: move `git_open_cloexec()` to "compat/open.c" object-file: move `safe_create_leading_directories()` into "path.c" object-file: move `mkdir_in_gitdir()` into "path.c"	2025-04-24 11:37:21 -07:00
Josh Heinrichs	eb2d7beb0e	maintenance: fix launchctl calendar intervals When using the launchctl scheduler, the weekly job runs daily, and the daily job runs on the first six days of each month. This appears to be due to specifying "Day" in the calendar intervals, which according to launchd.plist(5) is for specifying days of the month rather than days of the week. The behaviour of running a job on the 0th day is undocumented, but in my testing appears to be the same as not specifying "Day" in the calendar interval, in which case the job will run daily. Use "Weekday" in the calendar intervals, which is the correct way to schedule jobs to run on specific days of the week. Signed-off-by: Josh Heinrichs <joshiheinrichs@gmail.com> Acked-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-23 12:58:52 -07:00
Patrick Steinhardt	785c17df78	parse-options: rename `OPT_MAGNITUDE()` to `OPT_UNSIGNED()` With the preceding commit, `OPT_INTEGER()` has learned to support unit factors. Consequently, the major difference between `OPT_INTEGER()` and `OPT_MAGNITUDE()` isn't the support of unit factors anymore, as both of them do support them now. Instead, the difference is that one handles signed and the other handles unsigned integers. Adapt the name of `OPT_MAGNITUDE()` accordingly by renaming it to `OPT_UNSIGNED()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-17 08:15:15 -07:00
Patrick Steinhardt	d012ceb5f3	global: use designated initializers for options While we expose macros for most of our different option types understood by the "parse-options" subsystem, not every combination of fields has one as that would otherwise quickly lead to an explosion of macros. Instead, we just initialize structures manually for those variants of fields that don't have a macro. Callsites that open-code these structure initializations don't use designated initializers though and instead just provide values for each of the fields that they want to initialize. This has three significant downsides: - Callsites need to specify all values up to the last field that they care about. This often includes fields that should simply be left at their default zero-initialized state, which adds distraction. - Any reader not deeply familiar with the layout of the structure has a hard time figuring out what the respective initializers mean. - Reordering or introducing new fields in the middle of the structure is impossible without adapting all callsites. Convert all sites to instead use designated initializers, which we have started using in our codebase quite a while ago. This allows us to skip any default-initialized fields, gives the reader context by specifying the field names and allows us to reorder or introduce new fields where we want to. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-17 08:15:15 -07:00
Ramsay Jones	c9a51775a3	builtin/gc.c: correct RAM calculation when using sysinfo The man page for sysinfo(2) on Linux states that (from v2.3.48) the sizes of the memory and swap fields, of the returned structure, are given as multiples of 'mem_unit' bytes. In earlier versions (prior to v2.3.23 on i386 in particular), the 'mem_unit' field was not part of the structure, and all sizes were measured in bytes. The man page does not discuss the motivation for this change, but it is possible that the change was intended for the, relatively rare, 32-bit platform with more than 4GB of memory. The total_ram() function makes the assumption that the 'totalram' field of the 'struct sysinfo' is measured in bytes, or alternatively that the 'mem_unit' field is always equal to one. Having written a program to call the sysinfo() function and print the structure fields, it seems that, on Linux x86_64 and i686 anyway, the 'mem_unit' field is indeed set to one (note that the 32-bit system had only 2GB ram). However, cygwin also has a sysinfo() implementation, which gives the following values: $ ./sysinfo uptime: 21381 loads: 0, 0, 0 total ram: 2074637 free ram: 843237 shared ram: 0 buffer ram: 0 total swap: 327680 free swap: 306932 procs: 15 total high: 0 free high: 0 mem_unit: 4096 total ram: 8497713152 $ [This laptop has 8GB ram, so a little bit seems to be missing. ;) ] Modify the total_ram() function to allow for the possibility that the memory size is not specified in bytes (ie 'mem_unit' is greater than one). Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-16 20:43:45 -07:00
Junio C Hamano	01a6e244f9	Merge branch 'ps/maintenance-reflog-expire' "git maintenance" learns a new task to expire reflog entries. * ps/maintenance-reflog-expire: builtin/maintenance: introduce "reflog-expire" task builtin/gc: split out function to expire reflog entries builtin/reflog: make functions regarding `reflog_expire_options` public builtin/reflog: stop storing per-reflog expiry dates globally builtin/reflog: stop storing default reflog expiry dates globally reflog: rename `cmd_reflog_expire_cb` to `reflog_expire_options`	2025-04-16 13:54:19 -07:00
Patrick Steinhardt	68cd492a3e	object-store: merge "object-store-ll.h" and "object-store.h" The "object-store-ll.h" header has been introduced to keep transitive header dependencies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-15 08:24:37 -07:00
Patrick Steinhardt	1a99fe8010	object-file: move `safe_create_leading_directories()` into "path.c" The `safe_create_leading_directories()` function and its relatives are located in "object-file.c", which is not a good fit as they provide generic functionality not related to objects at all. Move them into "path.c", which already hosts `safe_create_dir()` and its relative `safe_create_dir_in_gitdir()`. "path.c" is free of `the_repository`, but the moved functions depend on `the_repository` to read the "core.sharedRepository" config. Adapt the function signature to accept a repository as argument to fix the issue and adjust callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-15 08:24:35 -07:00
Patrick Steinhardt	8e0a1ec076	builtin/maintenance: introduce "reflog-expire" task By default, git-maintenance(1) uses the "gc" task to ensure that the repository is well-maintained. This can be changed, for example by either explicitly configuring which tasks should be enabled or by using the "incremental" maintenance strategy. If so, git-maintenance(1) does not know to expire reflog entries, which is a subtask that git-gc(1) knows to perform for the user. Consequently, the reflog will grow indefinitely unless the user manually trims it. Introduce a new "reflog-expire" task that plugs this gap: - When running the task directly, we simply execute `git reflog expire --all`, which is the same as git-gc(1). - When running git-maintenance(1) with the `--auto` flag, we only run the task in case the "HEAD" reflog has at least N reflog entries that would be discarded. By default, N is set to 100, but this can be configured via "maintenance.reflog-expire.auto". When a negative integer has been provided we always expire entries, zero causes us to never expire entries, and a positive value specifies how many entries need to exist before we consider pruning the entries. Note that the condition for the `--auto` flag is merely a heuristic and optimized for being fast. This is because `git maintenance run --auto` will be executed quite regularly, so scanning through all reflogs would likely be too expensive in many repositories. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-08 07:53:27 -07:00
Patrick Steinhardt	3fef24ac3f	builtin/gc: split out function to expire reflog entries We're about to introduce a new task for git-maintenance(1) that knows to expire reflog entries. The logic will be shared with git-gc(1), which already knows how to do this. Pull out the common logic into a separate function so that we can share the implementation between both builtins. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-08 07:53:27 -07:00
Derrick Stolee	6540560fd6	maintenance: add loose-objects.batchSize config The 'loose-objects' task of 'git maintenance run' first deletes loose objects that exist within packfiles and then collects loose objects into a packfile. This second step uses an implicit limit of fifty thousand that cannot be modified by users. Add a new config option that allows this limit to be adjusted or ignored entirely. While creating tests for this option, I noticed that actually there was an off-by-one error due to the strict comparison in the limit check. I considered making the limit check turn true on equality, but instead I thought to use INT_MAX as a "no limit" barrier which should mean it's never possible to hit the limit. Thus, a new decrement to the limit is provided if the value is positive. (The restriction to positive values is to avoid underflow if INT_MIN is configured.) Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-03-23 23:06:01 -07:00
Derrick Stolee	286183da99	maintenance: force progress/no-quiet to children The --no-quiet option for 'git maintenance run' is supposed to indicate that progress should happen even while ignoring the value of isatty(2). However, Git implicitly asks child processes to check isatty(2) since these arguments are not passed through. The pass through of --no-quiet will be useful in a test in the next change. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-03-23 23:06:01 -07:00
Junio C Hamano	feffb34257	Merge branch 'ps/path-sans-the-repository' The path.[ch] API takes an explicit repository parameter passed throughout the callchain, instead of relying on the_repository singleton instance. * ps/path-sans-the-repository: path: adjust last remaining users of `the_repository` environment: move access to "core.sharedRepository" into repo settings environment: move access to "core.hooksPath" into repo settings repo-settings: introduce function to clear struct path: drop `git_path()` in favor of `repo_git_path()` rerere: let `rerere_path()` write paths into a caller-provided buffer path: drop `git_common_path()` in favor of `repo_common_path()` worktree: return allocated string from `get_worktree_git_dir()` path: drop `git_path_buf()` in favor of `repo_git_path_replace()` path: drop `git_pathdup()` in favor of `repo_git_path()` path: drop unused `strbuf_git_path()` function path: refactor `repo_submodule_path()` family of functions submodule: refactor `submodule_to_gitdir()` to accept a repo path: refactor `repo_worktree_path()` family of functions path: refactor `repo_git_path()` family of functions path: refactor `repo_common_path()` family of functions	2025-03-05 10:37:43 -08:00
Patrick Steinhardt	88dd321cfe	path: drop `git_path()` in favor of `repo_git_path()` Remove `git_path()` in favor of the `repo_git_path()` family of functions, which makes the implicit dependency on `the_repository` go away. Note that `git_path()` returned a string allocated via `get_pathname()`, which uses a rotating set of statically allocated buffers. Consequently, callers didn't have to free the returned string. The same isn't true for `repo_common_path()`, so we also have to add logic to free the returned strings. This refactoring also allows us to remove `repo_common_pathv()` as well as `get_pathname()` from the public interface. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-28 13:54:11 -08:00
Junio C Hamano	5b9d01bc4d	Merge branch 'zh/gc-expire-to' "git gc" learned the "--expire-to" option and passes it down to underlying "git repack". * zh/gc-expire-to: gc: add `--expire-to` option	2025-02-12 10:08:53 -08:00
Patrick Steinhardt	bba59f58a4	path: drop `git_pathdup()` in favor of `repo_git_path()` Remove `git_pathdup()` in favor of `repo_git_path()`. The latter does essentially the same, with the only exception that it does not rely on `the_repository` but takes the repo as separate parameter. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-07 09:59:22 -08:00
Junio C Hamano	f0a371a39d	Merge branch 'jc/show-usage-help' The help text from "git $cmd -h" appears on the standard output for some $cmd and the standard error for others. The built-in commands have been fixed to show them on the standard output consistently. * jc/show-usage-help: builtin: send usage() help text to standard output oddballs: send usage() help text to standard output builtins: send usage_with_options() help text to standard output usage: add show_usage_if_asked() parse-options: add show_usage_with_options_if_asked() t0012: optionally check that "-h" output goes to stdout	2025-01-28 13:02:22 -08:00
ZheNing Hu	08032fa30f	gc: add `--expire-to` option This commit extends the functionality of `git gc` by adding a new option, `--expire-to=<dir>`. Previously, this feature was implemented in `91badeba32` (builtin/repack.c: implement `--expire-to` for storing pruned objects, 2022-10-24), which allows users to specify a directory where unreachable and expired cruft packs are stored during garbage collection. However, users had to run `git repack --cruft --expire-to=<dir>` followed by `git prune` to achieve similar results within `git gc`. By introducing `--expire-to=<dir>` directly into `git gc`, we simplify the process for users who wish to manage their repository's cleanup more efficiently. This change involves passing the `--expire-to=<dir>` parameter through to `git repack`, making it easier for users to set up a backup location for cruft packs that will be pruned. The original `git gc --prune=now` deletes all unreachable objects by passing the `-a` parameter to git repack. With the addition of the `--cruft` and `--expire-to` options, it is necessary to modify this default behavior: instead of deleting these unreachable objects, they should be merged into a cruft pack and collected in a specified directory. Therefore, we do not pass `-a` to the repack command but instead pass `--cruft`, `--expire-to`, and `--cruft-expiration=now` to repack. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-24 14:32:28 -08:00
Junio C Hamano	b821c999ca	builtins: send usage_with_options() help text to standard output Using the show_usage_with_options_if_asked() helper we introduced earlier, fix callers of usage_with_options() that want to show the help text when explicitly asked by the end-user. The help text now goes to the standard output stream for them. The test in t7600 for "git merge -h" may want to be retired, as the same is covered by t0012 already, but it is specifically testing that the "-h" option gets a response even with a corrupt index file, so for now let's leave it there. Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-17 13:30:03 -08:00
Patrick Steinhardt	1568d1562e	wrapper: allow generating insecure random bytes The `csprng_bytes()` function generates randomness and writes it into a caller-provided buffer. It abstracts over a couple of implementations, where the exact one that is used depends on the platform. These implementations have different guarantees: while some guarantee to never fail (arc4random(3)), others may fail. There are two significant failures to distinguish from one another: - Systemic failure, where e.g. opening "/dev/urandom" fails or when OpenSSL doesn't have a provider configured. - Entropy failure, where the entropy pool is exhausted, and thus the function cannot guarantee strong cryptographic randomness. While we cannot do anything about the former, the latter failure can be acceptable in some situations where we don't care whether or not the randomness can be predicted. Introduce a new `CSPRNG_BYTES_INSECURE` flag that allows callers to opt into weak cryptographic randomness. The exact behaviour of the flag depends on the underlying implementation: - `arc4random_buf()` never returns an error, so it doesn't change. - `getrandom()` pulls from "/dev/urandom" by default, which never blocks on modern systems even when the entropy pool is empty. - `getentropy()` seems to block when there is not enough randomness available, and there is no way of changing that behaviour. - `RtlGenRandom()` doesn't mention anything about its specific failure mode. - The fallback reads from "/dev/urandom", which also returns bytes in case the entropy pool is drained in modern Linux systems. That only leaves OpenSSL with `RAND_bytes()`, which returns an error in case the returned data wouldn't be cryptographically safe. This function is replaced with a call to `RAND_pseudo_bytes()`, which can indicate whether or not the returned data is cryptographically secure via its return value. If it is insecure, and if the `CSPRNG_BYTES_INSECURE` flag is set, then we ignore the insecurity and return the data regardless. It is somewhat questionable whether we really need the flag in the first place, or whether we wouldn't just ignore the potentially-insecure data. But the risk of doing that is that we might have or grow callsites that aren't aware of the potential insecureness of the data in places where it really matters. So using a flag to opt-in to that behaviour feels like the more secure choice. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-07 09:04:18 -08:00
Junio C Hamano	4156b6a741	Merge branch 'ps/build-sign-compare' Start working to make the codebase buildable with -Wsign-compare. * ps/build-sign-compare: t/helper: don't depend on implicit wraparound scalar: address -Wsign-compare warnings builtin/patch-id: fix type of `get_one_patchid()` builtin/blame: fix type of `length` variable when emitting object ID gpg-interface: address -Wsign-comparison warnings daemon: fix type of `max_connections` daemon: fix loops that have mismatching integer types global: trivial conversions to fix `-Wsign-compare` warnings pkt-line: fix -Wsign-compare warning on 32 bit platform csum-file: fix -Wsign-compare warning on 32-bit platform diff.h: fix index used to loop through unsigned integer config.mak.dev: drop `-Wno-sign-compare` global: mark code units that generate warnings with `-Wsign-compare` compat/win32: fix -Wsign-compare warning in "wWinMain()" compat/regex: explicitly ignore "-Wsign-compare" warnings git-compat-util: introduce macros to disable "-Wsign-compare" warnings	2024-12-23 09:32:11 -08:00
Junio C Hamano	ca43bd2562	Merge branch 'kn/midx-wo-the-repository' Yet another "pass the repository through the callchain" topic. * kn/midx-wo-the-repository: midx: inline the `MIDX_MIN_SIZE` definition midx: pass down `hash_algo` to functions using global variables midx: pass `repository` to `load_multi_pack_index` midx: cleanup internal usage of `the_repository` and `the_hash_algo` midx-write: pass down repository to `write_midx_file[_only]` write-midx: add repository field to `write_midx_context` midx-write: use `revs->repo` inside `read_refs_snapshot` midx-write: pass down repository to static functions packfile.c: remove unnecessary prepare_packed_git() call midx: add repository to `multi_pack_index` struct config: make `packed_git_(limit\|window_size)` non-global variables config: make `delta_base_cache_limit` a non-global variable packfile: pass down repository to `for_each_packed_object` packfile: pass down repository to `has_object[_kept]_pack` packfile: pass down repository to `odb_pack_name` packfile: pass `repository` to static function in the file packfile: use `repository` from `packed_git` directly packfile: add repository to struct `packed_git`	2024-12-13 07:33:44 -08:00
Patrick Steinhardt	41f43b8243	global: mark code units that generate warnings with `-Wsign-compare` Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-06 20:20:02 +09:00
Junio C Hamano	aaafb67ba9	Merge branch 'kn/pass-repo-to-builtin-sub-sub-commands' into kn/midx-wo-the-repository * kn/pass-repo-to-builtin-sub-sub-commands: builtin: pass repository to sub commands Git 2.47.1 Makefile(s): avoid recipe prefix in conditional statements doc: switch links to https doc: update links to current pages The eleventh batch pack-objects: only perform verbatim reuse on the preferred pack t5332-multi-pack-reuse.sh: demonstrate duplicate packing failure test-lib: move malloc-debug setup after $PATH setup builtin/difftool: intialize some hashmap variables refspec: store raw refspecs inside refspec_item refspec: drop separate raw_nr count fetch: adjust refspec->raw_nr when filtering prefetch refspecs test-lib: check malloc debug LD_PRELOAD before using	2024-12-04 10:32:02 +09:00
Junio C Hamano	1e18cf4310	Merge branch 'kn/pass-repo-to-builtin-sub-sub-commands' Built-in Git subcommands are supplied the repository object to work with; they learned to do the same when they invoke sub-subcommands. * kn/pass-repo-to-builtin-sub-sub-commands: builtin: pass repository to sub commands	2024-12-04 10:14:47 +09:00
Junio C Hamano	2f605347da	Merge branch 'ps/gc-stale-lock-warning' Give a bit of advice/hint message when "git maintenance" stops upon finding a lock file left by another instance that still is potentially running. * ps/gc-stale-lock-warning: t7900: fix host-dependent behaviour when testing git-maintenance(1) builtin/gc: provide hint when maintenance hits a stale schedule lock	2024-12-04 10:14:37 +09:00
Karthik Nayak	d6b2d21fbf	config: make `delta_base_cache_limit` a non-global variable The `delta_base_cache_limit` variable is a global config variable used by multiple subsystems. Let's make this non-global, by adding this variable independently to the subsystems where it is used. First, add the setting to the `repo_settings` struct, this provides access to the config in places where the repository is available. Use this in `packfile.c`. In `index-pack.c` we add it to the `pack_idx_option` struct and its constructor. While the repository struct is available here, it may not be set because `git index-pack` can be used without a repository. In `gc.c` add it to the `gc_config` struct and also the constructor function. The gc functions currently do not have direct access to a repository struct. These changes are made to remove the usage of `delta_base_cache_limit` as a global variable in `packfile.c`. This brings us one step closer to removing the `USE_THE_REPOSITORY_VARIABLE` definition in `packfile.c` which we complete in the next patch. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-04 08:21:55 +09:00
Karthik Nayak	6f33d8e255	builtin: pass repository to sub commands In `9b1cb5070f` (builtin: add a repository parameter for builtin functions, 2024-09-13) the repository was passed down to all builtin commands. This allowed the repository to be passed down to lower layers without depending on the global `the_repository` variable. Continue this work by also passing down the repository parameter from the command to sub-commands. This will help pass down the repository to other subsystems and cleanup usage of global variables like 'the_repository' and 'the_hash_algo'. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-11-26 10:36:08 +09:00
Junio C Hamano	76c1953395	Merge branch 'ps/maintenance-start-crash-fix' into maint-2.47 "git maintenance start" crashed due to an uninitialized variable reference, which has been corrected. * ps/maintenance-start-crash-fix: builtin/gc: fix crash when running `git maintenance start`	2024-11-20 14:42:58 +09:00
Patrick Steinhardt	656ca9204a	builtin/gc: provide hint when maintenance hits a stale schedule lock When running scheduled maintenance via `git maintenance start`, we acquire a lockfile to ensure that no other scheduled maintenance task is running in the repository concurrently. If so, we do provide an error to the user hinting that another process seems to be running in this repo. There are two important cases why such a lockfile may exist: - An actual git-maintenance(1) process is still running in this repository. - An earlier process may have crashed or was interrupted part way through and has left a stale lockfile behind. In `c95547a394` (builtin/gc: fix crash when running `git maintenance start`, 2024-10-10), we have fixed an issue where git-maintenance(1) would crash with the "start" subcommand, and the underlying bug causes the second scenario to trigger quite often now. Most users don't know how to get out of that situation again though. Ideally, we'd be removing the stale lock for our users automatically. But in the context of repository maintenance this is rather risky, as it can easily run for hours or even days. So finding a clear point where we know that the old process has exited is basically impossible. We have the same issue in other subsystems, e.g. when locking refs. Our lockfile interfaces thus provide the `unable_to_lock_message()` function for exactly this purpose: it provides a nice hint to the user that explains what is going on and how to get out of that situation again by manually removing the file. Adapt git-maintenance(1) to print a similar hint. While we could use the above function, we can provide a bit more context as we know exactly what kind of process would create the lockfile. Reported-by: Miguel Rincon Barahona <mrincon@gitlab.com> Reported-by: Kev Kloss <kkloss@gitlab.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-11-20 10:26:12 +09:00
Taylor Blau	c1662a00b6	Merge branch 'ps/maintenance-start-crash-fix' "git maintenance start" crashed due to an uninitialized variable reference, which has been corrected. * ps/maintenance-start-crash-fix: builtin/gc: fix crash when running `git maintenance start`	2024-10-18 13:56:26 -04:00
Patrick Steinhardt	c95547a394	builtin/gc: fix crash when running `git maintenance start` It was reported on the mailing list that running `git maintenance start` immediately segfaults starting with `b6c3f8e12c` (builtin/maintenance: fix leak in `get_schedule_cmd()`, 2024-09-26). And indeed, this segfault is trivial to reproduce up to a point where one is scratching their head why we didn't catch this regression in our test suite. The root cause of this error is `get_schedule_cmd()`, which does not populate the `out` parameter in all cases anymore starting with the mentioned commit. Callers do assume it to always be populated though and will e.g. call `strvec_split()` on the returned value, which will of course segfault when the variable is uninitialized. So why didn't we catch this trivial regression? The reason is that our tests always set up the "GIT_TEST_MAINT_SCHEDULER" environment variable via "t/test-lib.sh", which allows us to override the scheduler command with a custom one so that we don't accidentally modify the developer's system. But the faulty code where we don't set the `out` parameter will only get hit in case that environment variable is _not_ set, which is never the case when executing our tests. Fix the regression by again unconditionally allocating the value in the `out` parameter, if provided. Add a test that unsets the environment variable to catch future regressions in this area. Reported-by: Shubham Kanodia <shubham.kanodia10@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-10-10 10:04:43 -07:00
Junio C Hamano	365529e1ea	Merge branch 'ps/leakfixes-part-7' More leak-fixes. * ps/leakfixes-part-7: (23 commits) diffcore-break: fix leaking filespecs when merging broken pairs revision: fix leaking parents when simplifying commits builtin/maintenance: fix leak in `get_schedule_cmd()` builtin/maintenance: fix leaking config string promisor-remote: fix leaking partial clone filter grep: fix leaking grep pattern submodule: fix leaking submodule ODB paths trace2: destroy context stored in thread-local storage builtin/difftool: plug several trivial memory leaks builtin/repack: fix leaking configuration diffcore-order: fix leaking buffer when parsing orderfiles parse-options: free previous value of `OPTION_FILENAME` diff: fix leaking orderfile option builtin/pull: fix leaking "ff" option dir: fix off by one errors for ignored and untracked entries builtin/submodule--helper: fix leaking remote ref on errors t/helper: fix leaking subrepo in nested submodule config helper builtin/submodule--helper: fix leaking error buffer builtin/submodule--helper: clear child process when not running it submodule: fix leaking update strategy ...	2024-10-02 07:46:26 -07:00
Junio C Hamano	4251403327	Merge branch 'ds/background-maintenance-with-credential' Background tasks "git maintenance" runs may need to use credential information when going over the network, but a credential helper may work only in an interactive environment, and end up blocking a scheduled task waiting for UI. Credential helpers can now behave differently when they are not running interactively. * ds/background-maintenance-with-credential: scalar: configure maintenance during 'reconfigure' maintenance: add custom config to background jobs credential: add new interactive config option	2024-09-30 16:16:16 -07:00
Patrick Steinhardt	b6c3f8e12c	builtin/maintenance: fix leak in `get_schedule_cmd()` The `get_schedule_cmd()` function allows us to override the schedule command with a specific test command such that we can verify the underlying logic in a platform-independent way. Its memory management is somewhat wild though, because it basically gives up and assigns an allocated string to the string constant output pointer. While this part is marked with `UNLEAK()` to mask this, we also leak the local string lists. Rework the function such that it has a separate out parameter. If set, we will assign it the final allocated command. Plug the other memory leaks and create a common exit path. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-09-27 08:25:37 -07:00
Patrick Steinhardt	84e9fc361d	builtin/maintenance: fix leaking config string When parsing the maintenance strategy from config we allocate a config string, but do not free it after parsing it. Plug this leak by instead using `git_config_get_string_tmp()`, which does not allocate any memory. This leak is exposed by t7900, but plugging it alone does not make the test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-09-27 08:25:37 -07:00
Junio C Hamano	b8e318ea58	Merge branch 'jc/pass-repo-to-builtins' The convention to calling into built-in command implementation has been updated to pass the repository, if known, together with the prefix value. * jc/pass-repo-to-builtins: add: pass in repo variable instead of global the_repository builtin: remove USE_THE_REPOSITORY for those without the_repository builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.h builtin: add a repository parameter for builtin functions	2024-09-23 10:35:09 -07:00
Junio C Hamano	3eb6679959	Merge branch 'ps/environ-wo-the-repository' Code clean-up. * ps/environ-wo-the-repository: (21 commits) environment: stop storing "core.notesRef" globally environment: stop storing "core.warnAmbiguousRefs" globally environment: stop storing "core.preferSymlinkRefs" globally environment: stop storing "core.logAllRefUpdates" globally refs: stop modifying global `log_all_ref_updates` variable branch: stop modifying `log_all_ref_updates` variable repo-settings: track defaults close to `struct repo_settings` repo-settings: split out declarations into a standalone header environment: guard state depending on a repository environment: reorder header to split out `the_repository`-free section environment: move `set_git_dir()` and related into setup layer environment: make `get_git_namespace()` self-contained environment: move object database functions into object layer config: make dependency on repo in `read_early_config()` explicit config: document `read_early_config()` and `read_very_early_config()` environment: make `get_git_work_tree()` accept a repository environment: make `get_graft_file()` accept a repository environment: make `get_index_file()` accept a repository environment: make `get_object_directory()` accept a repository environment: make `get_git_common_dir()` accept a repository ...	2024-09-23 10:35:05 -07:00
Derrick Stolee	4f5551957d	maintenance: add custom config to background jobs At the moment, some background jobs are getting blocked on credentials during the 'prefetch' task. This leads to other tasks, such as incremental repacks, getting blocked. Further, if a user manages to fix their credentials, then they still need to cancel the background process before their background maintenance can continue working. Update the background schedules for our four scheduler integrations to include these config options via '-c' options: * 'credential.interactive=false' will stop Git and some credential helpers from prompting in the UI (assuming the '-c' parameters are carried through and respected by GCM). * 'core.askPass=true' will replace the text fallback for a username and password into the 'true' command, which will return a success in its exit code, but Git will treat the empty string returned as an invalid password and move on. We can do some testing that the credentials are passed, at least in the systemd case due to writing the service files. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-09-20 14:44:31 -07:00
John Cai	03eae9afb4	builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.h Instead of including USE_THE_REPOSITORY_VARIABLE by default on every builtin, remove it from builtin.h and add it to all the builtins that include builtin.h (by definition, that means all builtins/*.c). Also, remove the include statement for repository.h since it gets brought in through builtin.h. The next step will be to migrate each builtin from having to use the_repository. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-09-13 14:32:24 -07:00
John Cai	9b1cb5070f	builtin: add a repository parameter for builtin functions In order to reduce the usage of the global the_repository, add a parameter to builtin functions that will get passed a repository variable. This commit uses UNUSED on most of the builtin functions, as subsequent commits will modify the actual builtins to pass the repository parameter down. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-09-13 14:27:08 -07:00
Patrick Steinhardt	661624a4f6	environment: make `get_git_common_dir()` accept a repository The `get_git_common_dir()` function retrieves the path to the common directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-09-12 10:15:39 -07:00
Junio C Hamano	f1160b2700	Merge branch 'jk/maybe-unused-cleanup' Code clean-up. * jk/maybe-unused-cleanup: grep: prefer UNUSED to MAYBE_UNUSED for pcre allocators gc: drop MAYBE_UNUSED annotation from used parameter	2024-09-06 10:38:52 -07:00
Jeff King	3cdddcf6b2	gc: drop MAYBE_UNUSED annotation from used parameter The "opts" parameter is always used, so marking it with MAYBE_UNUSED is just confusing. This annotation goes back to `41abfe15d9` (maintenance: add pack-refs task, 2021-02-09), when it really was unused. Back then we did not have the UNUSED macro that would complain if the code changed to use the parameter. So when we started using it in `bfc2f9eb8e` (builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs, 2024-03-25), nobody noticed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-29 13:56:46 -07:00
Jeff King	551e4de8e1	gc: mark unused config parameter in virtual functions Commit `d1ae15d68b` (builtin/gc: refactor to read config into structure, 2024-08-16) added a new parameter to the maintenance_task virtual functions, but most of them don't need to look at it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-28 09:51:17 -07:00
Junio C Hamano	6e6f68b59b	Merge branch 'ps/maintenance-detach-fix-more' Tests for "git maintenance" that were broken on Windows have been corrected. * ps/maintenance-detach-fix-more: builtin/maintenance: fix loose objects task emitting pack hash t7900: exercise detaching via trace2 regions t7900: fix flaky test due to leaking background job	2024-08-26 11:32:20 -07:00
Junio C Hamano	1e8962ee08	Merge branch 'ps/maintenance-detach-fix' Maintenance tasks other than "gc" now properly go background when "git maintenance" runs them. * ps/maintenance-detach-fix: run-command: fix detaching when running auto maintenance builtin/maintenance: add a `--detach` flag builtin/gc: add a `--detach` flag builtin/gc: stop processing log file on signal builtin/gc: fix leaking config values builtin/gc: refactor to read config into structure config: fix constness of out parameter for `git_config_get_expiry()`	2024-08-26 11:32:20 -07:00
Junio C Hamano	5e56a39e6a	Merge branch 'ps/config-wo-the-repository' Use of API functions that implicitly depend on the_repository object in the config subsystem has been rewritten to pass a repository object through the callchain. * ps/config-wo-the-repository: config: hide functions using `the_repository` by default global: prepare for hiding away repo-less config functions config: don't depend on `the_repository` with branch conditions config: don't have setters depend on `the_repository` config: pass repo to functions that rename or copy sections config: pass repo to `git_die_config()` config: pass repo to `git_config_get_expiry_in_days()` config: pass repo to `git_config_get_expiry()` config: pass repo to `git_config_get_max_percent_split_change()` config: pass repo to `git_config_get_split_index()` config: pass repo to `git_config_get_index_threads()` config: expose `repo_config_clear()` config: introduce missing setters that take repo as parameter path: hide functions using `the_repository` by default path: stop relying on `the_repository` in `worktree_git_path()` path: stop relying on `the_repository` when reporting garbage hooks: remove implicit dependency on `the_repository` editor: do not rely on `the_repository` for interactive edits path: expose `do_git_common_path()` as `repo_common_pathv()` path: expose `do_git_path()` as `repo_git_pathv()`	2024-08-23 09:02:34 -07:00
Patrick Steinhardt	8311e3b551	builtin/maintenance: fix loose objects task emitting pack hash The "loose-objects" maintenance task executes git-pack-objects(1) to pack all loose objects into a new packfile. This command ends up printing the hash of the packfile to stdout though, which clutters the output of `git maintenance run`. Fix this issue by disabling stdout of the child process. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-21 11:33:22 -07:00
Patrick Steinhardt	51a0b8a2a7	t7900: exercise detaching via trace2 regions In t7900, we exercise the `--detach` logic by checking whether the command ended up writing anything to its output or not. This supposedly works because we close stdin, stdout and stderr when daemonizing. But one, it breaks on platforms where daemonize is a no-op, like Windows. And second, that git-maintenance(1) outputs anything at all in these tests is a bug in the first place that we'll fix in a subsequent commit. Introduce a new trace2 region around the detach which allows us to more explicitly check whether the detaching logic was executed. This is a much more direct way to exercise the logic, provides a potentially useful signal to tracing logs and also works alright on platforms which do not have the ability to daemonize. Signed-off-by: Patrick Steinhardt <ps@pks.im> [jc: dropped a stale in-code comment from a test] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-21 11:33:02 -07:00
Patrick Steinhardt	98077d06b2	run-command: fix detaching when running auto maintenance In the past, we used to execute `git gc --auto` as part of our automatic housekeeping routines. As git-gc(1) may require quite some time to perform the housekeeping, it knows to detach itself and run in the background so that the user can continue their work. Eventually, we refactored our automatic housekeeping to instead use the more flexible git-maintenance(1) command. The upside of this new infra is that the user can configure which maintenance tasks are performed, at least to a certain degree. So while it continues to run git-gc(1) by default, it can also be adapted to e.g. use git-multi-pack-index(1) for maintenance of the object database. The auto-detach of the new infra is somewhat broken though once the user configures non-standard tasks. The problem is essentially that we detach at the wrong level in the process hierarchy: git-maintenance(1) never detaches itself, but instead it continues to be git-gc(1) which does. When configured to only run the git-gc(1) maintenance task, then the result is basically the same as before. But when configured to run other tasks, then git-maintenance(1) will wait for these to run to completion. Even worse, it may be that git-gc(1) runs concurrently with other housekeeping tasks, stomping on each other's feet. Fix this bug by asking git-gc(1) to not detach when it is being invoked via git-maintenance(1). Instead, git-maintenance(1) now respects a new config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and detaches itself into the background when running as part of our auto maintenance. This should continue to behave the same for all users which use the git-gc(1) task, only. For others though, it means that we now properly perform all tasks in the background. The default behaviour of git-maintenance(1) when executed by the user does not change, it will remain in the foreground unless they pass the `--detach` option.
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-16 09:46:26 -07:00
Patrick Steinhardt	a6affd3343	builtin/maintenance: add a `--detach` flag Same as the preceding commit, add a `--[no-]detach` flag to the git-maintenance(1) command. This will be used in a subsequent commit to fix backgrounding of that command when configured with a non-standard set of tasks. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-16 09:46:26 -07:00
Patrick Steinhardt	c7185df01b	builtin/gc: add a `--detach` flag When running `git gc --auto`, the command will by default detach and continue running in the background. This behaviour can be tweaked via the `gc.autoDetach` config, but not via a command line switch. We need that in a subsequent commit though, where git-maintenance(1) will want to ask its git-gc(1) child process to not detach anymore. Add a `--[no-]detach` flag that does this for us. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-16 09:46:25 -07:00
Patrick Steinhardt	9b6b994f90	builtin/gc: stop processing log file on signal When detaching, git-gc(1) will redirect its stderr to a "gc.log" log file, which is then used to surface errors of a backgrounded process to the user. To ensure that the file is properly managed on abnormal exit paths, we install both signal and exit handlers that try to either commit the underlying lock file or roll it back in case there wasn't any error. This logic is severely broken when handling signals though, as we end up calling all kinds of functions that are not signal safe. This includes malloc(3P) via `git_path()`, fprintf(3P), fflush(3P) and many more functions. The consequence can be anything, from deadlocks to crashes. Unfortunately, we cannot really do much about this without a larger refactoring. The least-worst thing we can do is to not set up the signal handler in the first place. This will still cause us to remove the lockfile, as the underlying tempfile subsystem already knows to unlink locks when receiving a signal. But it may cause us to remove the lock even in the case where it would have contained actual errors, which is a change in behaviour. The consequence is that "gc.log" will not be committed, and thus subsequent calls to `git gc --auto` won't bail out because of this. Arguably though, it is better to retry garbage collection rather than having the process run into a potentially-corrupted state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-16 09:46:25 -07:00
Patrick Steinhardt	0ce44e2293	builtin/gc: fix leaking config values We're leaking config values in git-gc(1) when those values are tracked as strings. Introduce a new `gc_config_release()` function that releases this memory to plug those leaks and release old values before populating the config fields via `git_config_string()` et al. Note that there is one small gotcha here with the "--prune" option. Next to passing a string, this option also accepts the "--no-prune" option that overrides the default or configured value. We thus need to discern between the option not having been passed by the user and the negative variant of it. This is done by using a simple sentinel value that lets us discern these cases. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-16 09:46:25 -07:00
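The sentinel idea described in the commit above can be sketched in C as follows. This is only an illustration under stated assumptions, not the code from builtin/gc.c: the names `prune_not_given` and `effective_prune_expire()` are hypothetical, and the sketch assumes the option parser leaves the sentinel in place when the option is absent, stores NULL for `--no-prune`, and stores the argument for `--prune=<date>`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sentinel: only its address matters. It marks the state
 * "the user did not pass --prune or --no-prune at all".
 */
static const char prune_not_given[] = "";

/*
 * Decide which expiry value to use:
 *   - sentinel -> option absent, fall back to the configured value
 *   - NULL     -> --no-prune was given, pruning is disabled
 *   - other    -> explicit --prune=<date> overrides the config
 */
static const char *effective_prune_expire(const char *cli_value,
					  const char *configured)
{
	/* The configured value must never alias the sentinel itself. */
	assert(configured != prune_not_given);

	if (cli_value == prune_not_given)
		return configured;
	return cli_value;
}
```

Comparing pointers rather than string contents is what lets the three states stay distinct, since an empty string and NULL could otherwise both be read as "no value".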
Patrick Steinhardt	d1ae15d68b	builtin/gc: refactor to read config into structure The git-gc(1) command knows to read a bunch of config keys to tweak its own behaviour. The values are parsed into global variables, which makes it hard to correctly manage the lifecycle of values that may require a memory allocation. Refactor the code to use a `struct gc_config` that gets populated and passed around. For one, this makes previously-implicit dependencies on these config values clear. Second, it will allow us to properly manage the lifecycle in the next commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-16 09:46:25 -07:00
Patrick Steinhardt	a70a9bf6ee	config: fix constness of out parameter for `git_config_get_expiry()` The type of the out parameter of `git_config_get_expiry()` is a pointer to a constant string, which creates the impression that ownership of the returned data wasn't transferred to the caller. This isn't true though and thus quite misleading. Adapt the parameter to be of type `char **` and adjust callers accordingly. While at it, refactor `get_shared_index_expire_date()` to drop the static `shared_index_expire` variable. It is only used in that function, and furthermore we would only hit the code where we parse the expiry date a single time because we already use a static `prepared` variable to track whether we did parse it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-16 09:46:24 -07:00
Patrick Steinhardt	87aace129e	config: pass repo to `git_config_get_expiry()` Refactor `git_config_get_expiry()` to accept a `struct repository` such that we can get rid of the implicit dependency on `the_repository`. Rename the function accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-13 10:01:03 -07:00
Patrick Steinhardt	169c979771	hooks: remove implicit dependency on `the_repository` We implicitly depend on `the_repository` in our hook subsystem because we use `strbuf_git_path()` to compute hook paths. Remove this dependency by accepting a `struct repository` as parameter instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-13 10:01:01 -07:00
John Cai	e8207717f1	refs: add referent to each_ref_fn Add a parameter to each_ref_fn so that callers to the ref APIs that use this function as a callback can have access to the unresolved value of a symbolic ref. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-09 08:47:34 -07:00
Patrick Steinhardt	30aaff437f	refs: pass repo when peeling objects Both `peel_object()` and `peel_iterated_oid()` implicitly rely on `the_repository` to look up objects. Despite the fact that we want to get rid of `the_repository`, it also leads to some restrictions in our ref iterators when trying to retrieve the peeled value for a repository other than `the_repository`. Refactor these functions such that both take a repository as argument and remove the now-unnecessary restrictions. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-17 10:33:39 -07:00
Patrick Steinhardt	2e5c4758b7	cocci: apply rules to rewrite callers of "refs" interfaces Apply the rules that rewrite callers of "refs" interfaces to explicitly pass `struct ref_store`. The resulting patch has been applied with the `--whitespace=fix` option. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-07 10:06:59 -07:00
Junio C Hamano	75b182d34e	Merge branch 'js/for-each-repo-keep-going' A scheduled "git maintenance" job is expected to work on all repositories it knows about, but it stopped at the first one that errored out. Now it keeps going. * js/for-each-repo-keep-going: maintenance: running maintenance should not stop on errors for-each-repo: optionally keep going on an error	2024-04-30 14:49:45 -07:00
Johannes Schindelin	c75662bfc9	maintenance: running maintenance should not stop on errors In https://github.com/microsoft/git/issues/623, it was reported that maintenance stops on a missing repository, omitting the remaining repositories that were scheduled for maintenance. This is undesirable, as it should be a best effort type of operation. It should still fail due to the missing repository, of course, but not leave the non-missing repositories in unmaintained shapes. Let's use `for-each-repo`'s shiny new `--keep-going` option that we just introduced for that very purpose. This change will be picked up when running `git maintenance start`, which is run implicitly by `scalar reconfigure`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-24 10:46:03 -07:00
Junio C Hamano	eacfd581d2	Merge branch 'ps/pack-refs-auto' "git pack-refs" learned the "--auto" option, which is a useful addition to be triggered from "git gc --auto". Acked-by: Karthik Nayak <karthik.188@gmail.com> cf. <CAOLa=ZRAEA7rSUoYL0h-2qfEELdbPHbeGpgBJRqesyhHi9Q6WQ@mail.gmail.com> * ps/pack-refs-auto: builtin/gc: pack refs when using `git maintenance run --auto` builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs t6500: extract objects with "17" prefix builtin/gc: move `struct maintenance_run_opts` builtin/pack-refs: introduce new "--auto" flag builtin/pack-refs: release allocated memory refs/reftable: expose auto compaction via new flag refs: remove `PACK_REFS_ALL` flag refs: move `struct pack_refs_opts` to where it's used t/helper: drop pack-refs wrapper refs/reftable: print errors on compaction failure reftable/stack: gracefully handle failed auto-compaction due to locks reftable/stack: use error codes when locking fails during compaction reftable/error: discern locked/outdated errors reftable/stack: fix error handling in `reftable_stack_init_addition()`	2024-04-09 14:31:45 -07:00
Patrick Steinhardt	9f6714ab3e	builtin/gc: pack refs when using `git maintenance run --auto` When running `git maintenance run --auto`, then the various subtasks will only run as needed. Thus, we for example end up only packing loose objects if we hit a certain threshold. Interestingly enough, the "pack-refs" task is actually _never_ executed when the auto-flag is set because it does not have a condition at all. As `41abfe15d9` (maintenance: add pack-refs task, 2021-02-09) mentions: The 'auto_condition' function pointer is left NULL for now. We could extend this in the future to have a condition check if pack-refs should be run during 'git maintenance run --auto'. It is not quite clear from that quote whether it is actually intended that the task doesn't run at all in this mode. Also, no test was added to verify this behaviour. Ultimately though, it feels quite surprising that `git maintenance run --auto --task=pack-refs` would quietly never do anything at all. In any case, now that we do have the logic in place to let ref backends decide whether or not to repack refs, it does make sense to wire it up accordingly. With the "reftable" backend we will thus now perform auto-compaction, which optimizes the refdb as needed. But for the "files" backend we now unconditionally pack refs as it does not yet know to handle the "auto" flag. Arguably, this can be seen as a bug fix given that previously the task never did anything at all. Eventually though we should amend the "files" backend to use some heuristics for auto compaction, as well. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-25 09:54:07 -07:00
Patrick Steinhardt	bfc2f9eb8e	builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs Forward the `--auto` flag to git-pack-refs(1) when it has been invoked with this flag itself. This does not change anything for the "files" backend, which will continue to eagerly pack refs. But it does ensure that the "reftable" backend only compacts refs as required. This change does not impact git-maintenance(1) because this command will in fact never run the pack-refs task when run with `--auto`. This issue will be addressed in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-25 09:54:07 -07:00
Patrick Steinhardt	0e05d53992	builtin/gc: move `struct maintenance_run_opts` We're about to start using `struct maintenance_run_opts` in `maintenance_task_pack_refs()`. Move its definition up to prepare for this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-25 09:54:07 -07:00
Ralph Seichter	42d5c03394	config: add --comment option to add a comment Introduce the ability to append comments to modifications made using git-config. Example usage: git config --comment "changed via script" \ --add safe.directory /home/alice/repo.git based on the proposed patch, the output produced is: [safe] directory = /home/alice/repo.git #changed via script Users need to be able to distinguish between config entries made using automation and entries made by a human. Automation can add comments containing a URL pointing to explanations for the change made, avoiding questions from users as to why their config file was changed by a third party. The implementation ensures that a # character is unconditionally prepended to the provided comment string, and that the comment text is appended as a suffix to the changed key-value-pair in the same line of text. Multi-line comments (i.e. comments containing linefeed) are rejected as errors, causing Git to exit without making changes. Comments are aimed at humans who inspect or change their Git config using a pager or editor. Comments are not meant to be read or displayed by git-config at a later time. Signed-off-by: Ralph Seichter <github@seichter.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-15 12:25:35 -07:00
Kristoffer Haugsbakk	74e12192e6	maintenance: use XDG config if it exists `git maintenance register` registers the repository in the user's global config. `$XDG_CONFIG_HOME/git/config` is supposed to be used if `~/.gitconfig` does not exist. However, this command creates a `~/.gitconfig` file and writes to that one even though the XDG variant exists. This used to work correctly until `50a044f1e4` (gc: replace config subprocesses with API calls, 2022-09-27), when the command started calling the config API instead of git-config(1). Also change `unregister` accordingly. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-01-18 12:17:42 -08:00
Kristoffer Haugsbakk	ecffa3ed51	config: rename global config function Rename this function to a more descriptive name since we want to use the existing name for a new function. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-01-18 12:17:41 -08:00
Junio C Hamano	79861babe2	Merge branch 'tb/repack-max-cruft-size' "git repack" learned "--max-cruft-size" to prevent cruft packs from growing without bounds. * tb/repack-max-cruft-size: repack: free existing_cruft array after use builtin/repack.c: avoid making cruft packs preferred builtin/repack.c: implement support for `--max-cruft-size` builtin/repack.c: parse `--max-pack-size` with OPT_MAGNITUDE t7700: split cruft-related tests to t7704	2023-10-18 13:25:41 -07:00
Taylor Blau	37dc6d8104	builtin/repack.c: implement support for `--max-cruft-size` Cruft packs are an alternative mechanism for storing a collection of unreachable objects whose mtimes are recent enough to avoid being pruned out of the repository. When cruft packs were first introduced back in `b757353676` (builtin/pack-objects.c: --cruft without expiration, 2022-05-20) and `a7d493833f` (builtin/pack-objects.c: --cruft with expiration, 2022-05-20), the recommended workflow consisted of: - Repacking periodically, either by packing anything loose in the repository (via `git repack -d`) or producing a geometric sequence of packs (via `git repack --geometric=<d> -d`). - Every so often, splitting the repository into two packs, one cruft to store the unreachable objects, and another non-cruft pack to store the reachable objects. Repositories may (out of band with the above) choose periodically to prune out some unreachable objects which have aged out of the grace period by generating a pack with `--cruft-expiration=<approxidate>`. This allowed repositories to maintain relatively few packs on average, and quarantine unreachable objects together in a cruft pack, avoiding the pitfalls of holding unreachable objects as loose while they age out (for more, see some of the details in `3d89a8c118` (Documentation/technical: add cruft-packs.txt, 2022-05-20)). This all works, but can be costly from an I/O-perspective when frequently repacking a repository that has many unreachable objects. This problem is exacerbated when those unreachable objects are rarely (if ever) pruned. Since there is at most one cruft pack in the above scheme, each time we update the cruft pack it must be rewritten from scratch.
Because much of the pack is reused, this is a relatively inexpensive operation from a CPU-perspective, but is very costly in terms of I/O since we end up rewriting basically the same pack (plus any new unreachable objects that have entered the repository since the last time a cruft pack was generated). At the time, we decided against implementing more robust support for multiple cruft packs. This patch implements that support which we were lacking. Introduce a new option `--max-cruft-size` which allows repositories to accumulate cruft packs up to a given size, after which point a new generation of cruft packs can accumulate until it reaches the maximum size, and so on. To generate a new cruft pack, the process works like so: - Sort a list of any existing cruft packs in ascending order of pack size. - Starting from the beginning of the list, group cruft packs together while the accumulated size is smaller than the maximum specified pack size. - Combine the objects in these cruft packs together into a new cruft pack, along with any other unreachable objects which have since entered the repository. Once a cruft pack grows beyond the size specified via `--max-cruft-size` the pack is effectively frozen. This limits the I/O churn up to a quadratic function of the value specified by the `--max-cruft-size` option, instead of behaving quadratically in the number of total unreachable objects. When pruning unreachable objects, we bypass the new code paths which combine small cruft packs together, and instead start from scratch, passing in the appropriate `--max-pack-size` down to `pack-objects`, putting it in charge of keeping the resulting set of cruft packs sized correctly. This may seem like further I/O churn, but in practice it isn't so bad. We could prune old cruft packs for whom all or most objects are removed, and then generate a new cruft pack with just the remaining set of objects. 
But this additional complexity buys us relatively little, because most objects end up being pruned anyway, so the I/O churn is well contained. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-10-05 13:26:11 -07:00
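The grouping steps listed in the commit above (sort existing cruft packs by ascending size, then group from the front while the accumulated size stays below the maximum) can be sketched in C as follows. This is one reading of the description, not the code in builtin/repack.c: `struct cruft_pack` and `select_packs_to_combine()` are hypothetical names, and the choice to stop as soon as adding the next pack would reach the limit is an assumption about the boundary case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an existing cruft pack; only the size matters. */
struct cruft_pack {
	const char *name;
	size_t size;
};

/* qsort() comparator: ascending order of pack size. */
static int cmp_pack_size(const void *a, const void *b)
{
	const struct cruft_pack *pa = a, *pb = b;
	return (pa->size > pb->size) - (pa->size < pb->size);
}

/*
 * Sort `packs` by ascending size and return how many packs, counted
 * from the front of the sorted array, would be combined into the new
 * cruft pack. Packs are added while the accumulated size stays below
 * `max_size`; the remaining (larger) packs are left untouched.
 */
static size_t select_packs_to_combine(struct cruft_pack *packs, size_t nr,
				      size_t max_size)
{
	size_t i, total = 0;

	assert(packs || !nr);
	qsort(packs, nr, sizeof(*packs), cmp_pack_size);

	for (i = 0; i < nr; i++) {
		/* Written as a subtraction so the sum cannot overflow. */
		if (packs[i].size >= max_size - total)
			break;
		total += packs[i].size;
	}
	return i;
}
```

With packs of sizes 50, 10 and 20 and a limit of 40, the two smallest packs are selected and the pack of size 50 stays as it is, which mirrors the "effectively frozen" behaviour the message describes for packs that have grown past the limit.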
Christian Couder	9b96046b92	gc: add `gc.repackFilterTo` config option A previous commit implemented the `gc.repackFilter` config option to specify a filter that should be used by `git gc` when performing repacks. Another previous commit has implemented `git repack --filter-to=<dir>` to specify the location of the packfile containing filtered out objects when using a filter. Let's implement the `gc.repackFilterTo` config option to specify that location in the config when `gc.repackFilter` is used. Now when `git gc` will perform a repack with a <dir> configured through this option and not empty, the repack process will be passed a corresponding `--filter-to=<dir>` argument. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-10-02 14:54:31 -07:00
Christian Couder	1cd43a9ed9	gc: add `gc.repackFilter` config option A previous commit has implemented `git repack --filter=<filter-spec>` to allow users to filter out some objects from the main pack and move them into a new different pack. Users might want to perform such a cleanup regularly at the same time as they perform other repacks and cleanups, so as part of `git gc`. Let's allow them to configure a <filter-spec> for that purpose using a new gc.repackFilter config option. Now when `git gc` will perform a repack with a <filter-spec> configured through this option and not empty, the repack process will be passed a corresponding `--filter=<filter-spec>` argument. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-10-02 14:54:30 -07:00
Junio C Hamano	bd49a2998a	Merge branch 'js/systemd-timers-wsl-fix' Update "git maintenance" timers' implementation based on systemd timers to work with WSL. * js/systemd-timers-wsl-fix: maintenance(systemd): support the Windows Subsystem for Linux	2023-09-20 10:44:57 -07:00
Junio C Hamano	c52a02a0f0	Merge branch 'jk/unused-post-2.42-part2' Unused parameters to functions are marked as such, and/or removed, in order to bring us closer to -Wunused-parameter clean. * jk/unused-post-2.42-part2: parse-options: mark unused parameters in noop callback interpret-trailers: mark unused "unset" parameters in option callbacks parse-options: add more BUG_ON() annotations merge: do not pass unused opt->value parameter parse-options: mark unused "opt" parameter in callbacks parse-options: prefer opt->value to globals in callbacks checkout-index: delay automatic setting of to_tempfile format-patch: use OPT_STRING_LIST for to/cc options merge: simplify parsing of "-n" option merge: make xopts a strvec	2023-09-13 10:07:56 -07:00
Johannes Schindelin	5e8515e8e8	maintenance(systemd): support the Windows Subsystem for Linux When running in the Windows Subsystem for Linux (WSL), it is usually necessary to use the Git Credential Manager for authentication when performing the background fetches. This requires interoperability between the Windows Subsystem for Linux and the Windows host to work, which uses so-called vsocks, i.e. sockets intended for communications between virtual machines and the host they are running on. However, when Git is configured to run background maintenance via `systemd`, the address families available to those maintenance processes are restricted, and did not include `AF_VSOCK`. This leads to problems e.g. when a background fetch tries to access github.com: systemd[437]: Starting Optimize Git repositories data... git[747387]: WSL (747387) ERROR: UtilBindVsockAnyPort:285: socket failed 97 git[747381]: fatal: could not read Username for 'https://github.com': No such device or address git[747381]: error: failed to prefetch remotes git[747381]: error: task 'prefetch' failed systemd[437]: git-maintenance@hourly.service: Main process exited, code=exited, status=1/FAILURE systemd[437]: git-maintenance@hourly.service: Failed with result 'exit-code'. systemd[437]: Failed to start Optimize Git repositories data. Address this (pun intended) by adding the `AF_VSOCK` address family to the allow list. This fixes https://github.com/microsoft/git/issues/604. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-09-11 12:41:30 -07:00
Jeff King	34bf44f2d5	parse-options: mark unused "opt" parameter in callbacks The previous commit argued that parse-options callbacks should try to use opt->value rather than touching globals directly. In some cases, however, that's awkward to do. Some callbacks touch multiple variables, or may even just call into an abstracted function that does so. In some of these cases we _could_ convert them by stuffing the multiple variables into a single struct and passing the struct pointer through opt->value. But that may make other parts of the code less readable, as the struct relationship has to be mentioned everywhere. Let's just accept that these cases are special and leave them as-is. But we do need to mark their "opt" parameters to satisfy -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-09-05 14:48:17 -07:00
Jeff King	316b3a226a	gc: mark unused descriptors in scheduler callbacks Each of the scheduler update callbacks gets the descriptor of the lock file, but only the crontab updater needs it. We have to retain the unused descriptors because these are dispatched from a table of function pointers, but we should mark them to silence -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-29 17:56:26 -07:00
Junio C Hamano	c7b6a6c0be	Merge branch 'ds/maintenance-schedule-fuzz' Hourly and other schedule of "git maintenance" jobs are randomly distributed now. * ds/maintenance-schedule-fuzz: maintenance: update schedule before config maintenance: fix systemd schedule overlaps maintenance: use random minute in systemd scheduler maintenance: swap method locations maintenance: use random minute in cron scheduler maintenance: use random minute in Windows scheduler maintenance: use random minute in launchctl scheduler maintenance: add get_random_minute()	2023-08-24 09:32:34 -07:00
Junio C Hamano	32f4fa8d3b	Merge branch 'ds/maintenance-on-windows-fix' Windows updates. * ds/maintenance-on-windows-fix: git maintenance: avoid console window in scheduled tasks on Windows win32: add a helper to run `git.exe` without a foreground window	2023-08-15 10:19:47 -07:00
Derrick Stolee	69ecfcacfd	maintenance: update schedule before config When running 'git maintenance start', the current pattern is to configure global config settings to enable maintenance on the current repository and set 'maintenance.auto' to false and _then_ to set up the schedule with the system scheduler. This has a problematic error condition: if the scheduler fails to initialize, the repository still will not use automatic maintenance due to the 'maintenance.auto' setting. Fix this gap by swapping the order of operations. If Git fails to initialize maintenance, then the config changes should never happen. Reported-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-10 14:04:17 -07:00
Derrick Stolee	c97ec0378b	maintenance: fix systemd schedule overlaps The 'git maintenance run' command prevents concurrent runs in the same repository using a 'maintenance.lock' file. However, when using systemd the hourly maintenance runs at the same time as the daily and weekly runs. (Similarly, daily maintenance runs at the same time as weekly maintenance.) These competing commands result in some maintenance not actually being run. This overlap was something we could not fix until we made the recent change to not use the built-in 'hourly', 'daily', and 'weekly' schedules in systemd. We can adjust the schedules such that: 1. Hourly runs avoid the 0th hour. 2. Daily runs avoid Monday. This will keep maintenance runs from colliding when using systemd. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-10 14:04:17 -07:00
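The two rules in the message above imply that at most one schedule fires in any given hour. A minimal C sketch of that reasoning follows. It assumes the daily and weekly runs both happen during hour 0 and that the weekly run falls on Monday; `schedule_for()` and the enum are hypothetical names that do not appear in Git.

```c
#include <assert.h>

enum schedule {
	SCHEDULE_HOURLY,
	SCHEDULE_DAILY,
	SCHEDULE_WEEKLY
};

/*
 * Return the single schedule that fires for a given weekday
 * (0 = Sunday, 1 = Monday, ..., 6 = Saturday) and hour (0..23).
 *
 * Assumptions of this sketch: daily and weekly runs both happen in
 * hour 0, and the weekly run is on Monday.
 */
enum schedule schedule_for(int weekday, int hour)
{
	assert(weekday >= 0 && weekday <= 6);
	assert(hour >= 0 && hour <= 23);

	if (hour != 0)
		return SCHEDULE_HOURLY;	/* hourly runs avoid the 0th hour */
	if (weekday != 1)
		return SCHEDULE_DAILY;	/* daily runs avoid Monday */
	return SCHEDULE_WEEKLY;
}
```

Because the three cases are mutually exclusive, no two runs compete for the 'maintenance.lock' file in the same hour.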
Derrick Stolee	daa787010c	maintenance: use random minute in systemd scheduler The get_random_minute() method was created to allow maintenance schedules to be fixed to a random minute of the hour. This randomness is only intended to spread out the load from a number of clients, but each client should have an hour between each maintenance cycle. Add this random minute to the systemd integration. This integration is more complicated than similar changes for other schedulers because of a neat trick that systemd allows: templating. The previous implementation generated two template files with names of the form 'git-maintenance@.(timer\|service)'. The '.timer' or '.service' indicates that this is a template that is picked up when we later specify '...@<schedule>.timer' or '...@<schedule>.service'. The '<schedule>' string is then used to insert into the template both the 'OnCalendar' schedule setting and the '--schedule' parameter of the 'git maintenance run' command. In order to set these schedules to a given minute, we can no longer use the 'hourly', 'daily', or 'weekly' strings for '<schedule>' and instead need to abandon the template model for the .timer files. We can still use templates for the .service files. For this reason, we split these writes into two methods. Modify the template with a custom schedule in the 'OnCalendar' setting. This schedule has some interesting differences from cron-like patterns, but is relatively easy to figure out from context. The one that might be confusing is that '--*' is a date-based pattern, but this must be omitted when using 'Mon' to signal that we care about the day of the week. Monday is used since that matches the day used for the 'weekly' schedule used previously. Now that the timer files are not templates, we might want to abandon the '@' symbol in the file names. However, this would cause users with existing schedules to get two competing schedules due to different names. 
The work to remove the old schedule name is one thing that we can avoid by keeping the '@' symbol in our unit names. Since we are locked into this name, it makes sense that we keep the template model for the .service files. The rest of the change involves making sure we are writing these .timer and .service files before initializing the schedule with 'systemctl' and deleting the files when we are done. Some changes are also made to share the random minute along with a single computation of the execution path of the current Git executable. In addition, older Git versions may have written a 'git-maintenance@.timer' template file. Be sure to remove this when successfully enabling maintenance (or disabling maintenance). Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-10 14:04:16 -07:00
Derrick Stolee	f44d7d00e5	maintenance: swap method locations The systemd_timer_write_unit_templates() method writes a single template that is then used to start the hourly, daily, and weekly schedules with systemd. However, in order to schedule systemd maintenance on a given minute, these templates need to be replaced with specific schedules for each of these jobs. Before modifying the schedules, move the writing method above the systemd_timer_enable_unit() method, so we can write a specific schedule for each unit. The diff is computed smaller by showing systemd_timer_enable_unit() and systemd_timer_delete_units() move instead of systemd_timer_write_unit_templates() and systemd_timer_delete_unit_templates(). Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-10 14:04:16 -07:00
Derrick Stolee	9b43399057	maintenance: use random minute in cron scheduler The get_random_minute() method was created to allow maintenance schedules to be fixed to a random minute of the hour. This randomness is only intended to spread out the load from a number of clients, but each client should have an hour between each maintenance cycle. Add this random minute to the cron integration. The cron schedule specification starts with a minute indicator, which was previously inserted as the "0" string but now takes the given minute as an integer parameter. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-10 14:04:16 -07:00
Derrick Stolee	62a239987c	maintenance: use random minute in Windows scheduler The get_random_minute() method was created to allow maintenance schedules to be fixed to a random minute of the hour. This randomness is only intended to spread out the load from a number of clients, but each client should have an hour between each maintenance cycle. Add this random minute to the Windows scheduler integration. We need only to modify the minute value for the 'StartBoundary' tag across the three schedules. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-10 14:04:16 -07:00
Derrick Stolee	ec5d9d684c	maintenance: use random minute in launchctl scheduler The get_random_minute() method was created to allow maintenance schedules to be fixed to a random minute of the hour. This randomness is only intended to spread out the load from a number of clients, but each client should have an hour between each maintenance cycle. Use get_random_minute() when constructing the schedules for launchctl. The format already includes a 'Minute' key which is modified from 0 to the random minute. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-10 14:04:16 -07:00
Derrick Stolee	89024a0ab0	maintenance: add get_random_minute() When we initially created background maintenance -- with its hourly, daily, and weekly schedules -- we considered the effects of all clients launching fetches to the server every hour on the hour. The worry of DDoSing server hosts was noted, but left as something we would consider for a future update. As background maintenance has gained more adoption over the past three years, our worries about DDoSing the big Git hosts have been unfounded. Those systems, especially those serving public repositories, are already resilient to thundering herds of much smaller scale. However, sometimes organizations spin up specific custom server infrastructure either in addition to or on top of their Git host. Some of these technologies are built for a different range of scale, and can hit concurrency limits sooner. Organizations with such custom infrastructures are more likely to recommend tools like `scalar` which furthers their adoption of background maintenance. To help solve for this, create get_random_minute() as a method to help Git select a random minute when creating schedules in the future. The integrations with this method do not yet exist, but will follow in future changes. To avoid multiple sources of randomness in the Git codebase, create a new helper function, git_rand(), that returns a random uint32_t. This is similar to how rand() returns a random nonnegative value, except it is based on csprng_bytes() which is cryptographic and will return values larger than RAND_MAX. One thing that is important for testability is that we notice when we are under a test scenario and return a predictable result. The schedules themselves are not checked for this value, but at least one launchctl test checks that we do not unnecessarily reboot the schedule if it has not changed from a previous version. 
Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-10 14:04:16 -07:00
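The idea of a random minute that becomes predictable under test can be sketched as below. This is not the code from `builtin/gc.c`. The environment variable `EXAMPLE_TEST_SCHEDULER`, the fixed value 13, and the helper names are assumptions made for illustration, and `rand()` stands in for the CSPRNG-backed `git_rand()` that the message describes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Stand-in for git_rand(). The commit message says the real helper is
 * built on csprng_bytes(); rand() is NOT cryptographic and is used
 * here only to keep the sketch self-contained.
 */
static uint32_t example_rand(void)
{
	return (uint32_t)rand();
}

/*
 * Pick the minute of the hour at which scheduled maintenance runs.
 * When the (hypothetical) EXAMPLE_TEST_SCHEDULER variable is set, a
 * fixed value is returned so that tests see a stable schedule.
 */
int example_random_minute(void)
{
	int minute;

	if (getenv("EXAMPLE_TEST_SCHEDULER"))
		return 13;	/* arbitrary fixed minute for tests */

	minute = (int)(example_rand() % 60);
	assert(minute >= 0 && minute < 60);
	return minute;
}
```

Each scheduler integration would call such a helper once and reuse the result for its hourly, daily, and weekly entries, so that a client keeps a full hour between maintenance cycles.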
Johannes Schindelin	0050f8e401	git maintenance: avoid console window in scheduled tasks on Windows We just introduced a helper to avoid showing a console window when the scheduled task runs `git.exe`. Let's actually use it. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-09 13:58:15 -07:00
Calvin Wan	da9502ff4d	treewide: remove unnecessary includes for wrapper.h Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-07-05 11:41:59 -07:00
Elijah Newren	a034e9106f	object-store-ll.h: split this header out of object-store.h The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store \| sort \| uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-21 13:39:54 -07:00
Elijah Newren	c339932bd8	repository: remove unnecessary include of path.h This also made it clear that several .c files that depended upon path.h were missing a #include for it; add the missing includes while at it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-21 13:39:53 -07:00
Junio C Hamano	ccd12a3d6c	Merge branch 'en/header-split-cache-h-part-2' More header clean-up. * en/header-split-cache-h-part-2: (22 commits) reftable: ensure git-compat-util.h is the first (indirect) include diff.h: reduce unnecessary includes object-store.h: reduce unnecessary includes commit.h: reduce unnecessary includes fsmonitor: reduce includes of cache.h cache.h: remove unnecessary headers treewide: remove cache.h inclusion due to previous changes cache,tree: move basic name compare functions from read-cache to tree cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c hash-ll.h: split out of hash.h to remove dependency on repository.h tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h dir.h: move DTYPE defines from cache.h versioncmp.h: move declarations for versioncmp.c functions from cache.h ws.h: move declarations for ws.c functions from cache.h match-trees.h: move declarations for match-trees.c functions from cache.h pkt-line.h: move declarations for pkt-line.c functions from cache.h base85.h: move declarations for base85.c functions from cache.h copy.h: move declarations for copy.c functions from cache.h server-info.h: move declarations for server-info.c functions from cache.h packfile.h: move pack_window and pack_entry from cache.h ...	2023-05-09 16:45:46 -07:00
Junio C Hamano	d699e27bd4	Merge branch 'tb/ban-strtok' Mark strtok() and strtok_r() to be banned. * tb/ban-strtok: banned.h: mark `strtok()` and `strtok_r()` as banned t/helper/test-json-writer.c: avoid using `strtok()` t/helper/test-oidmap.c: avoid using `strtok()` t/helper/test-hashmap.c: avoid using `strtok()` string-list: introduce `string_list_setlen()` string-list: multi-delimiter `string_list_split_in_place()`	2023-05-02 10:13:35 -07:00
Junio C Hamano	fc23c397c7	Merge branch 'tb/enable-cruft-packs-by-default' When "gc" needs to retain unreachable objects, packing them into cruft packs (instead of exploding them into loose object files) has been offered as a more efficient option for some time. Now the use of cruft packs has been made the default and no longer considered an experimental feature. * tb/enable-cruft-packs-by-default: repository.h: drop unused `gc_cruft_packs` builtin/gc.c: make `gc.cruftPacks` enabled by default t/t9300-fast-import.sh: prepare for `gc --cruft` by default t/t6500-gc.sh: add additional test cases t/t6500-gc.sh: refactor cruft pack tests t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default t/t5304-prune.sh: prepare for `gc --cruft` by default builtin/gc.c: ignore cruft packs with `--keep-largest-pack` builtin/repack.c: fix incorrect reference to '-C' pack-write.c: plug a leak in stage_tmp_packfiles()	2023-04-28 16:03:03 -07:00
Junio C Hamano	0807e57807	Merge branch 'en/header-split-cache-h' Header clean-up. * en/header-split-cache-h: (24 commits) protocol.h: move definition of DEFAULT_GIT_PORT from cache.h mailmap, quote: move declarations of global vars to correct unit treewide: reduce includes of cache.h in other headers treewide: remove double forward declaration of read_in_full cache.h: remove unnecessary includes treewide: remove cache.h inclusion due to pager.h changes pager.h: move declarations for pager.c functions from cache.h treewide: remove cache.h inclusion due to editor.h changes editor: move editor-related functions and declarations into common file treewide: remove cache.h inclusion due to object.h changes object.h: move some inline functions and defines from cache.h treewide: remove cache.h inclusion due to object-file.h changes object-file.h: move declarations for object-file.c functions from cache.h treewide: remove cache.h inclusion due to git-zlib changes git-zlib: move declarations for git-zlib functions from cache.h treewide: remove cache.h inclusion due to object-name.h changes object-name.h: move declarations for object-name.c functions from cache.h treewide: remove unnecessary cache.h inclusion treewide: be explicit about dependence on mem-pool.h treewide: be explicit about dependence on oid-array.h ...	2023-04-25 13:56:20 -07:00
Taylor Blau	52acddf36c	string-list: multi-delimiter `string_list_split_in_place()` Enhance `string_list_split_in_place()` to accept multiple characters as delimiters instead of a single character. Instead of using `strchr(2)` to locate the first occurrence of the given delimiter character, `string_list_split_in_place_multi()` uses `strcspn(2)` to move past the initial segment of characters comprised of any characters in the delimiting set. When only a single delimiting character is provided, `strpbrk(2)` (which is implemented with `strcspn(2)`) has equivalent performance to `strchr(2)`. Modern `strcspn(2)` implementations treat an empty delimiter or the singleton delimiter as a special case and fall back to calling strchrnul(). Both glibc[1] and musl[2] implement `strcspn(2)` this way. This change is one step to removing `strtok(2)` from the tree. Note that `string_list_split_in_place()` is not a strict replacement for `strtok()`, since it will happily turn sequential delimiter characters into empty entries in the resulting string_list. For example: string_list_split_in_place(&xs, "foo:;:bar:;:baz", ":;", -1) would yield a string list of: ["foo", "", "", "bar", "", "", "baz"] Callers that wish to emulate the behavior of strtok(2) more directly should call `string_list_remove_empty_items()` after splitting. To avoid regressions for the new multi-character delimiter cases, update t0063 in this patch as well. [1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strcspn.c;hb=glibc-2.37#l35 [2]: https://git.musl-libc.org/cgit/musl/tree/src/string/strcspn.c?h=v1.2.3#n11 Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 16:01:28 -07:00
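The difference from `strtok()` described above (consecutive delimiters yield empty entries) can be reproduced with a small standalone splitter built on `strcspn()`. This sketch does not use Git's `string_list` API. `split_in_place()` is a hypothetical helper that writes token pointers into a caller-supplied array, and its `max` parameter caps the number of tokens stored, which is not the same as the limit argument of the real function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split `s` in place on any character found in `delims`, storing up
 * to `max` token pointers in `out`. Every delimiter ends a token, so
 * consecutive delimiters produce empty tokens (unlike strtok()).
 * Returns the number of tokens stored.
 */
size_t split_in_place(char *s, const char *delims, char **out, size_t max)
{
	size_t n = 0;

	assert(s && delims && out);
	while (n < max) {
		/* Length of the leading run with no delimiter in it. */
		size_t len = strcspn(s, delims);

		out[n++] = s;
		if (!s[len])
			break;		/* reached the end of the input */
		s[len] = '\0';		/* terminate this token */
		s += len + 1;		/* continue after the delimiter */
	}
	return n;
}
```

Called on a writable buffer holding "foo:;:bar:;:baz" with the delimiter set ":;", the sketch stores seven tokens: "foo", "", "", "bar", "", "", "baz", matching the list shown in the commit message. The input must be a modifiable array, since the function overwrites delimiters with NUL bytes.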
Elijah Newren	d4a4f9291d	commit.h: reduce unnecessary includes Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 12:47:33 -07:00


419 Commits (main)