Commit Graph

397 Commits (55547380384c77071b2fc8ff4cc570434a4793d1)

Author SHA1 Message Date
Junio C Hamano 1fbfabfa71 Merge branch 'pw/3.0-commentchar-auto-deprecation'
"core.commentChar=auto" that attempts to dynamically pick a
suitable comment character is non-workable, as it is too much
trouble to support for little benefit, and is marked as deprecated.

* pw/3.0-commentchar-auto-deprecation:
  commit: print advice when core.commentString=auto
  config: warn on core.commentString=auto
  breaking-changes: deprecate support for core.commentString=auto
2025-09-18 10:07:00 -07:00
Phillip Wood a0e6aaea7d config: warn on core.commentString=auto
As support for this setting was deprecated in the last commit print a
warning (or die when WITH_BREAKING_CHANGES is enabled) if it is set.
Avoid bombarding the user with warnings by only printing it (a) when
running commands that call "git commit" and (b) only once per command.

Some scaffolding is added to repo_read_config() to allow it to
detect deprecated config settings and warn about them. As both
"core.commentChar" and "core.commentString" set the comment
character we record which one of them is used and tailor the
warning message appropriately.

Note the odd combination of die_message() followed by die(NULL)
is to allow the next commit to insert a call to advise() in the middle.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-26 08:52:44 -07:00
Phillip Wood fdae4114a6 breaking-changes: deprecate support for core.commentString=auto
When "core.commentString" is set to "auto" then "git commit" will
automatically select the comment character ensuring that it is not the
first character on any of the lines in the commit message. This was
introduced by commit 84c9dc2c5a (commit: allow core.commentChar=auto
for character auto selection, 2014-05-17). The motivation seems to be
to avoid commenting out lines from the existing message when amending
a commit that was created with a message from a file.

Unfortunately this feature does not work with:

 * commit message templates that contain comments.

 * prepare-commit-msg hooks that introduce comments.

 * "git commit --cleanup=strip --edit -F <file>" which means that it
   is incompatible with

   - the "fixup" and "squash" commands of "git rebase -i" as the
     comments added by those commands are then treated as part of
     the commit message.

   - the conflict comments added to the commit message by "git
     cherry-pick", "git rebase" etc. as these comments are then
     treated as part of the commit message.

It is also ignored by "git notes" when amending a note.

The issues with comments coming from a template, hook or file are a
consequence of the design of this feature and are therefore hard to
fix.

As the costs of this feature outweigh the benefits, deprecate it and
remove it in Git 3.0. If someone comes up with some patches that fix
all the issues in a maintainable way then I'd be happy to see this
change reverted.

The next commits will add a warning and some advice for users on how
they can update their config settings.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-26 08:47:37 -07:00
Junio C Hamano 9d6e319ec5 Merge branch 'ac/deglobal-fmt-merge-log-config'
Code clean-up.

* ac/deglobal-fmt-merge-log-config:
  builtin/fmt-merge-msg: stop depending on 'the_repository'
  environment: remove the global variable 'merge_log_config'
2025-08-22 13:13:21 -07:00
Junio C Hamano 54fef16542 Merge branch 'jc/strbuf-split'
Arrays of strbuf is often a wrong data structure to use, and
strbuf_split*() family of functions that create them often have
better alternatives.

Update several code paths and replace strbuf_split*().

* jc/strbuf-split:
  trace2: do not use strbuf_split*()
  trace2: trim_trailing_newline followed by trim is a no-op
  sub-process: do not use strbuf_split*()
  environment: do not use strbuf_split*()
  config: do not use strbuf_split()
  notes: do not use strbuf_split*()
  merge-tree: do not use strbuf_split*()
  clean: do not use strbuf_split*() [part 2]
  clean: do not pass the whole structure when it is not necessary
  clean: do not use strbuf_split*() [part 1]
  clean: do not pass strbuf by value
  wt-status: avoid strbuf_split*()
2025-08-21 13:47:00 -07:00
Ayush Chandekar 9a49aef8dc environment: remove the global variable 'merge_log_config'
The global variable 'merge_log_config', set via the "merge.log" or
"merge.summary" settings, is only used in 'cmd_fmt_merge_msg()' and
'cmd_merge()' to adjust the 'shortlog_len' variable.

Remove 'merge_log_config' globally and localize it in
'cmd_fmt_merge_msg()' and 'cmd_merge()'. Set its value by passing it in
'fmt_merge_msg_config()' by passing its pointer to the function via the
callback parameter.

This change is part of an ongoing effort to eliminate global variables,
improve modularity and help libify the codebase.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11 09:16:55 -07:00
Junio C Hamano b894d4481f environment: do not use strbuf_split*()
environment.c:get_git_namespace() learns the raw namespace from an
environment variable, splits it at "/", and appends them after
"refs/namespaces/"; the reason why it splits first is so that an
empty string resulting from double slashes can be omitted.

The split pieces do not need to be edited in any way, so an array of
strbufs is a wrong data structure to use.  Instead split into a
string list and use the pieces from there.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-02 22:44:58 -07:00
Junio C Hamano 084681b1b0 Merge branch 'ps/config-wo-the-repository' into pw/3.0-commentchar-auto-deprecation
* ps/config-wo-the-repository: (21 commits)
  config: fix sign comparison warnings
  config: move Git config parsing into "environment.c"
  config: remove unused `the_repository` wrappers
  config: drop `git_config_set_multivar()` wrapper
  config: drop `git_config_get_multivar_gently()` wrapper
  config: drop `git_config_set_multivar_in_file_gently()` wrapper
  config: drop `git_config_set_in_file_gently()` wrapper
  config: drop `git_config_set()` wrapper
  config: drop `git_config_set_gently()` wrapper
  config: drop `git_config_set_in_file()` wrapper
  config: drop `git_config_get_bool()` wrapper
  config: drop `git_config_get_ulong()` wrapper
  config: drop `git_config_get_int()` wrapper
  config: drop `git_config_get_string()` wrapper
  config: drop `git_config_get_string()` wrapper
  config: drop `git_config_get_string_multi()` wrapper
  config: drop `git_config_get_value()` wrapper
  config: drop `git_config_get_value()` wrapper
  config: drop `git_config_get()` wrapper
  config: drop `git_config_clear()` wrapper
  ...
2025-07-31 12:28:51 -07:00
Patrick Steinhardt 08b775864e config: move Git config parsing into "environment.c"
In "config.c" we host both the business logic to read and write config
files as well as the logic to parse specific Git-related variables. On
the one hand this is mixing concerns, but even more importantly it means
that we cannot easily remove the dependency on `the_repository` in our
config parsing logic.

Move the logic into "environment.c". This file is a grab bag of all
kinds of global state already, so it is quite a good fit. Furthermore,
it also hosts most of the global variables that we're parsing the config
values into, making this an even better fit.

Note that there is one hidden change: in `parse_fsync_components()` we
use an `int` to iterate through `ARRAY_SIZE(fsync_component_names)`. But
as -Wsign-compare warnings are enabled in this file this causes a
compiler warning. The issue is fixed by using a `size_t` instead.

This change allows us to drop the `USE_THE_REPOSITORY_VARIABLE`
declaration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23 08:15:22 -07:00
Ayush Chandekar 44e300a974 repository: move 'repository_format_precious_objects' to repo scope
The 'extensions.preciousObjects' setting when set true, prevents
operations that might drop objects from the object storage. This setting
is populated in the global variable
'repository_format_precious_objects'.

Move this global variable to repo scope by adding it to 'struct
repository and also refactor all the occurences accordingly.

This change is part of an ongoing effort to eliminate global variables,
improve modularity and help libify the codebase.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07 08:31:13 -07:00
Ayush Chandekar b1d47b464e environment: remove the global variable 'core_preload_index'
The global variable 'core_preload_index' is used in a single function
named 'preload_index()' in "preload-index.c". Move its declaration inside
that function, removing unnecessary global state.

This change is part of an ongoing effort to eliminate global variables,
improve modularity and help libify the codebase.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-10 10:10:38 -07:00
Junio C Hamano b50795db79 Merge branch 'js/windows-arm64'
Update to arm64 Windows port.

* js/windows-arm64:
  max_tree_depth: lower it for clangarm64 on Windows
  mingw(arm64): do move the `/etc/git*` location
  msvc: do handle builds on Windows/ARM64
  mingw: do not use nedmalloc on Windows/ARM64
  config.mak.uname: add support for clangarm64
  bswap.h: add support for built-in bswap functions
2025-05-05 14:56:24 -07:00
Junio C Hamano dd45c2e48f Merge branch 'as/typofix-in-env-h-header'
Typofix.

* as/typofix-in-env-h-header:
  environment: fix typo: 'setup_git_directory_gently'
2025-04-29 14:21:27 -07:00
Johannes Schindelin 436a42215e max_tree_depth: lower it for clangarm64 on Windows
Just as in b64d78ad02 (max_tree_depth: lower it for MSVC to avoid
stack overflows, 2023-11-01), I encountered the same problem with the
clang builds on Windows/ARM64.

The symptom is an exit code 127 when t6700 tries to verify that `git
archive big` fails.

This exit code is reserved on Unix/Linux to mean "command not found".
Unfortunately in this case, it is the fall-back chosen by
Cygwin's `pinfo::status_exit()` method when encountering
the NSTATUS `STATUS_STACK_OVERFLOW`, see
https://github.com/cygwin/cygwin/blob/cygwin-3.6.1/winsup/cygwin/pinfo.cc#L171

I verified manually that the stack overflow always happens somewhere
around tree depth 1403, therefore 1280 should be a safe bound in these
instances.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-23 09:16:24 -07:00
Abhijeet Sonar ff4a749354 environment: fix typo: 'setup_git_directory_gently'
Above the declaration of git_work_tree_cfg, we have:

  /* This is set by setup_git_dir_gently() and/or git_default_config() */
  char *git_work_tree_cfg;

It can be verified that there is no function called
'setup_git_dir_gently' by running grep on the codebase:

  $ grep -R setup_git_dir_gently .
  ./environment.c:/* This is set by setup_git_dir_gently() and/or git_default_config() */

The comment, introduced in e90fdc39b6 (Clean up work-tree handling), is
the only occurrence of the name 'setup_git_dir_gently'.

It probably meant 'setup_git_directory_gently' as that is a name of a
real function in setup.c. Correct it.

Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-18 14:04:08 -07:00
Patrick Steinhardt 7835ee75cd environment: move access to "core.bigFileThreshold" into repo settings
The "core.bigFileThreshold" setting is stored in a global variable and
populated via `git_default_core_config()`. This may cause issues in
the case where one is handling multiple different repositories in a
single process with different values for that config key, as we may or
may not see the correct value in that case. Furthermore, global state
blocks our path towards libification.

Refactor the code so that we instead store the value in `struct
repo_settings`, where the value is computed as-needed and cached.

Note that this change requires us to adapt one test in t1050 that
verifies that we die when parsing an invalid "core.bigFileThreshold"
value. The exercised Git command doesn't use the value at all, and thus
it won't hit the new code path that parses the value. This is addressed
by using git-hash-object(1) instead, which does read the value.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10 13:16:18 -07:00
Junio C Hamano feffb34257 Merge branch 'ps/path-sans-the-repository'
The path.[ch] API takes an explicit repository parameter passed
throughout the callchain, instead of relying on the_repository
singleton instance.

* ps/path-sans-the-repository:
  path: adjust last remaining users of `the_repository`
  environment: move access to "core.sharedRepository" into repo settings
  environment: move access to "core.hooksPath" into repo settings
  repo-settings: introduce function to clear struct
  path: drop `git_path()` in favor of `repo_git_path()`
  rerere: let `rerere_path()` write paths into a caller-provided buffer
  path: drop `git_common_path()` in favor of `repo_common_path()`
  worktree: return allocated string from `get_worktree_git_dir()`
  path: drop `git_path_buf()` in favor of `repo_git_path_replace()`
  path: drop `git_pathdup()` in favor of `repo_git_path()`
  path: drop unused `strbuf_git_path()` function
  path: refactor `repo_submodule_path()` family of functions
  submodule: refactor `submodule_to_gitdir()` to accept a repo
  path: refactor `repo_worktree_path()` family of functions
  path: refactor `repo_git_path()` family of functions
  path: refactor `repo_common_path()` family of functions
2025-03-05 10:37:43 -08:00
Patrick Steinhardt f1ce861c34 environment: move access to "core.sharedRepository" into repo settings
Similar as with the preceding commit, we track "core.sharedRepository"
via a pair of global variables. Move them into `struct repo_settings` so
that we can instead track them per-repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28 13:54:11 -08:00
Patrick Steinhardt 6f3fbed8ed environment: move access to "core.hooksPath" into repo settings
The "core.hooksPath" setting is stored in a global variable and
populated via the `git_default_core_config`. This may cause issues in
the case where one is handling multiple different repositories in a
single process with different values for that config key, as we may or
may not see the correct value in that case. Furthermore, global state
blocks our path towards libification.

Refactor the code so that we instead store the value in `struct
repo_settings`. The value is computed as-needed and cached. The result
should be functionally the same as there aren't ever any code paths
where we'd execute hooks outside the context of a repository.

Note that this requires us to change the passed-in repository in the
`repo_git_path()` family of functions to be non-constant, as we call
`adjust_git_path()` there.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28 13:54:11 -08:00
Patrick Steinhardt 41f1a8435a git-compat-util: move include of "compat/zlib.h" into "git-zlib.h"
We include "compat/zlib.h" in "git-compat-util.h", which is
unnecessarily broad given that we only have a small handful of files
that use the zlib library. Move the header into "git-zlib.h" instead and
adapt users of zlib to include that header.

One exception is the reftable library, as we don't want to use the
Git-specific wrapper of zlib there, so we include "compat/zlib.h"
instead. Furthermore, we move the include into "reftable/system.h" so
that users of the library other than Git can wire up zlib themselves.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:22 -08:00
Karthik Nayak d284713bae config: make `packed_git_(limit|window_size)` non-global variables
The variables `packed_git_window_size` and `packed_git_limit` are global
config variables used in the `packfile.c` file. Since it is only used in
this file, let's change it from being a global config variable to a
local variable for the subsystem.

With this, we rid `packfile.c` from all global variable usage and this
means we can also remove the `USE_THE_REPOSITORY_VARIABLE` guard from
the file.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:55 +09:00
Karthik Nayak d6b2d21fbf config: make `delta_base_cache_limit` a non-global variable
The `delta_base_cache_limit` variable is a global config variable used
by multiple subsystems. Let's make this non-global, by adding this
variable independently to the subsystems where it is used.

First, add the setting to the `repo_settings` struct, this provides
access to the config in places where the repository is available. Use
this in `packfile.c`.

In `index-pack.c` we add it to the `pack_idx_option` struct and its
constructor. While the repository struct is available here, it may not
be set  because `git index-pack` can be used without a repository.

In `gc.c` add it to the `gc_config` struct and also the constructor
function. The gc functions currently do not have direct access to a
repository struct.

These changes are made to remove the usage of `delta_base_cache_limit`
as a global variable in `packfile.c`. This brings us one step closer to
removing the `USE_THE_REPOSITORY_VARIABLE` definition in `packfile.c`
which we complete in the next patch.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:55 +09:00
Patrick Steinhardt 1e7e4a111f environment: stop storing "core.notesRef" globally
Stop storing the "core.notesRef" config value globally. Instead,
retrieve the value in `default_notes_ref()`. The code is never called in
a hot loop anyway, so doing this on every invocation should be perfectly
fine.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:44 -07:00
Patrick Steinhardt 11dbb4ace3 environment: stop storing "core.warnAmbiguousRefs" globally
Same as the preceding commits, storing the "core.warnAmbiguousRefs"
value globally is misdesigned as this setting may be set per repository.

Move the logic into the repo-settings subsystem. The usual pattern here
is that users are expected to call `prepare_repo_settings()` before they
access the settings themselves. This seems somewhat fragile though, as
it is easy to miss and leads to somewhat ugly code patterns at the call
sites.

Instead, introduce a new function that encapsulates this logic for us.
This also allows us to change how exactly the lazy initialization works
in the future, e.g. by only partially initializing values as requested
by the caller.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:44 -07:00
Patrick Steinhardt 8e2e8a33f3 environment: stop storing "core.preferSymlinkRefs" globally
Same as the preceding commit, storing the "core.preferSymlinkRefs" value
globally is misdesigned as this setting may be set per repository.

There is only a single user of this value anyway, namely the "files"
backend. So let's just remove the global variable and read the value of
this setting when initializing the backend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:43 -07:00
Patrick Steinhardt eafb126456 environment: stop storing "core.logAllRefUpdates" globally
The value of "core.logAllRefUpdates" is being stored in the global
variable `log_all_ref_updates`. This design is somewhat aged nowadays,
where it is entirely possible to access multiple repositories in the
same process which all have different values for this setting. So using
a single global variable to track it is plain wrong.

Remove the global variable. Instead, we now provide a new function part
of the repo-settings subsystem that parses the value for a specific
repository. While that may require us to read the value multiple times,
we work around this by reading it once when the ref backends are set up
and caching the value there.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:43 -07:00
Patrick Steinhardt a52beae3a3 environment: move `set_git_dir()` and related into setup layer
The functions `set_git_dir()` and friends are used to set up
repositories. As such, they are quite clearly part of the setup
subsystem, but still live in "environment.c". Move them over, which also
helps to get rid of dependencies on `the_repository` in the environment
subsystem.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:41 -07:00
Patrick Steinhardt c22d183b01 environment: make `get_git_namespace()` self-contained
The logic to set up and retrieve `git_namespace` is distributed across
different functions which communicate with each other via a global
environment variable. This is rather pointless though, as the value is
always derived from an environment variable, and this environment
variable does not change after we have parsed global options.

Convert the function to be fully self-contained such that it lazily
populates once called.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:41 -07:00
Patrick Steinhardt 26b4df907b environment: move object database functions into object layer
The `odb_mkstemp()` and `odb_pack_keep()` functions are quite clearly
tied to the object store, but regardless of that they are located in
"environment.c". Move them over, which also helps to get rid of
dependencies on `the_repository` in the environment subsystem.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:40 -07:00
Patrick Steinhardt edc2c92624 environment: make `get_git_work_tree()` accept a repository
The `get_git_work_tree()` function retrieves the path of the work tree
of `the_repository`. Make it accept a `struct repository` such that it
can work on arbitrary repositories and make it part of the repository
subsystem. This reduces our reliance on `the_repository` and clarifies
scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:40 -07:00
Patrick Steinhardt 14c90ac088 environment: make `get_graft_file()` accept a repository
The `get_graft_file()` function retrieves the path to the graft file of
`the_repository`. Make it accept a `struct repository` such that it can
work on arbitrary repositories and make it part of the repository
subsystem. This reduces our reliance on `the_repository` and clarifies
scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:40 -07:00
Patrick Steinhardt 1dc4ec2102 environment: make `get_index_file()` accept a repository
The `get_index_file()` function retrieves the path to the index file
of `the_repository`. Make it accept a `struct repository` such that it
can work on arbitrary repositories and make it part of the repository
subsystem. This reduces our reliance on `the_repository` and clarifies
scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:39 -07:00
Patrick Steinhardt a3673f4898 environment: make `get_object_directory()` accept a repository
The `get_object_directory()` function retrieves the path to the object
directory for `the_repository`. Make it accept a `struct repository`
such that it can work on arbitrary repositories and make it part of the
repository subsystem. This reduces our reliance on `the_repository` and
clarifies scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:39 -07:00
Patrick Steinhardt 661624a4f6 environment: make `get_git_common_dir()` accept a repository
The `get_git_common_dir()` function retrieves the path to the common
directory for `the_repository`. Make it accept a `struct repository`
such that it can work on arbitrary repositories and make it part of the
repository subsystem. This reduces our reliance on `the_repository` and
clarifies scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:39 -07:00
Patrick Steinhardt 246deeac95 environment: make `get_git_dir()` accept a repository
The `get_git_dir()` function retrieves the path to the Git directory for
`the_repository`. Make it accept a `struct repository` such that it can
work on arbitrary repositories and make it part of the repository
subsystem. This reduces our reliance on `the_repository` and clarifies
scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12 10:15:39 -07:00
Patrick Steinhardt 648abbe22d config: fix leaking comment character config
When the comment line character has been specified multiple times in the
configuration, then `git_default_core_config()` will cause a memory leak
because it unconditionally copies the string into `comment_line_str`
without free'ing the previous value. In fact, it can't easily free the
value in the first place because it may contain a string constant.

Refactor the code such that we track allocated comment character strings
via a separate non-constant variable `comment_line_str_to_free`. Adapt
sites that set `comment_line_str` to set both and free the old value
that was stored in `comment_line_str_to_free`.

This memory leak is being hit in t3404. As there are still other memory
leaks in that file we cannot yet mark it as passing with leak checking
enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:58 -07:00
Patrick Steinhardt e7da938570 global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.

It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.

Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit

For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).

Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
Patrick Steinhardt 1b261c20ed config: clarify memory ownership in `git_config_string()`
The out parameter of `git_config_string()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:00 -07:00
Patrick Steinhardt a6cb0cc610 convert: refactor code to clarify ownership of check_roundtrip_encoding
The `check_roundtrip_encoding` variable is tracked in a `const char *`
even though it may contain allocated strings at times. The result is
that those strings may be leaking because we never free them.

Refactor the code to always store allocated strings in this variable.
The default value is handled in `check_roundtrip()` now, which is the
only user of the variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:59 -07:00
Patrick Steinhardt 6073b3b5c3 config: clarify memory ownership in `git_config_pathname()`
The out parameter of `git_config_pathname()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:59 -07:00
Junio C Hamano dce1e0b6da Merge branch 'jk/core-comment-string'
core.commentChar used to be limited to a single byte, but has been
updated to allow an arbitrary multi-byte sequence.

* jk/core-comment-string:
  config: add core.commentString
  config: allow multi-byte core.commentChar
  environment: drop comment_line_char compatibility macro
  wt-status: drop custom comment-char stringification
  sequencer: handle multi-byte comment characters when writing todo list
  find multi-byte comment chars in unterminated buffers
  find multi-byte comment chars in NUL-terminated strings
  prefer comment_line_str to comment_line_char for printing
  strbuf: accept a comment string for strbuf_add_commented_lines()
  strbuf: accept a comment string for strbuf_commented_addf()
  strbuf: accept a comment string for strbuf_stripspace()
  environment: store comment_line_char as a string
  strbuf: avoid shadowing global comment_line_char name
  commit: refactor base-case of adjust_comment_line_char()
  strbuf: avoid static variables in strbuf_add_commented_lines()
  strbuf: simplify comment-handling in add_lines() helper
  config: forbid newline as core.commentChar
2024-04-05 10:49:49 -07:00
Jeff King 72a7d5d97f environment: store comment_line_char as a string
We'd like to eventually support multi-byte comment prefixes, but the
comment_line_char variable is referenced in many spots, making the
transition difficult.

Let's start by storing the character in a NUL-terminated string. That
will let us switch code over incrementally to the string format, and we
can easily support the existing code with a macro wrapper (since we'll
continue to allow only a single-byte prefix, this will behave
identically).

Once all references to the "char" variable have been converted, we can
drop it and enable longer strings.

We'll still have to touch all of the spots that create or set the
variable in this patch, but there are only a few (reading the config,
and the "auto" character selector).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:10 -07:00
Junio C Hamano 2c206fc82a Merge branch 'jc/no-lazy-fetch'
"git --no-lazy-fetch cmd" allows to run "cmd" while disabling lazy
fetching of objects from the promisor remote, which may be handy
for debugging.

* jc/no-lazy-fetch:
  git: extend --no-lazy-fetch to work across subprocesses
  git: document GIT_NO_REPLACE_OBJECTS environment variable
  git: --no-lazy-fetch option
2024-03-07 15:59:40 -08:00
Junio C Hamano e6d5479e7a git: extend --no-lazy-fetch to work across subprocesses
Modeling after how the `--no-replace-objects` option is made usable
across subprocess spawning (e.g., cURL based remote helpers are
spawned as a separate process while running "git fetch"), allow the
`--no-lazy-fetch` option to be passed across process boundaries.

Do not model how the value of GIT_NO_REPLACE_OBJECTS environment
variable is ignored, though.  Just use the usual git_env_bool() to
allow "export GIT_NO_LAZY_FETCH=0" and "unset GIT_NO_LAZY_FETCH"
to be equivalents.

Also do not model how the request is not propagated to subprocesses
we spawn (e.g. "git clone --local" that spawns a new process to work
in the origin repository, while the original one working in the
newly created one) by the "--no-replace-objects" option, as this "do
not lazily fetch from the promisor" is more about a per-request
debugging aid, not "this repository's promisor should not be relied
upon" property specific to a repository.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 09:53:14 -08:00
Jeff King be6bc048d7 config: use git_config_string() for core.checkRoundTripEncoding
Since this code path was recently converted to check for a NULL value,
it now behaves exactly like git_config_string(). We can shorten the code
a bit by using that helper.

Note that git_config_string() takes a const pointer, but our storage
variable is non-const. We're better off making this "const", though,
since the default value points to a string literal (and thus it would be
an error if anybody tried to write to it).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:26:22 +09:00
Johannes Schindelin b64d78ad02 max_tree_depth: lower it for MSVC to avoid stack overflows
There seems to be some internal stack overflow detection in MSVC's
`malloc()` machinery that seems to be independent of the `stack reserve`
and `heap reserve` sizes specified in the executable (editable via
`EDITBIN /STACK:<n> <exe>` and `EDITBIN /HEAP:<n> <exe>`).

In the newly test cases added by `jk/tree-name-and-depth-limit`, this
stack overflow detection is unfortunately triggered before Git can print
out the error message about too-deep trees and exit gracefully. Instead,
it exits with `STATUS_STACK_OVERFLOW`. This corresponds to the numeric
value -1073741571, something the MSYS2 runtime we sadly need to use to
run Git's test suite cannot handle and which it internally maps to the
exit code 127. Git's test suite, in turn, mistakes this to mean that the
command was not found, and fails both test cases.

Here is an example stack trace from an example run:

    [0x0]   ntdll!RtlpAllocateHeap+0x31   0x4212603f50   0x7ff9d6d4cd49
    [0x1]   ntdll!RtlpAllocateHeapInternal+0x6c9   0x42126041b0   0x7ff9d6e14512
    [0x2]   ntdll!RtlDebugAllocateHeap+0x102   0x42126042b0   0x7ff9d6dcd8b0
    [0x3]   ntdll!RtlpAllocateHeap+0x7ec70   0x4212604350   0x7ff9d6d4cd49
    [0x4]   ntdll!RtlpAllocateHeapInternal+0x6c9   0x42126045b0   0x7ff9596ed480
    [0x5]   ucrtbased!heap_alloc_dbg_internal+0x210   0x42126046b0   0x7ff9596ed20d
    [0x6]   ucrtbased!heap_alloc_dbg+0x4d   0x4212604750   0x7ff9596f037f
    [0x7]   ucrtbased!_malloc_dbg+0x2f   0x42126047a0   0x7ff9596f0dee
    [0x8]   ucrtbased!malloc+0x1e   0x42126047d0   0x7ff730fcc1ef
    [0x9]   git!do_xmalloc+0x2f   0x4212604800   0x7ff730fcc2b9
    [0xa]   git!do_xmallocz+0x59   0x4212604840   0x7ff730fca779
    [0xb]   git!xmallocz_gently+0x19   0x4212604880   0x7ff7311b0883
    [0xc]   git!unpack_compressed_entry+0x43   0x42126048b0   0x7ff7311ac9a4
    [0xd]   git!unpack_entry+0x554   0x42126049a0   0x7ff7311b0628
    [0xe]   git!cache_or_unpack_entry+0x58   0x4212605250   0x7ff7311ad3a8
    [0xf]   git!packed_object_info+0x98   0x42126052a0   0x7ff7310a92da
    [0x10]   git!do_oid_object_info_extended+0x3fa   0x42126053b0   0x7ff7310a44e7
    [0x11]   git!oid_object_info_extended+0x37   0x4212605460   0x7ff7310a38ba
    [0x12]   git!repo_read_object_file+0x9a   0x42126054a0   0x7ff7310a6147
    [0x13]   git!read_object_with_reference+0x97   0x4212605560   0x7ff7310b4656
    [0x14]   git!fill_tree_descriptor+0x66   0x4212605620   0x7ff7310dc0a5
    [0x15]   git!traverse_trees_recursive+0x3f5   0x4212605680   0x7ff7310dd831
    [0x16]   git!unpack_callback+0x441   0x4212605790   0x7ff7310b4c95
    [0x17]   git!traverse_trees+0x5d5   0x42126058a0   0x7ff7310dc0f2
    [0x18]   git!traverse_trees_recursive+0x442   0x4212605980   0x7ff7310dd831
    [0x19]   git!unpack_callback+0x441   0x4212605a90   0x7ff7310b4c95
    [0x1a]   git!traverse_trees+0x5d5   0x4212605ba0   0x7ff7310dc0f2
    [0x1b]   git!traverse_trees_recursive+0x442   0x4212605c80   0x7ff7310dd831
    [0x1c]   git!unpack_callback+0x441   0x4212605d90   0x7ff7310b4c95
    [0x1d]   git!traverse_trees+0x5d5   0x4212605ea0   0x7ff7310dc0f2
    [0x1e]   git!traverse_trees_recursive+0x442   0x4212605f80   0x7ff7310dd831
    [0x1f]   git!unpack_callback+0x441   0x4212606090   0x7ff7310b4c95
    [0x20]   git!traverse_trees+0x5d5   0x42126061a0   0x7ff7310dc0f2
    [0x21]   git!traverse_trees_recursive+0x442   0x4212606280   0x7ff7310dd831
    [...]
    [0xfad]   git!cmd_main+0x2a2   0x42126ff740   0x7ff730fb6345
    [0xfae]   git!main+0xe5   0x42126ff7c0   0x7ff730fbff93
    [0xfaf]   git!wmain+0x2a3   0x42126ff830   0x7ff731318859
    [0xfb0]   git!invoke_main+0x39   0x42126ff8a0   0x7ff7313186fe
    [0xfb1]   git!__scrt_common_main_seh+0x12e   0x42126ff8f0   0x7ff7313185be
    [0xfb2]   git!__scrt_common_main+0xe   0x42126ff960   0x7ff7313188ee
    [0xfb3]   git!wmainCRTStartup+0xe   0x42126ff990   0x7ff9d5ed257d
    [0xfb4]   KERNEL32!BaseThreadInitThunk+0x1d   0x42126ff9c0   0x7ff9d6d6aa78
    [0xfb5]   ntdll!RtlUserThreadStart+0x28   0x42126ff9f0   0x0

I verified manually that `traverse_trees_cur_depth` was 562 when that
happened, which is far below the 2048 that were already accepted into
Git as a hard limit.

Despite many attempts to figure out which of the internals trigger this
`STATUS_STACK_OVERFLOW` and how to maybe increase certain sizes to avoid
running into this issue and let Git behave the same way as under Linux,
I failed to find any build-time/runtime knob we could turn to that
effect.

Note: even switching to using a different allocator (I used mimalloc
because that's what Git for Windows uses for its GCC builds) does not
help, as the zlib code used to unpack compressed pack entries _still_
uses the regular `malloc()`. And runs into the same issue.

Note also: switching to using a different allocator _also_ for zlib code
seems _also_ not to help. I tried that, and it still exited with
`STATUS_STACK_OVERFLOW` that seems to have been triggered by a
`mi_assert_internal()`, i.e. an internal assertion of mimalloc...

So the best bet to work around this for now seems to just lower the
maximum allowed tree depth _even further_ for MSVC builds.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02 08:58:28 +09:00
Jeff King 4d5693ba05 lower core.maxTreeDepth default to 2048
On my Linux system, all of our recursive tree walking algorithms can run
up to the 4096 default limit without segfaulting. But not all platforms
will have stack sizes as generous (nor might even Linux if we kick off a
recursive walk within a thread).

In particular, several of the tests added in the previous few commits
fail in our Windows CI environment. Through some guess-and-check
pushing, I found that 3072 is still too much, but 2048 is OK.

These are obviously vague heuristics, and there is nothing to promise
that another system might not have trouble at even lower values. But it
seems unlikely anybody will be too angry about a 2048-depth limit (this
is close to the default max-pathname limit on Linux even for a
pathological path like "a/a/a/..."). So let's just lower it.

Some alternatives are:

  - configure separate defaults for Windows versus other platforms.

  - just skip the tests on Windows. This leaves Windows users with the
    annoying case that they can be crashed by running out of stack
    space, but there shouldn't be any security implications (they can't
    go deep enough to hit integer overflow problems).

Since the original default was arbitrary, it seems less confusing to
just lower it, keeping behavior consistent across platforms.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:08 -07:00
Jeff King be20128bfa add core.maxTreeDepth config
Most of our tree traversal algorithms use recursion to visit sub-trees.
For pathologically large trees, this can cause us to run out of stack
space and abort in an uncontrolled way. Let's put our own limit here so
that we can fail gracefully rather than segfaulting.

In similar cases where we recursed along the commit graph, we rewrote
the algorithms to avoid recursion and keep any stack data on the heap.
But the commit graph is meant to grow without bound, whereas it's not an
imposition to put a limit on the maximum size of tree we'll handle.

And this has a bonus side effect: coupled with a limit on individual
tree entry names, this limits the total size of a path we may encounter.
This gives us an extra protection against code handling long path names
which may suffer from integer overflows in the size (which could then be
exploited by malicious trees).

The default of 4096 is set to be much longer than anybody would care
about in the real world. Even with single-letter interior tree names
(like "a/b/c"), such a path is at least 8191 bytes. While most operating
systems will let you create such a path incrementally, trying to
reference the whole thing in a system call (as Git would do when
actually trying to access it) will result in ENAMETOOLONG. Coupled with
the recent fsck.largePathname warning, the maximum total pathname Git
will handle is (by default) 16MB.

This config option doesn't do anything yet; future patches will convert
various algorithms to respect the limit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:07 -07:00
Junio C Hamano ddcb8fd8b9 Merge branch 'rs/pack-objects-parseopt-fix'
Command line parser fix.

* rs/pack-objects-parseopt-fix:
  pack-objects: fix --no-quiet
  pack-objects: fix --no-keep-true-parents
2023-07-28 09:45:22 -07:00
René Scharfe 3a5f308741 pack-objects: fix --no-keep-true-parents
Since 99fb6e04cb (pack-objects: convert to use parse_options(),
2012-02-01) git pack-objects has accepted --no-keep-true-parents, but
this option does the same as --keep-true-parents.  That's because it's
defined using OPT_SET_INT with a value of 0, which sets 0 when negated
as well.

Turn --no-keep-true-parents into the opposite of --keep-true-parents by
using OPT_BOOL and storing the option's status directly in a variable
named "grafts_keep_true_parents" instead of in negative form in
"grafts_replace_parents".

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-21 10:02:59 -07:00