Commit Graph

12420 Commits (v2.48.0-rc0)

Author SHA1 Message Date
Patrick Steinhardt 51a0b8a2a7 t7900: exercise detaching via trace2 regions
In t7900, we exercise the `--detach` logic by checking whether the
command ended up writing anything to its output or not. This supposedly
works because we close stdin, stdout and stderr when daemonizing. But
one, it breaks on platforms where daemonize is a no-op, like Windows.
And second, that git-maintenance(1) outputs anything at all in these
tests is a bug in the first place that we'll fix in a subsequent commit.

Introduce a new trace2 region around the detach which allows us to more
explicitly check whether the detaching logic was executed. This is a
much more direct way to exercise the logic, provides a potentially
useful signal to tracing logs and also works alright on platforms which
do not have the ability to daemonize.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
[jc: dropped a stale in-code comment from a test]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 11:33:02 -07:00
Junio C Hamano 2df380c280 Merge branch 'ps/leakfixes-part-4' into ps/leakfixes-part-5
* ps/leakfixes-part-4: (22 commits)
  builtin/diff: free symmetric diff members
  diff: free state populated via options
  builtin/log: fix leak when showing converted blob contents
  userdiff: fix leaking memory for configured diff drivers
  builtin/format-patch: fix various trivial memory leaks
  diff: fix leak when parsing invalid ignore regex option
  unpack-trees: clear index when not propagating it
  sequencer: release todo list on error paths
  merge-ort: unconditionally release attributes index
  builtin/fast-export: plug leaking tag names
  builtin/fast-export: fix leaking diff options
  builtin/fast-import: plug trivial memory leaks
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/rebase: fix leaking `commit.gpgsign` value
  config: fix leaking comment character config
  submodule-config: fix leaking name entry when traversing submodules
  read-cache: fix leaking hashfile when writing index fails
  bulk-checkin: fix leaking state TODO
  object-name: fix leaking symlink paths in object context
  object-file: fix memory leak when reading corrupted headers
  ...
2024-08-20 10:15:27 -07:00
Junio C Hamano b9497848df Merge branch 'tb/incremental-midx-part-1'
Incremental updates of multi-pack index files.

* tb/incremental-midx-part-1:
  midx: implement support for writing incremental MIDX chains
  t/t5313-pack-bounds-checks.sh: prepare for sub-directories
  t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
  midx: implement verification support for incremental MIDXs
  midx: support reading incremental MIDX chains
  midx: teach `midx_fanout_add_midx_fanout()` about incremental MIDXs
  midx: teach `midx_preferred_pack()` about incremental MIDXs
  midx: teach `midx_contains_pack()` about incremental MIDXs
  midx: remove unused `midx_locate_pack()`
  midx: teach `fill_midx_entry()` about incremental MIDXs
  midx: teach `nth_midxed_offset()` about incremental MIDXs
  midx: teach `bsearch_midx()` about incremental MIDXs
  midx: introduce `bsearch_one_midx()`
  midx: teach `nth_bitmapped_pack()` about incremental MIDXs
  midx: teach `nth_midxed_object_oid()` about incremental MIDXs
  midx: teach `prepare_midx_pack()` about incremental MIDXs
  midx: teach `nth_midxed_pack_int_id()` about incremental MIDXs
  midx: add new fields for incremental MIDX chains
  Documentation: describe incremental MIDX format
2024-08-19 11:07:37 -07:00
Jeff King 9dc1e748ef update-ref: mark more unused parameters in parser callbacks
This is a continuation of 44ad082968 (update-ref: mark unused parameter
in parser callbacks, 2023-08-29), as we've grown a few more virtual
functions since then.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:10 -07:00
Junio C Hamano b3d175409d Merge branch 'sj/ref-fsck'
"git fsck" infrastructure has been taught to also check the sanity
of the ref database, in addition to the object database.

* sj/ref-fsck:
  fsck: add ref name check for files backend
  files-backend: add unified interface for refs scanning
  builtin/refs: add verify subcommand
  refs: set up ref consistency check infrastructure
  fsck: add refs report function
  fsck: add a unified interface for reporting fsck messages
  fsck: make "fsck_error" callback generic
  fsck: rename objects-related fsck error functions
  fsck: rename "skiplist" to "skip_oids"
2024-08-16 12:51:51 -07:00
Junio C Hamano c09721cb63 Merge branch 'dd/notes-empty-no-edit-by-default' into maint-2.46
"git notes add -m '' --allow-empty" and friends that take prepared
data to create notes should not invoke an editor, but it started
doing so since Git 2.42, which has been corrected.

* dd/notes-empty-no-edit-by-default:
  notes: do not trigger editor when adding an empty note
2024-08-16 12:50:55 -07:00
Junio C Hamano b74d885b11 Merge branch 'tn/doc-commit-fix' into maint-2.46
Docfix.

* tn/doc-commit-fix:
  doc: remove dangling closing parenthesis
2024-08-16 12:50:54 -07:00
Junio C Hamano bb250b5378 Merge branch 'jc/checkout-no-op-switch-errors' into maint-2.46
"git checkout --ours" (no other arguments) complained that the
option is incompatible with branch switching, which is technically
correct, but found confusing by some users.  It now says that the
user needs to give pathspec to specify what paths to checkout.

* jc/checkout-no-op-switch-errors:
  checkout: special case error messages during noop switching
2024-08-16 12:50:51 -07:00
Patrick Steinhardt e3209bd4df builtin/stash: fix `--keep-index --include-untracked` with empty HEAD
It was reported that creating a stash with `--keep-index
--include-untracked` causes an error when HEAD points to a commit whose
tree is empty:

    $ git stash push --keep-index --include-untracked
    error: pathspec ':/' did not match any file(s) known to git

This error comes from `git checkout --no-overlay $i_tree -- :/`, which
we execute to reset the working tree to the state in our index. As the
tree generated from the index is empty in our case, ':/' does not match
any files and thus causes git-checkout(1) to error out.

Fix the issue by skipping the checkout when the index tree is empty. As
explained in the in-code comment, this should be the correct thing to do
as there is nothing that we'd have to reset in the first place.

Reported-by: Piotr Siupa <piotrsiupa@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:50:33 -07:00
Patrick Steinhardt 98077d06b2 run-command: fix detaching when running auto maintenance
In the past, we used to execute `git gc --auto` as part of our automatic
housekeeping routines. As git-gc(1) may require quite some time to
perform the housekeeping, it knows to detach itself and run in the
background so that the user can continue their work.

Eventually, we refactored our automatic housekeeping to instead use the
more flexible git-maintenance(1) command. The upside of this new infra
is that the user can configure which maintenance tasks are performed, at
least to a certain degree. So while it continues to run git-gc(1) by
default, it can also be adapted to e.g. use git-multi-pack-index(1) for
maintenance of the object database.

The auto-detach of the new infra is somewhat broken though once the user
configures non-standard tasks. The problem is essentially that we detach
at the wrong level in the process hierarchy: git-maintenance(1) never
detaches itself, but instead it continues to be git-gc(1) which does.

When configured to only run the git-gc(1) maintenance task, then the
result is basically the same as before. But when configured to run other
tasks, then git-maintenance(1) will wait for these to run to completion.
Even worse, it may be that git-gc(1) runs concurrently with other
housekeeping tasks, stomping on each others feet.

Fix this bug by asking git-gc(1) to not detach when it is being invoked
via git-maintenance(1). Instead, git-maintenance(1) now respects a new
config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
detaches itself into the background when running as part of our auto
maintenance. This should continue to behave the same for all users which
use the git-gc(1) task, only. For others though, it means that we now
properly perform all tasks in the background. The default behaviour of
git-maintenance(1) when executed by the user does not change, it will
remain in the foreground unless they pass the `--detach` option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:26 -07:00
Patrick Steinhardt a6affd3343 builtin/maintenance: add a `--detach` flag
Same as the preceding commit, add a `--[no-]detach` flag to the
git-maintenance(1) command. This will be used in a subsequent commit to
fix backgrounding of that command when configured with a non-standard
set of tasks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:26 -07:00
Patrick Steinhardt c7185df01b builtin/gc: add a `--detach` flag
When running `git gc --auto`, the command will by default detach and
continue running in the background. This behaviour can be tweaked via
the `gc.autoDetach` config, but not via a command line switch. We need
that in a subsequent commit though, where git-maintenance(1) will want
to ask its git-gc(1) child process to not detach anymore.

Add a `--[no-]detach` flag that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
Patrick Steinhardt 9b6b994f90 builtin/gc: stop processing log file on signal
When detaching, git-gc(1) will redirect its stderr to a "gc.log" log
file, which is then used to surface errors of a backgrounded process to
the user. To ensure that the file is properly managed on abnormal exit
paths, we install both signal and exit handlers that try to either
commit the underlying lock file or roll it back in case there wasn't any
error.

This logic is severly broken when handling signals though, as we end up
calling all kinds of functions that are not signal safe. This includes
malloc(3P) via `git_path()`, fprintf(3P), fflush(3P) and many more
functions. The consequence can be anything, from deadlocks to crashes.
Unfortunately, we cannot really do much about this without a larger
refactoring.

The least-worst thing we can do is to not set up the signal handler in
the first place. This will still cause us to remove the lockfile, as the
underlying tempfile subsystem already knows to unlink locks when
receiving a signal. But it may cause us to remove the lock even in the
case where it would have contained actual errors, which is a change in
behaviour.

The consequence is that "gc.log" will not be committed, and thus
subsequent calls to `git gc --auto` won't bail out because of this.
Arguably though, it is better to retry garbage collection rather than
having the process run into a potentially-corrupted state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
Patrick Steinhardt 0ce44e2293 builtin/gc: fix leaking config values
We're leaking config values in git-gc(1) when those values are tracked
as strings. Introduce a new `gc_config_release()` function that releases
this memory to plug those leaks and release old values before populating
the config fields via `git_config_string()` et al.

Note that there is one small gotcha here with the "--prune" option. Next
to passing a string, this option also accepts the "--no-prune" option
that overrides the default or configured value. We thus need to discern
between the option not having been passed by the user and the negative
variant of it. This is done by using a simple sentinel value that lets
us discern these cases.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
Patrick Steinhardt d1ae15d68b builtin/gc: refactor to read config into structure
The git-gc(1) command knows to read a bunch of config keys to tweak its
own behaviour. The values are parsed into global variables, which makes
it hard to correctly manage the lifecycle of values that may require a
memory allocation.

Refactor the code to use a `struct gc_config` that gets populated and
passed around. For one, this makes previously-implicit dependencies on
these config values clear. Second, it will allow us to properly manage
the lifecycle in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
Patrick Steinhardt a70a9bf6ee config: fix constness of out parameter for `git_config_get_expiry()`
The type of the out parameter of `git_config_get_expiry()` is a pointer
to a constant string, which creates the impression that ownership of the
returned data wasn't transferred to the caller. This isn't true though
and thus quite misleading.

Adapt the parameter to be of type `char **` and adjust callers
accordingly. While at it, refactor `get_shared_index_expire_date()` to
drop the static `shared_index_expire` variable. It is only used in that
function, and furthermore we would only hit the code where we parse the
expiry date a single time because we already use a static `prepared`
variable to track whether we did parse it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:24 -07:00
Junio C Hamano 0da7673a51 Merge branch 'xx/diff-tree-remerge-diff-fix'
"git rev-list ... | git diff-tree -p --remerge-diff --stdin" should
behave more or less like "git log -p --remerge-diff" but instead it
crashed, forgetting to prepare a temporary object store needed.

* xx/diff-tree-remerge-diff-fix:
  diff-tree: fix crash when used with --remerge-diff
2024-08-15 13:22:16 -07:00
Junio C Hamano e7f86cb69d Merge branch 'jc/refs-symref-referent'
The refs API has been taught to give symref target information to
the users of ref iterators, allowing for-each-ref and friends to
avoid an extra ref_resolve_* API call per a symbolic ref.

* jc/refs-symref-referent:
  ref-filter: populate symref from iterator
  refs: add referent to each_ref_fn
  refs: keep track of unresolved reference value in iterators
2024-08-15 13:22:15 -07:00
Junio C Hamano 88457a6151 Merge branch 'ps/submodule-ref-format'
Support to specify ref backend for submodules has been enhanced.

* ps/submodule-ref-format:
  object: fix leaking packfiles when closing object store
  submodule: fix leaking seen submodule names
  submodule: fix leaking fetch tasks
  builtin/submodule: allow "add" to use different ref storage format
  refs: fix ref storage format for submodule ref stores
  builtin/clone: propagate ref storage format to submodules
  builtin/submodule: allow cloning with different ref storage format
  git-submodule.sh: break overly long command lines
2024-08-15 13:22:14 -07:00
Taylor Blau 11a08e8332 pack-bitmap: drop redundant args from `bitmap_writer_finish()`
In a similar fashion as the previous commit, drop a redundant argument
from the `bitmap_writer_finish()` function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:23:15 -07:00
Taylor Blau f00dda4849 pack-bitmap: drop redundant args from `bitmap_writer_build()`
In a similar fashion as the previous commit, drop a redundant argument
from the `bitmap_writer_build()` function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:22:27 -07:00
Taylor Blau 125ee4ae80 pack-bitmap: drop redundant args from `bitmap_writer_build_type_index()`
The previous commit ensures that the bitmap_writer's "to_pack" field is
initialized early on, so the "to_pack" and "index_nr" arguments to
`bitmap_writer_build_type_index()` are redundant.

Drop them and adjust the callers accordingly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:20:24 -07:00
Taylor Blau 01e9d12939 pack-bitmap: initialize `bitmap_writer_init()` with packing_data
In order to determine its object order, the pack-bitmap machinery keeps
a 'struct packing_data' corresponding to the pack or pseudo-pack (when
writing a MIDX bitmap) being written.

The to_pack field is provided to the bitmap machinery by callers of
bitmap_writer_build() and assigned to the bitmap_writer struct at that
point.

But a subsequent commit will want to have access to that data earlier on
during commit selection. Prepare for that by adding a 'to_pack' argument
to 'bitmap_writer_init()', and initializing the field during that
function.

Subsequent commits will clean up other functions which take
now-redundant arguments (like nr_objects, which is equivalent to
pdata->objects_nr, or pdata itself).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:18:04 -07:00
Junio C Hamano 61fd5de05f Merge branch 'kl/test-fixes'
A flakey test and incorrect calls to strtoX() functions have been
fixed.

* kl/test-fixes:
  t6421: fix test to work when repo dir contains d0
  set errno=0 before strtoX calls
2024-08-14 14:54:55 -07:00
Junio C Hamano 44773b9f70 Merge branch 'jc/patch-id'
The patch parser in "git patch-id" has been tightened to avoid
getting confused by lines that look like a patch header in the log
message.

* jc/patch-id:
  patch-id: tighten code to detect the patch header
  patch-id: rewrite code that detects the beginning of a patch
  patch-id: make get_one_patchid() more extensible
  patch-id: call flush_current_id() only when needed
  t4204: patch-id supports various input format
2024-08-14 14:54:53 -07:00
Junio C Hamano 760348212b Merge branch 'ps/ls-remote-out-of-repo-fix'
A recent update broke "git ls-remote" used outside a repository,
which has been corrected.

* ps/ls-remote-out-of-repo-fix:
  builtin/ls-remote: fall back to SHA1 outside of a repo
2024-08-14 14:54:49 -07:00
Junio C Hamano 4385f8a52d Merge branch 'ps/leakfixes-part-3'
More leakfixes.

* ps/leakfixes-part-3: (24 commits)
  commit-reach: fix trivial memory leak when computing reachability
  convert: fix leaking config strings
  entry: fix leaking pathnames during delayed checkout
  object-name: fix leaking commit list items
  t/test-repository: fix leaking repository
  builtin/credential-cache: fix trivial leaks
  builtin/worktree: fix leaking derived branch names
  builtin/shortlog: fix various trivial memory leaks
  builtin/rerere: fix various trivial memory leaks
  builtin/credential-store: fix leaking credential
  builtin/show-branch: fix several memory leaks
  builtin/rev-parse: fix memory leak with `--parseopt`
  builtin/stash: fix various trivial memory leaks
  builtin/remote: fix various trivial memory leaks
  builtin/remote: fix leaking strings in `branch_list`
  builtin/ls-remote: fix leaking `pattern` strings
  builtin/submodule--helper: fix leaking buffer in `is_tip_reachable`
  builtin/submodule--helper: fix leaking clone depth parameter
  builtin/name-rev: fix various trivial memory leaks
  builtin/describe: fix trivial memory leak when describing blob
  ...
2024-08-14 14:54:47 -07:00
Patrick Steinhardt 77d4b3dd73 builtin/diff: free symmetric diff members
We populate a `struct symdiff` in case the user has requested a
symmetric diff. Part of this is to populate a `skip` bitmap that
indicates which commits shall be ignored in the diff. But while this
bitmap is dynamically allocated, we never free it.

Fix this by introducing and calling a new `symdiff_release()` function
that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:02 -07:00
Patrick Steinhardt 0aaca0ec09 builtin/log: fix leak when showing converted blob contents
In `show_blob_object()`, we proactively call `textconv_object()`. In
case we have a textconv driver for this blob we will end up showing the
converted contents, otherwise we'll show the un-converted contents of it
instead.

When the object has been converted we never free the buffer containing
the converted contents. Fix this to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
Patrick Steinhardt 1bc158e750 builtin/format-patch: fix various trivial memory leaks
There are various memory leaks hit by git-format-patch(1). Basically all
of them are trivial, except that un-setting `diffopt.no_free` requires
us to unset the `diffopt.file` because we manually close it already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
Patrick Steinhardt a0b82622cb builtin/fast-export: plug leaking tag names
When resolving revisions in `get_tags_and_duplicates()`, we only
partially manage the lifetime of `full_name`. In fact, managing its
lifetime properly is almost impossible because we put direct pointers to
that variable into multiple lists without duplicating the string. The
consequence is that these strings will ultimately leak.

Refactor the code to make the lists we put those names into duplicate
the memory. This allows us to properly free the string as required and
thus plugs the memory leak.

While this requires us to allocate more data overall, it shouldn't be
all that bad given that the number of allocations corresponds with the
number of command line parameters, which typically aren't all that many.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
Patrick Steinhardt 8ed4e96b5b builtin/fast-export: fix leaking diff options
Before calling `handle_commit()` in a loop, we set `diffopt.no_free`
such that its contents aren't getting freed inside of `handle_commit()`.
We never unset that flag though, which means that the structure's
allocated resources will ultimately leak.

Fix this by unsetting the flag after the loop such that we release its
resources via `release_revisions()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
Patrick Steinhardt 0662f0dacb builtin/fast-import: plug trivial memory leaks
Plug some trivial memory leaks in git-fast-import(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
Patrick Steinhardt 187b623eef builtin/notes: fix leaking `struct notes_tree` when merging notes
We allocate a `struct notes_tree` in `merge_commit()` which we then
initialize via `init_notes()`. It's not really necessary to allocate the
structure though given that we never pass ownership to the caller.
Furthermore, the allocation leads to a memory leak because despite its
name, `free_notes()` doesn't free the `notes_tree` but only clears it.

Fix this issue by converting the code to use an on-stack variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
Patrick Steinhardt 1ca57bea4a builtin/rebase: fix leaking `commit.gpgsign` value
In `get_replay_opts()`, we override the `gpg_sign` field that already
got populated by `sequencer_init_config()` in case the user has
"commit.gpgsign" set in their config. This creates a memory leak because
we overwrite the previously assigned value, which may have already
pointed to an allocated string.

Let's plug the memory leak by freeing the value before we overwrite it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:58 -07:00
Patrick Steinhardt 648abbe22d config: fix leaking comment character config
When the comment line character has been specified multiple times in the
configuration, then `git_default_core_config()` will cause a memory leak
because it unconditionally copies the string into `comment_line_str`
without free'ing the previous value. In fact, it can't easily free the
value in the first place because it may contain a string constant.

Refactor the code such that we track allocated comment character strings
via a separate non-constant variable `comment_line_str_to_free`. Adapt
sites that set `comment_line_str` to set both and free the old value
that was stored in `comment_line_str_to_free`.

This memory leak is being hit in t3404. As there are still other memory
leaks in that file we cannot yet mark it as passing with leak checking
enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:58 -07:00
Patrick Steinhardt 7298bcc573 builtin/bundle: have unbundle check for repo before opening its bundle
The `git bundle unbundle` subcommand requires a repository to unbundle
the contents into. As thus, the subcommand checks whether we have a
startup repository in the first place, and if not it dies.

This check happens after we have already opened the bundle though. This
causes a segfault when running outside of a repository starting with
c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07) because we have no hash function set up, but we do try to
parse refs advertised by the bundle's header.

The next commit will fix that underlying issue by defaulting to the SHA1
object format for bundles, which will also fix the described segfault here.
But as we know that we will die anyway, we can do better than that and
avoid some vain work by moving the check for a repository before we try
to open the bundle.

Reported-by: ArcticLampyrid <ArcticLampyrid@outlook.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:26:20 -07:00
Patrick Steinhardt 76fc9906f2 config: pass repo to functions that rename or copy sections
Refactor functions that rename or copy config sections to accept a
`struct repository` such that we can get rid of the implicit dependency
on `the_repository`. Rename the functions accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:04 -07:00
Patrick Steinhardt 0c2c37d16b config: pass repo to `git_die_config()`
Refactor `git_die_config()` to accept a `struct repository` such that we
can get rid of the implicit dependency on `the_repository`. Rename the
function accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:04 -07:00
Patrick Steinhardt 87aace129e config: pass repo to `git_config_get_expiry()`
Refactor `git_config_get_expiry()` to accept a `struct repository` such
that we can get rid of the implicit dependency on `the_repository`.
Rename the function accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:03 -07:00
Patrick Steinhardt be7537e6a9 config: pass repo to `git_config_get_split_index()`
Refactor `git_config_get_split_index()` to accept a `struct repository`
such that we can get rid of the implicit dependency on `the_repository`.
Rename the function accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:03 -07:00
Patrick Steinhardt a973f60dc7 path: stop relying on `the_repository` in `worktree_git_path()`
When not provided a worktree, then `worktree_git_path()` will fall back
to returning a path relative to the main repository. In this case, we
implicitly rely on `the_repository` to derive the path. Remove this
dependency by passing a `struct repository` as parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:01 -07:00
Patrick Steinhardt 78f2210b3c path: stop relying on `the_repository` when reporting garbage
We access `the_repository` in `report_linked_checkout_garbage()` both
directly and indirectly via `get_git_dir()`. Remove this dependency by
instead passing a `struct repository` as parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:01 -07:00
Patrick Steinhardt 169c979771 hooks: remove implicit dependency on `the_repository`
We implicitly depend on `the_repository` in our hook subsystem because
we use `strbuf_git_path()` to compute hook paths. Remove this dependency
by accepting a `struct repository` as parameter instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:01 -07:00
Junio C Hamano 4460e052e0 remerge-diff: clean up temporary objdir at a central place
After running a diff between two things, or a series of diffs while
walking the history, the diff computation is concluded by a call to
diff_result_code() to extract the exit status of the diff machinery.

The function can work on "struct diffopt", but all the callers
historically and currently pass "struct diffopt" that is embedded in
the "struct rev_info" that is used to hold the remerge_diff bit and
the remerge_objdir variable that points at the temporary object
directory in use.

Redefine diff_result_code() to take the whole "struct rev_info" to
give it an access to these members related to remerge-diff, so that
it can get rid of the temporary object directory for any and all
callers that used the feature.  We can lose the equivalent code to
do so from the code paths for individual commands, diff-tree, diff,
and log.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 15:42:40 -07:00
Junio C Hamano 245cac5c33 remerge-diff: lazily prepare temporary objdir on demand
It is error prone for each caller that sets revs.remerge_diff bit
to be responsible for preparing a temporary object directory and
rotate it into the list of alternate object stores, making it the
primary object store.

Instead, remove the code to set up and arrange the temporary object
directory from the current callers and implement it in the code that
runs remerge-diff logic.  The code to undo the futzing of the list
of alternate object store is still spread across the callers, but we
will deal with it in future steps.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 15:42:35 -07:00
John Cai e8207717f1 refs: add referent to each_ref_fn
Add a parameter to each_ref_fn so that callers to the ref APIs
that use this function as a callback can have acess to the
unresolved value of a symbolic ref.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 08:47:34 -07:00
Xing Xin a77554ea09 diff-tree: fix crash when used with --remerge-diff
When using "git-diff-tree" to get the tree diff for merge commits with
the diff format set to `remerge`, a bug is triggered as shown below:

  $ git diff-tree -r --remerge-diff 363337e6eb
  363337e6eb
  BUG: log-tree.c:1006: did a remerge diff without remerge_objdir?!?

This bug is reported by `log-tree.c:do_remerge_diff`, where a bug check
added in commit 7b90ab467a (log: clean unneeded objects during log
--remerge-diff, 2022-02-02) detects the absence of `remerge_objdir` when
attempting to clean up temporary objects generated during the remerge
process.

After some further digging, I find that the remerge-related diff options
were introduced in db757e8b8d (show, log: provide a --remerge-diff
capability, 2022-02-02), which also affect the setup of `rev_info` for
"git-diff-tree", but were not accounted for in the original
implementation (inferred from the commit message).

Elijah Newren, the author of the remerge diff feature, notes that other
callers of `log-tree.c:log_tree_commit` (the only caller of
`log-tree.c:do_remerge_diff`) also exist, but:

  `builtin/am.c`: manually sets all flags; remerge_diff is not among them
  `sequencer.c`: manually sets all flags; remerge_diff is not among them

so `builtin/diff-tree.c` really is the only caller that was overlooked
when remerge-diff functionality was added.

This commit resolves the crash by adding `remerge_objdir` setup logic to
`builtin/diff-tree.c`, mirroring `builtin/log.c:cmd_log_walk_no_free`.
It also includes the necessary cleanup for `remerge_objdir`.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 08:07:44 -07:00
Junio C Hamano 028cf22904 Merge branch 'dd/notes-empty-no-edit-by-default'
"git notes add -m '' --allow-empty" and friends that take prepared
data to create notes should not invoke an editor, but it started
doing so since Git 2.42, which has been corrected.

* dd/notes-empty-no-edit-by-default:
  notes: do not trigger editor when adding an empty note
2024-08-08 10:41:19 -07:00
shejialuo bf061d26c7 builtin/refs: add verify subcommand
Introduce a new subcommand "verify" in git-refs(1) to allow the user to
check the reference database consistency and also this subcommand will
be used as the entry point of checking refs for "git-fsck(1)".

Add "verbose" field into "fsck_options" to indicate whether we should
print verbose messages when checking refs and objects consistency.

Remove bit-field for "strict" field, this is because we cannot take
address of a bit-field which makes it unhandy to set member variables
when parsing the command line options.

The "git-fsck(1)" declares "fsck_options" variable with "static"
identifier which avoids complaint by the leak-checker. However, in
"git-refs verify", we need to do memory clean manually. Thus add
"fsck_options_clear" function in "fsck.c" to provide memory clean
operation.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:53 -07:00
shejialuo 0ec5dfe8c4 fsck: make "fsck_error" callback generic
The "fsck_error" callback is designed to report the objects-related
error messages. It accepts two parameter "oid" and "object_type" which
is not generic. In order to provide a unified callback which can report
either objects or refs, remove the objects-related parameters and add
the generic parameter "void *fsck_report".

Create a new "fsck_object_report" structure which incorporates the
removed parameters "oid" and "object_type". Then change the
corresponding references to adapt to new "fsck_error" callback.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
shejialuo 8cd4a447b8 fsck: rename objects-related fsck error functions
The names of objects-related fsck error functions are generic. It's OK
when there is only object database check. However, we are going to
introduce refs database check report function. To avoid ambiguity,
rename object-related fsck error functions to explicitly indicate these
functions are used to report objects-related messages.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
Patrick Steinhardt c369fc46d0 builtin/submodule: allow "add" to use different ref storage format
Same as with "clone", users may want to add a submodule to a repository
with a non-default ref storage format. Wire up a new `--ref-format=`
option that works the same as for `git submodule clone`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:22:21 -07:00
Patrick Steinhardt 69814846ab builtin/clone: propagate ref storage format to submodules
When recursively cloning a repository with a non-default ref storage
format, e.g. by passing the `--ref-format=` option, then only the
top-level repository will end up using that ref storage format, and
all recursively cloned submodules will instead use the default format.

While mixed-format constellations are expected to work alright, the
outcome still is somewhat surprising as we have essentially ignored
the user's request.

Fix this by propagating the requested ref format to cloned submodules.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:21:39 -07:00
Patrick Steinhardt 5ac781ad62 builtin/submodule: allow cloning with different ref storage format
As submodules are proper self-contained repositories, it is perfectly
valid for them to have a different ref storage format than their parent
repository. There is no obvious way for users to ask for the ref storage
format when initializing submodules though. Whether the setup of such
mixed-ref-storage-format constellations is all that useful remains to be
seen. But there is no good reason to not expose such an option, and we
will require it in a subsequent patch.

Introduce a new `--ref-format=` option for git-submodule(1) that allows
the user to pick the ref storage format. This option will also be used
in a subsequent commit, where we start to propagate the same flag from
git-clone(1) to cloning submodules with the `--recursive` switch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:20:49 -07:00
Junio C Hamano 90b801d8ff Merge branch 'ps/leakfixes-part-3' into ps/leakfixes-part-4
* ps/leakfixes-part-3: (24 commits)
  commit-reach: fix trivial memory leak when computing reachability
  convert: fix leaking config strings
  entry: fix leaking pathnames during delayed checkout
  object-name: fix leaking commit list items
  t/test-repository: fix leaking repository
  builtin/credential-cache: fix trivial leaks
  builtin/worktree: fix leaking derived branch names
  builtin/shortlog: fix various trivial memory leaks
  builtin/rerere: fix various trivial memory leaks
  builtin/credential-store: fix leaking credential
  builtin/show-branch: fix several memory leaks
  builtin/rev-parse: fix memory leak with `--parseopt`
  builtin/stash: fix various trivial memory leaks
  builtin/remote: fix various trivial memory leaks
  builtin/remote: fix leaking strings in `branch_list`
  builtin/ls-remote: fix leaking `pattern` strings
  builtin/submodule--helper: fix leaking buffer in `is_tip_reachable`
  builtin/submodule--helper: fix leaking clone depth parameter
  builtin/name-rev: fix various trivial memory leaks
  builtin/describe: fix trivial memory leak when describing blob
  ...
2024-08-06 12:40:41 -07:00
Taylor Blau fcb2205b77 midx: implement support for writing incremental MIDX chains
Now that the rest of the MIDX subsystem and relevant callers have been
updated to learn about how to read and process incremental MIDX chains,
let's finally update the implementation in `write_midx_internal()` to be
able to write incremental MIDX chains.

This new feature is available behind the `--incremental` option for the
`multi-pack-index` builtin, like so:

    $ git multi-pack-index write --incremental

The implementation for doing so is relatively straightforward, and boils
down to a handful of different kinds of changes implemented in this
patch:

  - The `compute_sorted_entries()` function is taught to reject objects
    which appear in any existing MIDX layer.

  - Functions like `write_midx_revindex()` are adjusted to write
    pack_order values which are offset by the number of objects in the
    base MIDX layer.

  - The end of `write_midx_internal()` is adjusted to move
    non-incremental MIDX files when necessary (i.e. when creating an
    incremental chain with an existing non-incremental MIDX in the
    repository).

There are a handful of other changes that are introduced, like new
functions to clear incremental MIDX files that are unrelated to the
current chain (using the same "keep_hash" mechanism as in the
non-incremental case).

The tests explicitly exercising the new incremental MIDX feature are
relatively limited for two reasons:

  1. Most of the "interesting" behavior is already thoroughly covered in
     t5319-multi-pack-index.sh, which handles the core logic of reading
     objects through a MIDX.

     The new tests in t5334-incremental-multi-pack-index.sh are mostly
     focused on creating and destroying incremental MIDXs, as well as
     stitching their results together across layers.

  2. A new GIT_TEST environment variable is added called
     "GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL", which modifies the
     entire test suite to write incremental MIDXs after repacking when
     combined with the "GIT_TEST_MULTI_PACK_INDEX" variable.

     This exercises the long tail of other interesting behavior that is
     defined implicitly throughout the rest of the CI suite. It is
     likewise added to the linux-TEST-vars job.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:39 -07:00
Taylor Blau 9552c3595a t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
Two years ago, commit ff1e653c8e (midx: respect
'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP', 2021-08-31) introduced a new
environment variable which caused the test suite to write MIDX bitmaps
after any 'git repack' invocation.

At the time, this was done to help flush out any bugs with MIDX bitmaps
that weren't explicitly covered in the t5326-multi-pack-bitmap.sh
script.

Two years later, that flag has served us well and is no longer providing
meaningful coverage, as the script in t5326 has matured substantially
and covers many more interesting cases than it did back when ff1e653c8e
was originally written.

Remove the 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment variable
as it is no longer serving a useful purpose. More importantly, removing
this variable clears the way for us to introduce a new one to help
similarly flush out bugs related to incremental MIDX chains.

Because these incremental MIDX chains are (for now) incompatible with
MIDX bitmaps, we cannot have both.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00
Kyle Lippincott b928d57ca9 set errno=0 before strtoX calls
To detect conversion failure after calls to functions like `strtod`, one
can check `errno == ERANGE`. These functions are not guaranteed to set
`errno` to `0` on successful conversion, however. Manual manipulation of
`errno` can likely be avoided by checking that the output pointer
differs from the input pointer, but that's not how other locations, such
as parse.c:139, handle this issue; they set errno to 0 prior to
executing the function.

For every place I could find a strtoX function with an ERANGE check
following it, set `errno = 0;` prior to executing the conversion
function.

Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-05 10:59:20 -07:00
Patrick Steinhardt 9e89dcb66a builtin/ls-remote: fall back to SHA1 outside of a repo
In c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), we have stopped setting the default hash algorithm for
`the_repository`. Consequently, code that relies on `the_hash_algo` will
now crash when it hasn't explicitly been initialized, which may be the
case when running outside of a Git repository.

It was reported that git-ls-remote(1) may crash in such a way when using
a remote helper that advertises refspecs. This is because the refspec
announced by the helper will get parsed during capability negotiation.
At that point we haven't yet figured out what object format the remote
uses though, so when run outside of a repository then we will fail.

The course of action is somewhat dubious in the first place. Ideally, we
should only parse object IDs once we have asked the remote helper for
the object format. And if the helper didn't announce the "object-format"
capability, then we should always assume SHA256. But instead, we used to
take either SHA1 if there was no repository, or we used the hash of the
local repository, which is wrong.

Arguably though, crashing hard may not be in the best interest of our
users, either. So while the old behaviour was buggy, let's restore it
for now as a short-term fix. We should eventually revisit, potentially
by deferring the point in time when we parse the refspec until after we
have figured out the remote's object hash.

Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-02 08:24:05 -07:00
Junio C Hamano 363337e6eb Merge branch 'as/show-ref-option-help-update'
A few descriptions in "git show-ref -h" have been clarified.

* as/show-ref-option-help-update:
  show-ref: improve short help messages of options
2024-08-01 10:18:12 -07:00
Patrick Steinhardt 145c979020 builtin/credential-cache: fix trivial leaks
There are two trivial leaks in git-credential-cache(1):

  - We leak the child process in `spawn_daemon()`. As we do not call
    `finish_command()` and instead let the created process daemonize, we
    have to clear the process manually.

  - We do not free the computed socket path in case it wasn't given via
    `--socket=`.

Plug both of these memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
Patrick Steinhardt cd6d7630fa builtin/worktree: fix leaking derived branch names
There are several heuristics that git-worktree(1) uses to derive the
name of the newly created branch when not given explicitly. These
heuristics all allocate a new string, but we only end up freeing that
string in a subset of cases.

Fix the remaining cases where we didn't yet free the derived branch
names. While at it, also free `opt_track`, which is being populated via
an `OPT_PASSTHRU()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
Patrick Steinhardt 06da42beec builtin/shortlog: fix various trivial memory leaks
There is a trivial memory leak in git-shortlog(1). Fix it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
Patrick Steinhardt 50ef4e09c3 builtin/rerere: fix various trivial memory leaks
There are multiple trivial memory leaks in git-rerere(1). Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
Patrick Steinhardt 1d615afa8d builtin/credential-store: fix leaking credential
We never free credentials read by the credential store, leading to a
memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt 11d6a81c01 builtin/show-branch: fix several memory leaks
There are several memory leaks in git-show-branch(1). Fix them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt 2d197e4a0f builtin/rev-parse: fix memory leak with `--parseopt`
The `--parseopt` mode allows shell scripts to have the same option
parsing mode as we have in C builtins. It soaks up a set of option
descriptions via stdin and massages them into proper `struct option`s
that we can then use to parse a set of arguments.

We only partially free those options when done though, creating a memory
leak. Interestingly, we only end up free'ing the first option's help,
which is of course wrong.

Fix this by freeing all option's help fields as well as their `argh`
fields to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt 2e875b6cb4 builtin/stash: fix various trivial memory leaks
There are multiple trivial memory leaks in git-stash(1). Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt fc68633352 builtin/remote: fix various trivial memory leaks
There are multiple trivial memory leaks in git-remote(1). Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt e06c1d1640 builtin/remote: fix leaking strings in `branch_list`
The `struct string_list branch_list` is declared as `NODUP`, which makes
it not copy strings inserted into it. This causes memory leaks though,
as this means it also won't be responsible for _freeing_ inserted
strings. Thus, every branch we add to this will leak.

Fix this by marking the list as `DUP` instead and free the local copy we
have of the variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt 4119fc08e2 builtin/ls-remote: fix leaking `pattern` strings
Users can pass patterns to git-ls-remote(1), which allows them to filter
the list of printed references. We assemble those patterns into an array
and prefix them with "*/", but never free either the array nor the
allocated strings.

Refactor the code to use a `struct strvec` instead of manually tracking
the strings in an array. Like this, we can easily use `strvec_clear()`
to release both the vector and the contained string for us, plugging the
leak.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt 6771e2012e builtin/submodule--helper: fix leaking buffer in `is_tip_reachable`
The `rev` buffer in `is_tip_reachable()` is being populated with the
output of git-rev-list(1) -- if either the command fails or the buffer
contains any data, then the input commit is not reachable.

The buffer isn't used for anything else, but neither do we free it,
causing a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
Patrick Steinhardt 5535b3f3d3 builtin/submodule--helper: fix leaking clone depth parameter
The submodule helper supports a `--depth` parameter for both its "add"
and "clone" subcommands, which in both cases end up being forwarded to
git-clone(1). But while the former subcommand uses an `OPT_INTEGER()` to
parse the depth, the latter uses `OPT_STRING()`. Consequently, it is
possible to pass non-integer input to "--depth" when calling the "clone"
subcommand, where the value will then ultimately cause git-clone(1) to
bail out.

Besides the fact that the parameter verification should happen earlier,
the submodule helper infrastructure also internally tracks the depth via
a string. This requires us to convert the integer in the "add"
subcommand into an allocated string, and this string ultimately leaks.

Refactor the code to consistently track the clone depth as an integer.
This plugs the memory leak, simplifies the code and allows us to use
`OPT_INTEGER()` instead of `OPT_STRING()`, validating the input before
we shell out to git--clone(1).

Original-patch-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
Patrick Steinhardt ac3b143370 builtin/name-rev: fix various trivial memory leaks
There are several structures that we don't release after
`cmd_name_rev()` is done. Plug those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
Patrick Steinhardt ed041007f0 builtin/describe: fix trivial memory leak when describing blob
We never free the `struct strvec args` variable in `describe_blob()`,
which thus causes a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
Patrick Steinhardt 5a1e1e5d40 builtin/describe: fix leaking array when running diff-index
When running git-describe(1) with `--dirty`, we will set up a `struct
rev_info` with arguments for git-diff-index(1). The way we assemble the
arguments it causes two memory leaks though:

  - We never release the `struct strvec`.

  - `setup_revisions()` may end up removing some entries from the
    `strvec`, which we wouldn't free even if we released the struct.

While we could plug those leaks, this is ultimately unnecessary as the
arguments we pass are part of a static array anyway. So instead,
refactor the code to drop the `struct strvec` and just pass this static
array directly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
Patrick Steinhardt 8e2e28799d builtin/describe: fix memory leak with `--contains=`
When calling `git describe --contains=`, we end up invoking
`cmd_name_rev()` with some munged argv array. This array may contain
allocated strings and furthermore will likely be modified by the called
function. This results in two memory leaks:

  - First, we leak the array that we use to assemble the arguments.

  - Second, we leak the allocated strings that we may have put into the
    array.

Fix those leaks by creating a separate copy of the array that we can
hand over to `cmd_name_rev()`. This allows us to free all strings
contained in the `strvec`, as the original vector will not be modified
anymore.

Furthermore, free both the `strvec` and the copied array to fix the
first memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
Patrick Steinhardt 7935a02613 builtin/log: fix leaking branch name when creating cover letters
When calling `make_cover_letter()` without a branch name, we try to
derive the branch name by calling `find_branch_name()`. But while this
function returns an allocated string, we never free the result and thus
have a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
Patrick Steinhardt 34968e56de builtin/replay: plug leaking `advance_name` variable
The `advance_name` variable can either contain a static string when
parsed via the `--advance` command line option or it may be an allocated
string when set via `determine_replay_mode()`. Because we cannot be sure
whether it is allocated or not we just didn't free it at all, resulting
in a memory leak.

Split up the variables such that we can track the static and allocated
strings separately and then free the allocated one to fix the memory
leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:34 -07:00
Junio C Hamano d18eb5ba79 Merge branch 'tn/doc-commit-fix'
Docfix.

* tn/doc-commit-fix:
  doc: remove dangling closing parenthesis
2024-07-31 13:34:20 -07:00
Junio C Hamano 90139ae377 Merge branch 'jc/checkout-no-op-switch-errors'
"git checkout --ours" (no other arguments) complained that the
option is incompatible with branch switching, which is technically
correct, but found confusing by some users.  It now says that the
user needs to give pathspec to specify what paths to checkout.

* jc/checkout-no-op-switch-errors:
  checkout: special case error messages during noop switching
2024-07-31 13:34:18 -07:00
Junio C Hamano f084c50de6 Merge branch 'ad/merge-with-diff-algorithm'
Many Porcelain commands that internally use the merge machinery
were taught to consistently honor the diff.algorithm configuration.

* ad/merge-with-diff-algorithm:
  merge-recursive: honor diff.algorithm
2024-07-31 13:34:16 -07:00
Junio C Hamano a6e9429f72 patch-id: tighten code to detect the patch header
The get_one_patchid() function unconditionally takes a line that
matches the patch header (namely, a line that begins with a full
object name, possibly prefixed by "commit" or "From" plus a space)
as the beginning of a patch.  Even when it is *not* looking for one
(namely, when the previous call found the patch header and returned,
and then we are called again to skip the log message and process the
patch whose header was found by the previous invocation).

As a consequence, a line in the commit log message that begins with
one of these patterns can be mistaken to start another patch, with
current message entirely skipped (because we haven't even reached
the patch at all).

Allow the caller to tell us if it called us already and saw the
patch header (in which case we shouldn't be looking for another one,
until we see the "diff" part of the patch; instead we simply should
be skipping these lines as part of the commit log message), and skip
the header processing logic when that is the case.  In the helper
function, it also needs to flip this "are we looking for a header?"
bit, once it finished skipping the commit log message and started
processing the patches, as the patch header of the _next_ message is
the only clue in the input that the current patch is done.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 18:19:14 -07:00
Junio C Hamano 3f288b6faf patch-id: rewrite code that detects the beginning of a patch
The get_one_patchid() function reads input lines until it finds a
patch header (the line that begins a patch), whose beginning is one
of:

 (1) an "<object name>", which is what "git diff-tree --stdin" shows;
 (2) "commit <object name>", which is what "git log" shows; or
 (3) "From <object name>",  which is what "git log --format=email" shows.

When it finds such a line, it returns to the caller, reporting the
<object name> it found, and the size of the "patch" it processed.

The caller then calls the function again, which then ignores the
commit log message, and then processes the lines in the patch part
until it hits another "beginning of a patch".

The above logic was fairly easy to see until 2bb73ae8 (patch-id: use
starts_with() and skip_prefix(), 2016-05-28) reorganized the code,
which made another logic that has nothing to do with the "where does
the next patch begin?" logic, which came from 2485eab5
(git-patch-id: do not trip over "no newline" markers, 2011-02-17)
that ignores the "\ No newline at the end", rolled into the same
single if() statement.

Let's split it out.  The "\ No newline at the end" marker is part of
the patch, should not appear before we start reading the patch part,
and does not belong to the detection of patch header.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 18:19:14 -07:00
Junio C Hamano 2438294a13 patch-id: make get_one_patchid() more extensible
We pass two independent Boolean flags (i.e. do we want the stable
variant of patch-id?  do we want to hash the stuff verbatim?) into
the function as two separate parameters.  Before adding the third
one and make the interface even wider, let's consolidate them into
a single flag word.

No changes in behaviour.  Just a trivial interface change.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 18:19:14 -07:00
Junio C Hamano c92f3195ad patch-id: call flush_current_id() only when needed
The caller passes a flag that is used to become no-op when calling
flush_current_id().  Instead of calling something that becomes a
no-op, teach the caller not to call it in the first place.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 18:19:14 -07:00
David Disseldorp 8b426c84f3 notes: do not trigger editor when adding an empty note
With "git notes add -C $blob", the given blob contents are to be made
into a note without involving an editor.  But when "--allow-empty" is
given, the editor is invoked, which can cause problems for
non-interactive callers[1].

This behaviour started with 90bc19b3ae (notes.c: introduce
'--separator=<paragraph-break>' option, 2023-05-27), which changed
editor invocation logic to check for a zero length note_data buffer.

Restore the original behaviour of "git note" that takes the contents
given via the "-m", "-C", "-F" options without invoking an editor, by
checking for any prior parameter callbacks, indicated by a non-zero
note_data.msg_nr.  Remove the now-unneeded note_data.given flag.

Add a test for this regression by checking whether GIT_EDITOR is
invoked alongside "git notes add -C $empty_blob --allow-empty"

[1] https://github.com/ddiss/icyci/issues/12

Signed-off-by: David Disseldorp <ddiss@suse.de>
[jc: enhanced the test with -m/-F options]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-29 15:31:30 -07:00
Alexander Shopov 9885871248 show-ref: improve short help messages of options
Trivial change to indicate that branches and tags are real options
that can be used combined to get more information.  This helps with
linting translations and prompting the user that the terms represent
options.

Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-25 08:04:34 -07:00
Tomas Nordin 1c473dd6af doc: remove dangling closing parenthesis
The second line of the synopsis, starting with [--dry-run] has a
dangling closing paren in the second optional group. Probably added by
mistake, so remove it.

Signed-off-by: Tomas Nordin <tomasn@posteo.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-22 17:32:36 -07:00
Junio C Hamano 76e018b9a1 Merge branch 'js/var-git-shell-path'
"git var GIT_SHELL_PATH" should report the path to the shell used
to spawn external commands, but it didn't do so on Windows, which
has been corrected.

* js/var-git-shell-path:
  var(win32): do report the GIT_SHELL_PATH that is actually used
  run-command: declare the `git_shell_path()` function globally
  run-command(win32): resolve the path to the Unix shell early
  mingw(is_msys2_sh): handle forward slashes in the `sh.exe` path, too
  win32: override `fspathcmp()` with a directory separator-aware version
  strvec: declare the `strvec_push_nodup()` function globally
  run-command: refactor getting the Unix shell path into its own function
2024-07-17 10:47:27 -07:00
Junio C Hamano e13feda98f Merge branch 'kn/push-empty-fix'
"git push '' HEAD:there" used to hit a BUG(); it has been corrected
to die with "fatal: bad repository ''".

* kn/push-empty-fix:
  builtin/push: call set_refspecs after validating remote
2024-07-17 10:47:26 -07:00
Junio C Hamano b227482ea0 Merge branch 'as/describe-broken-refresh-index-fix'
"git describe --dirty --broken" forgot to refresh the index before
seeing if there is any chang, ("git describe --dirty" correctly did
so), which has been corrected.

* as/describe-broken-refresh-index-fix:
  describe: refresh the index when 'broken' flag is used
2024-07-15 10:11:40 -07:00
Antonin Delpeuch 9c93ba4d0a merge-recursive: honor diff.algorithm
The documentation claims that "recursive defaults to the diff.algorithm
config setting", but this is currently not the case. This fixes it,
ensuring that diff.algorithm is used when -Xdiff-algorithm is not
supplied. This affects the following porcelain commands: "merge",
"rebase", "cherry-pick", "pull", "stash", "log", "am" and "checkout".
It also affects the "merge-tree" ancillary interrogator.

This change refactors the initialization of merge options to introduce
two functions, "init_merge_ui_options" and "init_merge_basic_options"
instead of just one "init_merge_options". This design follows the
approach used in diff.c, providing initialization methods for
porcelain and plumbing commands respectively. Thanks to that, the
"replay" and "merge-recursive" plumbing commands remain unaffected by
diff.algorithm.

Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-13 18:10:49 -07:00
Johannes Schindelin 9ed143ee33 var(win32): do report the GIT_SHELL_PATH that is actually used
On Windows, Unix-like paths like `/bin/sh` make very little sense. In
the best case, they simply don't work, in the worst case they are
misinterpreted as absolute paths that are relative to the drive
associated with the current directory.

To that end, Git does not actually use the path `/bin/sh` that is
recorded e.g. when `run_command()` is called with a Unix shell
command-line. Instead, as of 776297548e (Do not use SHELL_PATH from
build system in prepare_shell_cmd on Windows, 2012-04-17), it
re-interprets `/bin/sh` as "look up `sh` on the `PATH` and use the
result instead".

This is the logic users expect to be followed when running `git var
GIT_SHELL_PATH`.

However, when 1e65721227 (var: add support for listing the shell,
2023-06-27) introduced support for `git var GIT_SHELL_PATH`, Windows was
not special-cased as above, which is why it outputs `/bin/sh` even
though that disagrees with what Git actually uses.

Let's fix this by using the exact same logic as `prepare_shell_cmd()`,
adjusting the Windows-specific `git var GIT_SHELL_PATH` test case to
verify that it actually finds a working executable.

Reported-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-13 16:23:37 -07:00
Karthik Nayak 757c6ee7a3 builtin/push: call set_refspecs after validating remote
When an end-user runs "git push" with an empty string for the remote
repository name, e.g.

    $ git push '' main

"git push" fails with a BUG(). Even though this is a nonsense request
that we want to fail, we shouldn't hit a BUG().  Instead we want to give
a sensible error message, e.g., 'bad repository'".

This is because since 9badf97c42 (remote: allow resetting url list,
2024-06-14), we reset the remote URL if the provided URL is empty. When
a user of 'remotes_remote_get' tries to fetch a remote with an empty
repo name, the function initializes the remote via 'make_remote'. But
the remote is still not a valid remote, since the URL is empty, so it
tries to add the URL alias using 'add_url_alias'. This in-turn will call
'add_url', but since the URL is empty we call 'strvec_clear' on the
`remote->url`. Back in 'remotes_remote_get', we again check if the
remote is valid, which fails, so we return 'NULL' for the 'struct
remote *' value.

The 'builtin/push.c' code, calls 'set_refspecs' before validating the
remote. This worked with empty repo names earlier since we would get a
remote, albeit with an empty URL. With the new changes, we get a 'NULL'
remote value, this causes the check for remote to fail and raises the
BUG in 'set_refspecs'.

Do a simple fix by doing remote validation first. Also add a test to
validate the bug fix. With this, we can also now directly pass remote to
'set_refspecs' instead of it trying to lazily obtain it.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 09:14:11 -07:00
Junio C Hamano e6ae4d6efe Merge branch 'rs/simplify-submodule-helper-super-prefix-invocation'
Code clean-up.

* rs/simplify-submodule-helper-super-prefix-invocation:
  submodule--helper: use strvec_pushf() for --super-prefix
2024-07-12 08:41:58 -07:00
Junio C Hamano 3997614c24 Merge branch 'ps/leakfixes-more'
More memory leaks have been plugged.

* ps/leakfixes-more: (29 commits)
  builtin/blame: fix leaking ignore revs files
  builtin/blame: fix leaking prefixed paths
  blame: fix leaking data for blame scoreboards
  line-range: plug leaking find functions
  merge: fix leaking merge bases
  builtin/merge: fix leaking `struct cmdnames` in `get_strategy()`
  sequencer: fix memory leaks in `make_script_with_merges()`
  builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()`
  apply: fix leaking string in `match_fragment()`
  sequencer: fix leaking string buffer in `commit_staged_changes()`
  commit: fix leaking parents when calling `commit_tree_extended()`
  config: fix leaking "core.notesref" variable
  rerere: fix various trivial leaks
  builtin/stash: fix leak in `show_stash()`
  revision: free diff options
  builtin/log: fix leaking commit list in git-cherry(1)
  merge-recursive: fix memory leak when finalizing merge
  builtin/merge-recursive: fix leaking object ID bases
  builtin/difftool: plug memory leaks in `run_dir_diff()`
  object-name: free leaking object contexts
  ...
2024-07-08 14:53:10 -07:00
Junio C Hamano d1e6c61272 checkout: special case error messages during noop switching
"git checkout" ran with no branch and no pathspec behaves like
switching the branch to the current branch (in other words, a
no-op, except that it gives a side-effect "here are the modified
paths" report).  But unlike "git checkout HEAD" or "git checkout
main" (when you are on the 'main' branch), the user is much less
conscious that they are "switching" to the current branch.

This twists end-user expectation in a strange way.  There are
options (like "--ours") that make sense only when we are checking
out paths out of either the tree-ish or out of the index.  So the
error message the command below gives

    $ git checkout --ours
    fatal: '--ours/theirs' cannot be used with switching branches

is technically correct, but because the end-user may not even be
aware of the fact that the command they are issuing is about no-op
branch switching [*], they may find the error confusing.

Let's refactor the code to make it easier to special case the "no-op
branch switching" situation, and then customize the exact error
message for "--ours/--theirs".  Since it is more likely that the
end-user forgot to give pathspec that is required by the option,
let's make it say

    $ git checkout --ours
    fatal: '--ours/theirs' needs the paths to check out

instead.

Among the other options that are incompatible with branch switching,
there may be some that benefit by having messages tweaked when a
no-op branch switching is done, but I'll leave them as #leftoverbits
material.

[Footnote]

 * Yes, the end-users are irrational.  When they did not give
   "--ours", they take it granted that "git checkout" gives a short
   status, e.g..

    $ git checkout
    M	builtin/checkout.c
    M	t/t7201-co.sh

   exactly as a branch switching command.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 13:53:56 -07:00
Junio C Hamano ca463101c8 Merge branch 'jk/remote-wo-url'
Memory ownership rules for the in-core representation of
remote.*.url configuration values have been straightened out, which
resulted in a few leak fixes and code clarification.

* jk/remote-wo-url:
  remote: drop checks for zero-url case
  remote: always require at least one url in a remote
  t5801: test remote.*.vcs config
  t5801: make remote-testgit GIT_DIR setup more robust
  remote: allow resetting url list
  config: document remote.*.url/pushurl interaction
  remote: simplify url/pushurl selection
  remote: use strvecs to store remote url/pushurl
  remote: transfer ownership of memory in add_url(), etc
  remote: refactor alias_url() memory ownership
  archive: fix check for missing url
2024-07-02 09:59:01 -07:00
Junio C Hamano 7b472da915 Merge branch 'ps/use-the-repository'
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help
transition the codebase to rely less on the availability of the
singleton the_repository instance.

* ps/use-the-repository:
  hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE`
  t/helper: remove dependency on `the_repository` in "proc-receive"
  t/helper: fix segfault in "oid-array" command without repository
  t/helper: use correct object hash in partial-clone helper
  compat/fsmonitor: fix socket path in networked SHA256 repos
  replace-object: use hash algorithm from passed-in repository
  protocol-caps: use hash algorithm from passed-in repository
  oidset: pass hash algorithm when parsing file
  http-fetch: don't crash when parsing packfile without a repo
  hash-ll: merge with "hash.h"
  refs: avoid include cycle with "repository.h"
  global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
  hash: require hash algorithm in `empty_tree_oid_hex()`
  hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
  hash: make `is_null_oid()` independent of `the_repository`
  hash: convert `oidcmp()` and `oideq()` to compare whole hash
  global: ensure that object IDs are always padded
  hash: require hash algorithm in `oidread()` and `oidclr()`
  hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
  hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
2024-07-02 09:59:00 -07:00
Junio C Hamano 3e50dfdfc9 Merge branch 'pw/rebase-i-error-message' into maint-2.45
When the user adds to "git rebase -i" instruction to "pick" a merge
commit, the error experience is not pleasant.  Such an error is now
caught earlier in the process that parses the todo list.

* pw/rebase-i-error-message:
  rebase -i: improve error message when picking merge
  rebase -i: pass struct replay_opts to parse_insn_line()
2024-07-02 09:27:56 -07:00
Junio C Hamano f13710e32e Merge branch 'ds/format-patch-rfc-and-k' into maint-2.45
The "-k" and "--rfc" options of "format-patch" will now error out
when used together, as one tells us not to add anything to the
title of the commit, and the other one tells us to add "RFC" in
addition to "PATCH".

* ds/format-patch-rfc-and-k:
  format-patch: ensure that --rfc and -k are mutually exclusive
2024-07-02 09:27:56 -07:00
René Scharfe 4b837f821e submodule--helper: use strvec_pushf() for --super-prefix
Use the strvec_pushf() call that already appends a slash to also produce
the stuck form of the option --super-prefix instead of adding the option
name in a separate call of strvec_push() or strvec_pushl().  This way we
can more easily see that these parts make up a single option with its
argument and save a function call.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 12:18:22 -07:00
Junio C Hamano 7b7db54b83 Merge branch 'rs/difftool-env-simplify' into maint-2.45
Code simplification.

* rs/difftool-env-simplify:
  difftool: add env vars directly in run_file_diff()
2024-06-28 15:53:16 -07:00
Junio C Hamano 1b1b4d490d Merge branch 'js/for-each-repo-keep-going' into maint-2.45
A scheduled "git maintenance" job is expected to work on all
repositories it knows about, but it stopped at the first one that
errored out.  Now it keeps going.

* js/for-each-repo-keep-going:
  maintenance: running maintenance should not stop on errors
  for-each-repo: optionally keep going on an error
2024-06-28 15:53:08 -07:00
Junio C Hamano 2a78de0d9f Merge branch 'aj/stash-staged-fix' into maint-2.45
"git stash -S" did not handle binary files correctly, which has
been corrected.

* aj/stash-staged-fix:
  stash: fix "--staged" with binary files
2024-06-28 15:53:07 -07:00
Junio C Hamano a41463e437 Merge branch 'xx/disable-replace-when-building-midx' into maint-2.45
The procedure to build multi-pack-index got confused by the
replace-refs mechanism, which has been corrected by disabling the
latter.

* xx/disable-replace-when-building-midx:
  midx: disable replace objects
2024-06-28 15:53:07 -07:00
Junio C Hamano 6c0bfce914 Merge branch 'kz/merge-fail-early-upon-refresh-failure'
When "git merge" sees that the index cannot be refreshed (e.g. due
to another process doing the same in the background), it died but
after writing MERGE_HEAD etc. files, which was useless for the
purpose to recover from the failure.

* kz/merge-fail-early-upon-refresh-failure:
  merge: avoid write merge state when unable to write index
2024-06-27 09:19:58 -07:00
Abhijeet Sonar b8ae42e292 describe: refresh the index when 'broken' flag is used
When describe is run with 'dirty' flag, we refresh the index
to make sure it is in sync with the filesystem before
determining if the working tree is dirty.  However, this is
not done for the codepath where the 'broken' flag is used.

This causes `git describe --broken --dirty` to false
positively report the worktree being dirty if a file has
different stat info than what is recorded in the index.
Running `git update-index -q --refresh` to refresh the index
before running diff-index fixes the problem.

Also add tests to deliberately update stat info of a
file before running describe to verify it behaves correctly.

Reported-by: Paul Millar <paul.millar@desy.de>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-26 13:04:08 -07:00
Junio C Hamano 2c4aa7ad74 Merge branch 'jc/add-i-retire-usebuiltin-config'
For over a year, setting add.interactive.useBuiltin configuration
variable did nothing but giving a "this does not do anything"
warning.  Finally remove it.

* jc/add-i-retire-usebuiltin-config:
  add-i: finally retire add.interactive.useBuiltin
2024-06-24 16:39:14 -07:00
Junio C Hamano ffa47b75cf Merge branch 'tb/pseudo-merge-reachability-bitmap'
The pseudo-merge reachability bitmap to help more efficient storage
of the reachability bitmap in a repository with too many refs has
been added.

* tb/pseudo-merge-reachability-bitmap: (26 commits)
  pack-bitmap.c: ensure pseudo-merge offset reads are bounded
  Documentation/technical/bitmap-format.txt: add missing position table
  t/perf: implement performance tests for pseudo-merge bitmaps
  pseudo-merge: implement support for finding existing merges
  ewah: `bitmap_equals_ewah()`
  pack-bitmap: extra trace2 information
  pack-bitmap.c: use pseudo-merges during traversal
  t/test-lib-functions.sh: support `--notick` in `test_commit_bulk()`
  pack-bitmap: implement test helpers for pseudo-merge
  ewah: implement `ewah_bitmap_popcount()`
  pseudo-merge: implement support for reading pseudo-merge commits
  pack-bitmap.c: read pseudo-merge extension
  pseudo-merge: scaffolding for reads
  pack-bitmap: extract `read_bitmap()` function
  pack-bitmap-write.c: write pseudo-merge table
  pseudo-merge: implement support for selecting pseudo-merge commits
  config: introduce `git_config_double()`
  pack-bitmap: make `bitmap_writer_push_bitmapped_commit()` public
  pack-bitmap: implement `bitmap_writer_has_bitmapped_object_id()`
  pack-bitmap-write: support storing pseudo-merge commits
  ...
2024-06-24 16:39:13 -07:00
Junio C Hamano 892fd8b89f Merge branch 'jc/heads-are-branches'
The "--heads" option of "ls-remote" and "show-ref" has been been
deprecated; "--branches" replaces "--heads".

* jc/heads-are-branches:
  show-ref: introduce --branches and deprecate --heads
  ls-remote: introduce --branches and deprecate --heads
  refs: call branches branches
2024-06-20 15:45:17 -07:00
Junio C Hamano 83ac567781 Merge branch 'pw/rebase-i-error-message'
When the user adds to "git rebase -i" instruction to "pick" a merge
commit, the error experience is not pleasant.  Such an error is now
caught earlier in the process that parses the todo list.

* pw/rebase-i-error-message:
  rebase -i: improve error message when picking merge
  rebase -i: pass struct replay_opts to parse_insn_line()
2024-06-20 15:45:15 -07:00
Junio C Hamano 9071453ef6 Merge branch 'rj/format-patch-auto-cover-with-interdiff'
"git format-patch --interdiff" for multi-patch series learned to
turn on cover letters automatically (unless told never to enable
cover letter with "--no-cover-letter" and such).

* rj/format-patch-auto-cover-with-interdiff:
  format-patch: assume --cover-letter for diff in multi-patch series
  t4014: cleanups in a few tests
2024-06-20 15:45:12 -07:00
Junio C Hamano 5f14d20984 Merge branch 'kn/update-ref-symref'
"git update-ref --stdin" learned to handle transactional updates of
symbolic-refs.

* kn/update-ref-symref:
  update-ref: add support for 'symref-update' command
  reftable: pick either 'oid' or 'target' for new updates
  update-ref: add support for 'symref-create' command
  update-ref: add support for 'symref-delete' command
  update-ref: add support for 'symref-verify' command
  refs: specify error for regular refs with `old_target`
  refs: create and use `ref_update_expects_existing_old_ref()`
2024-06-20 15:45:12 -07:00
Kyle Zhao 2e5a636593 merge: avoid write merge state when unable to write index
Writing the merge state after the index write fails is meaningless and
could potentially cause Git to lose changes.

Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-18 08:13:35 -07:00
Junio C Hamano 4216329457 Merge branch 'ps/no-writable-strings'
Building with "-Werror -Wwrite-strings" is now supported.

* ps/no-writable-strings: (27 commits)
  config.mak.dev: enable `-Wwrite-strings` warning
  builtin/merge: always store allocated strings in `pull_twohead`
  builtin/rebase: always store allocated string in `options.strategy`
  builtin/rebase: do not assign default backend to non-constant field
  imap-send: fix leaking memory in `imap_server_conf`
  imap-send: drop global `imap_server_conf` variable
  mailmap: always store allocated strings in mailmap blob
  revision: always store allocated strings in output encoding
  remote-curl: avoid assigning string constant to non-const variable
  send-pack: always allocate receive status
  parse-options: cast long name for OPTION_ALIAS
  http: do not assign string constant to non-const field
  compat/win32: fix const-correctness with string constants
  pretty: add casts for decoration option pointers
  object-file: make `buf` parameter of `index_mem()` a constant
  object-file: mark cached object buffers as const
  ident: add casts for fallback name and GECOS
  entry: refactor how we remove items for delayed checkouts
  line-log: always allocate the output prefix
  line-log: stop assigning string constant to file parent buffer
  ...
2024-06-17 15:55:58 -07:00
Junio C Hamano 42b8b5bfd0 Merge branch 'jk/am-retry'
"git am" has a safety feature to prevent it from starting a new
session when there already is a session going.  It reliably
triggers when a mbox is given on the command line, but it has to
rely on the tty-ness of the standard input.  Add an explicit way to
opt out of this safety with a command line option.

* jk/am-retry:
  test-terminal: drop stdin handling
  am: add explicit "--retry" option
2024-06-17 15:55:56 -07:00
Junio C Hamano 40a163f217 Merge branch 'ps/ref-storage-migration'
A new command has been added to migrate a repository that uses the
files backend for its ref storage to use the reftable backend, with
limitations.

* ps/ref-storage-migration:
  builtin/refs: new command to migrate ref storage formats
  refs: implement logic to migrate between ref storage formats
  refs: implement removal of ref storages
  worktree: don't store main worktree twice
  reftable: inline `merged_table_release()`
  refs/files: fix NULL pointer deref when releasing ref store
  refs/files: extract function to iterate through root refs
  refs/files: refactor `add_pseudoref_and_head_entries()`
  refs: allow to skip creation of reflog entries
  refs: pass storage format to `ref_store_init()` explicitly
  refs: convert ref storage format to an enum
  setup: unset ref storage when reinitializing repository version
2024-06-17 15:55:55 -07:00
Patrick Steinhardt f2c32a66f5 oidset: pass hash algorithm when parsing file
The `oidset_parse_file_carefully()` function implicitly depends on
`the_repository` when parsing object IDs. Fix this by having callers
pass in the hash algorithm to use.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:34 -07:00
Patrick Steinhardt e7da938570 global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.

It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.

Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit

For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).

Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
Patrick Steinhardt 7abbca0e74 hash: require hash algorithm in `empty_tree_oid_hex()`
The `empty_tree_oid_hex()` function use `the_repository` to derive the
hash function that shall be used. Require callers to pass in the hash
algorithm to get rid of this implicit dependency.

While at it, remove the unused `empty_blob_oid_hex()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
Patrick Steinhardt 9c34eb93fb hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
Both functions `is_empty_{blob,tree}_oid()` use `the_repository` to
derive the hash function that shall be used. Require callers to pass in
the hash algorithm to get rid of this implicit dependency.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
Patrick Steinhardt 9da95bda74 hash: require hash algorithm in `oidread()` and `oidclr()`
Both `oidread()` and `oidclr()` use `the_repository` to derive the hash
function that shall be used. Require callers to pass in the hash
algorithm to get rid of this implicit dependency.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:32 -07:00
Patrick Steinhardt f4836570a7 hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
Many of our hash functions have two variants, one receiving a `struct
git_hash_algo` and one that derives it via `the_repository`. Adapt all
of those functions to always require the hash algorithm as input and
drop the variants that do not accept one.

As those functions are now independent of `the_repository`, we can move
them from "hash.h" to "hash-ll.h".

Note that both in this and subsequent commits in this series we always
just pass `the_repository->hash_algo` as input even if it is obvious
that there is a repository in the context that we should be using the
hash from instead. This is done to be on the safe side and not introduce
any regressions. All callsites should eventually be amended to use a
repo passed via parameters, but this is outside the scope of this patch
series.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:32 -07:00
Jeff King aecd794fca remote: drop checks for zero-url case
Now that the previous commit removed the possibility that a "struct
remote" will ever have zero url fields, we can drop a number of
redundant checks and untriggerable code paths.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:39 -07:00
Jeff King b68118d2e8 remote: simplify url/pushurl selection
When we want to know the push urls for a remote, there is some simple
logic:

  - if the user configured any remote.*.pushurl keys, then those make
    the complete set of push urls

  - otherwise we push to all urls in remote.*.url

Many spots implement this with a level of indirection, assigning to a
local url/url_nr pair. But since both arrays are now strvecs, we can
just use a pointer to select the appropriate strvec, shortening the code
a bit.

Even though this is now a one-liner, since it is application logic that
is present in so many places, it's worth abstracting a helper function.
In fact, we already have such a function, but it's local to
builtin/push.c. So we'll just make it available everywhere via remote.h.

There are two spots to pay special attention to here:

  1. in builtin/remote.c's get_url(), we are selecting first based on
     push_mode and then falling back to "url" when we're in push_mode
     but no pushurl is defined. The updated code makes that much more
     clear, compared to the original which had an "else" fall-through.

  2. likewise in that file's set_url(), we _only_ respect push_mode,
     sine the point is that we are adding to pushurl in that case
     (whether it is empty or not). And thus it does not use our helper
     function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:38 -07:00
Jeff King 8e804415fd remote: use strvecs to store remote url/pushurl
Now that the url/pushurl fields of "struct remote" own their strings, we
can switch from bare arrays to strvecs. This has a few advantages:

  - push/clear are now one-liners

  - likewise the free+assigns in alias_all_urls() can use
    strvec_replace()

  - we now use size_t for storage, avoiding possible overflow

  - this will enable some further cleanups in future patches

There's quite a bit of fallout in the code that reads these fields, as
it tends to access these arrays directly. But it's mostly a mechanical
replacement of "url_nr" with "url.nr", and "url[i]" with "url.v[i]",
with a few variations (e.g. "*url" could become "*url.v", but I used
"url.v[0]" for consistency).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:38 -07:00
Jeff King 0295ce7cbf archive: fix check for missing url
Running "git archive --remote" checks that we have at least one url for
the remote. It does so by looking at remote.url[0], but that won't work;
if we have no url at all, then remote.url will be NULL, and we'll
segfault.

Check url_nr instead, which is a more direct way of asking what we
want.

You can trigger the segfault like this:

  git -c remote.foo.vcs=bar archive --remote=foo

but I didn't bother adding a test. This is the tip of the iceberg for
no-url remotes, and a later patch will improve that situation. I just
wanted to clean up this bug so it didn't make further refactoring of
this code more confusing.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:37 -07:00
Junio C Hamano 092b33da2b Merge branch 'ps/ref-storage-migration' into ps/use-the-repository
* ps/ref-storage-migration:
  builtin/refs: new command to migrate ref storage formats
  refs: implement logic to migrate between ref storage formats
  refs: implement removal of ref storages
  worktree: don't store main worktree twice
  reftable: inline `merged_table_release()`
  refs/files: fix NULL pointer deref when releasing ref store
  refs/files: extract function to iterate through root refs
  refs/files: refactor `add_pseudoref_and_head_entries()`
  refs: allow to skip creation of reflog entries
  refs: pass storage format to `ref_store_init()` explicitly
  refs: convert ref storage format to an enum
  setup: unset ref storage when reinitializing repository version
2024-06-13 09:39:08 -07:00
Junio C Hamano 51ea70c18a Merge branch 'jk/sparse-leakfix'
Many memory leaks in the sparse-checkout code paths have been
plugged.

* jk/sparse-leakfix:
  sparse-checkout: free duplicate hashmap entries
  sparse-checkout: free string list after displaying
  sparse-checkout: free pattern list in sparse_checkout_list()
  sparse-checkout: free sparse_filename after use
  sparse-checkout: refactor temporary sparse_checkout_patterns
  sparse-checkout: always free "line" strbuf after reading input
  sparse-checkout: reuse --stdin buffer when reading patterns
  dir.c: always copy input to add_pattern()
  dir.c: free removed sparse-pattern hashmap entries
  sparse-checkout: clear patterns when init() sees existing sparse file
  dir.c: free strings in sparse cone pattern hashmaps
  sparse-checkout: pass string literals directly to add_pattern()
  sparse-checkout: free string list in write_cone_to_file()
2024-06-12 13:37:17 -07:00
Junio C Hamano 22cf18fd9e Merge branch 'gt/t-hash-unit-test'
A pair of test helpers that essentially are unit tests on hash
algorithms have been rewritten using the unit-tests framework.

* gt/t-hash-unit-test:
  t/: migrate helper/test-{sha1, sha256} to unit-tests/t-hash
  strbuf: introduce strbuf_addstrings() to repeatedly add a string
2024-06-12 13:37:15 -07:00
Patrick Steinhardt fbf7a46d88 builtin/blame: fix leaking ignore revs files
When parsing the blame configuration we add "blame.ignoreRevsFile"
configs to a string list. This string list is declared as with `NODUP`,
and thus we hand over the allocated string to that list. We eventually
end up calling `string_list_clear()` on that list, but due to it being
declared as `NODUP` we will not release the associated strings and thus
leak memory.

Fix this issue by setting up the list as `DUP` instead and free the
config string after insertion.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
Patrick Steinhardt 3332f35577 builtin/blame: fix leaking prefixed paths
In `cmd_blame()` we compute prefixed paths by calling `add_prefix()`,
which itself calls `prefix_path()`. While `prefix_path()` returns an
allocated string, `add_prefix()` pretends to return a constant string.
Consequently, this path never gets freed.

Fix the return type to be `char *` and free the path to plug the memory
leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
Patrick Steinhardt 44ec7c575f merge: fix leaking merge bases
When calling either the recursive or the ORT merge machineries we need
to provide a list of merge bases. The ownership of that parameter is
then implicitly transferred to the callee, which is somewhat fishy.
Furthermore, that list may leak in some cases where the merge machinery
runs into an error, thus causing a memory leak.

Refactor the code such that we stop transferring ownership. Instead, the
merge machinery will now create its own local copies of the passed in
list as required if they need to modify the list. Free the list at the
callsites as required.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
Patrick Steinhardt 77241a6b5e builtin/merge: fix leaking `struct cmdnames` in `get_strategy()`
In "builtin/merge.c" we use the helper infrastructure to figure out what
merge strategies there are. We never free contents of the `cmdnames`
structures though and thus leak their memory.

Fix this by exposing the already existing `clean_cmdnames()` function to
release their memory. As this name isn't quite idiomatic, rename it to
`cmdnames_release()` while at it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
Patrick Steinhardt 8909d6e1a1 builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()`
In `wanted_peer_refs()` we first create a copy of the "HEAD" ref. This
copy may not actually be passed back to the caller, but is not getting
freed in this case. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
Patrick Steinhardt 63c9bd372e commit: fix leaking parents when calling `commit_tree_extended()`
When creating commits via `commit_tree_extended()`, the caller passes in
a string list of parents. This call implicitly transfers ownership of
that list to the function, which is quite surprising to begin with. But
to make matters worse, `commit_tree_extended()` doesn't even bother to
free the list of parents in error cases. The result is a memory leak,
and one that the caller cannot fix by themselves because they do not
know whether parts of the string list have already been released.

Refactor the code such that callers can keep ownership of the list of
parents, which is getting indicated by parameter being a constant
pointer now. Free the lists at the calling site and add a common exit
path to those sites as required.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
Patrick Steinhardt 748bd0943b builtin/stash: fix leak in `show_stash()`
We leak the `revision_args()` variable. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
Patrick Steinhardt a90a089611 revision: free diff options
There is a todo comment in `release_revisions()` that mentions that we
need to free the diff options, which was added via 54c8a7c379 (revisions
API: add a TODO for diff_free(&revs->diffopt), 2022-04-14). Releasing
the diff options wasn't quite feasible at that time because some call
sites rely on its contents to remain even after the revisions have been
released.

In fact, there really only are a couple of callsites that misbehave
here:

  - `cmd_shortlog()` releases the revisions, but continues to access its
    file pointer.

  - `do_diff_cache()` creates a shallow copy of `struct diff_options`,
    but does not set the `no_free` member. Consequently, we end up
    releasing resources of the caller-provided diff options.

  - `diff_free()` and friends do not play nice when being called
    multiple times as they don't unset data structures that they have
    just released.

Fix all of those cases and enable the call to `diff_free()`, which plugs
a bunch of memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
Patrick Steinhardt a282dbeba7 builtin/log: fix leaking commit list in git-cherry(1)
We're storing the list of commits that git-cherry(1) is about to print
into a temporary list. This list is never getting free'd and thus leaks.
Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
Patrick Steinhardt 3199b22e7d builtin/merge-recursive: fix leaking object ID bases
In `cmd_merge_recursive()` we have a static array of object ID bases
that we pass to `merge_recursive_generic()`. This interface is somewhat
weird though because the latter function accepts a pointer to a pointer
of object IDs, which requires us to allocate the object IDs on the heap.
And as we never free those object IDs, the end result is a leak.

While we can easily solve this leak by just freeing the respective
object IDs, the whole calling convention is somewhat weird. Instead,
refactor `merge_recursive_generic()` to accept a plain pointer to object
IDs so that we can avoid allocating them altogether.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
Patrick Steinhardt 9e903a5531 builtin/difftool: plug memory leaks in `run_dir_diff()`
We're leaking a bunch of memory leaks in `run_dir_diff()`. Plug them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
Patrick Steinhardt f87c55c264 object-name: free leaking object contexts
While it is documented in `struct object_context::path` that this
variable needs to be released by the caller, this fact is rather easy to
miss given that we do not ever provide a function to release the object
context. And of course, while some callers dutifully release the path,
many others don't.

Introduce a new `object_context_release()` function that releases the
path. Convert callsites that used to free the path to use that new
function and add missing calls to callsites that were leaking memory.
Refactor those callsites as required to have a single return path, only.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
Patrick Steinhardt 61f8bb1ec1 builtin/rev-list: fix leaking bitmap index when calculating disk usage
git-rev-list(1) can speed up its object size calculations for reachable
objects via a bitmap walk, if there is any bitmap. This is done in
`try_bitmap_disk_usage()`, which tries to optimistically load the bitmap
and then use it, if available. It never frees it though, leading to a
memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
Patrick Steinhardt afb0653d23 biultin/rev-parse: fix memory leaks in `--parseopt` mode
We have a bunch of memory leaks in git-rev-parse(1)'s `--parseopt` mode.
Refactor the code to use `struct strvec`s to make it easier for us to
track the lifecycle of those leaking variables and then free them.

While at it, remove the unneeded static lifetime for some of the
variables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
Patrick Steinhardt 14da26230a parse-options: fix leaks for users of OPT_FILENAME
The `OPT_FILENAME()` option will, if set, put an allocated string into
the user-provided variable. Consequently, that variable thus needs to be
free'd by the caller of `parse_options()`. Some callsites don't though
and thus leak memory. Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:04 -07:00
Junio C Hamano 5235e56ea5 Merge branch 'jk/leakfixes'
Memory leaks in "git mv" has been plugged.

* jk/leakfixes:
  mv: replace src_dir with a strvec
  mv: factor out empty src_dir removal
  mv: move src_dir cleanup to end of cmd_mv()
  t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL
  t-strvec: use va_end() to match va_start()
2024-06-10 10:30:39 -07:00
Rubén Justo f96c385449 format-patch: assume --cover-letter for diff in multi-patch series
When we deal with a multi-patch series in git-format-patch(1), if we see
`--interdiff` or `--range-diff` but no `--cover-letter`, we return with
an error, saying:

    fatal: --range-diff requires --cover-letter or single patch

or:

    fatal: --interdiff requires --cover-letter or single patch

This makes sense because the cover-letter is where we place the diff
from the previous version.

However, considering that `format-patch` generates a multi-patch as
needed, let's adopt a similar "cover as necessary" approach when using
`--interdiff` or `--range-diff`.

Therefore, relax the requirement for an explicit `--cover-letter` in a
multi-patch series when the user says `--iterdiff` or `--range-diff`.

Still, if only to return the error, respect "format.coverLetter=no" and
`--no-cover-letter`.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 14:02:13 -07:00
Patrick Steinhardt 71e01a0ebd builtin/merge: always store allocated strings in `pull_twohead`
The `pull_twohead` configuration may sometimes contain an allocated
string, and sometimes it may contain a string constant. Refactor this to
instead always store an allocated string such that we can release its
resources without risk.

While at it, manage the lifetime of other config strings, as well. Note
that we explicitly don't free `cleanup_arg` here. This is because the
variable may be assigned a string constant via command line options.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:56 -07:00
Patrick Steinhardt fc06676766 builtin/rebase: always store allocated string in `options.strategy`
The `struct rebase_options::strategy` field is a `char *`, but we do end
up assigning string constants to it in two cases:

  - When being passed a `--strategy=` option via the command line.

  - When being passed a strategy option via `--strategy-option=`, but
    not a strategy.

This will cause warnings once we enable `-Wwrite-strings`.

Ideally, we'd just convert the field to be a `const char *`. But we also
assign to this field via the GIT_TEST_MERGE_ALGORITHM envvar, which we
have to strdup(3P) into it.

Instead, refactor the code to make sure that we only ever assign
allocated strings to this field.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:55 -07:00
Patrick Steinhardt 25a47ffac0 builtin/rebase: do not assign default backend to non-constant field
The `struct rebase_options::default_backend` field is a non-constant
string, but is being assigned a constant via `REBASE_OPTIONS_INIT`.
Fix this by using `xstrdup()` to assign the variable and introduce a new
function `rebase_options_release()` that releases memory held by the
structure, including the newly-allocated variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:55 -07:00
Patrick Steinhardt 5bd0851d97 send-pack: always allocate receive status
In `receive_status()`, we record the reason why ref updates have been
rejected by the remote via the `remote_status`. But while we allocate
the assigned string when a reason was given, we assign a string constant
when no reason was given.

This has been working fine so far due to two reasons:

  - We don't ever free the refs in git-send-pack(1)'

  - Remotes always give a reason, at least as implemented by Git proper.

Adapt the code to always allocate the receive status string and free the
refs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:53 -07:00
Patrick Steinhardt 81654d27bf builtin/remote: cast away constness in `get_head_names()`
In `get_head_names()`, we assign the "refs/heads/*" string constant to
`struct refspec_item::{src,dst}`, which are both non-constant pointers.
Ideally, we'd refactor the code such that both of these fields were
constant. But `struct refspec_item` is used for two different usecases
with conflicting requirements:

  - To query for a source or destination based on the given refspec. The
    caller either sets `src` or `dst` as the branch that we want to
    search for, and the respective other field gets populated. The
    fields should be constant when being used as a query parameter,
    which is owned by the caller, and non-constant when being used as an
    out parameter, which is owned by the refspec item. This is is
    contradictory in itself already.

  - To store refspec items with their respective source and destination
    branches, in which case both fields should be owned by the struct.

Ideally, we'd split up this interface to clearly separate between
querying and storing, which would enable us to clarify lifetimes of the
strings. This would be a much bigger undertaking though.

Instead, accept the status quo for now and cast away the constness of
the source and destination patterns. We know that those are not being
written to or freed, so while this is ugly it certainly is fine for now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:50 -07:00
Patrick Steinhardt 235ac3f81a refspec: remove global tag refspec structure
We have a global tag refspec structure that is used by both git-clone(1)
and git-fetch(1). Initialization of the structure will break once we
enable `-Wwrite-strings`, even though the breakage is harmless. While we
could just add casts, the structure isn't really required in the first
place as we can simply initialize the structures at the respective
callsites.

Refactor the code accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:49 -07:00
Patrick Steinhardt b567004b4b global: improve const correctness when assigning string constants
We're about to enable `-Wwrite-strings`, which changes the type of
string constants to `const char[]`. Fix various sites where we assign
such constants to non-const variables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:48 -07:00
Karthik Nayak 7dd4051b01 update-ref: add support for 'symref-update' command
Add 'symref-update' command to the '--stdin' mode of 'git-update-ref' to
allow updates of symbolic refs. The 'symref-update' command takes in a
<new-target>, which the <ref> will be updated to. If the <ref> doesn't
exist it will be created.

It also optionally takes either an `ref <old-target>` or `oid
<old-oid>`. If the <old-target> is provided, it checks to see if the
<ref> targets the <old-target> before the update. If <old-oid> is provided
it checks <ref> to ensure that it is a regular ref and <old-oid> is the
OID before the update. This by extension also means that this when a
zero <old-oid> is provided, it ensures that the ref didn't exist before.

The divergence in syntax from the regular `update` command is because if
we don't use a `(ref | oid)` prefix for the old_value, then there is
ambiguity around if the value provided should be treated as an oid or a
reference. This is more so the reason, because we allow anything
committish to be provided as an oid. While 'symref-verify' and
'symref-delete' also take in `<old-target>` we do not have this
divergence there as those commands only work with symrefs. Whereas
'symref-update' also works with regular refs and allows users to convert
regular refs to symrefs.

The command allows users to perform symbolic ref updates within a
transaction. This provides atomicity and allows users to perform a set
of operations together.

This command supports deref mode, to ensure that we can update
dereferenced regular refs to symrefs.

Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:45 -07:00
Karthik Nayak ed3272720e update-ref: add support for 'symref-create' command
Add 'symref-create' command to the '--stdin' mode 'git-update-ref' to
allow creation of symbolic refs in a transaction. The 'symref-create'
command takes in a <new-target>, which the created <ref> will point to.

Also, support the 'core.prefersymlinkrefs' config, wherein if the config
is set and the filesystem supports symlinks, we create the symbolic ref
as a symlink. We fallback to creating a regular symref if creating the
symlink is unsuccessful.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:45 -07:00
Karthik Nayak 2343720967 update-ref: add support for 'symref-delete' command
Add a new command 'symref-delete' to allow deletions of symbolic refs in
a transaction via the '--stdin' mode of the 'git-update-ref' command.
The 'symref-delete' command can, when given an <old-target>, delete the
provided <ref> only when it points to <old-target>.

This command is only compatible with the 'no-deref' mode because we
optionally want to check the 'old_target' of the ref being deleted.
De-referencing a symbolic ref would provide a regular ref and we already
have the 'delete' command for regular refs.

While users can also use 'git symbolic-ref -d' to delete symbolic refs,
the 'symref-delete' command in 'git-update-ref' allows users to do so
within a transaction, which promises atomicity of the operation and can
be batched with other commands.

When no 'old_target' is provided it can also delete regular refs,
similar to how the 'delete' command can delete symrefs when no 'old_oid'
is provided.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:44 -07:00
Karthik Nayak 1451ac734f update-ref: add support for 'symref-verify' command
The 'symref-verify' command allows users to verify if a provided <ref>
contains the provided <old-target> without changing the <ref>. If
<old-target> is not provided, the command will verify that the <ref>
doesn't exist.

The command allows users to verify symbolic refs within a transaction,
and this means users can perform a set of changes in a transaction only
when the verification holds good.

Since we're checking for symbolic refs, this command will only work with
the 'no-deref' mode. This is because any dereferenced symbolic ref will
point to an object and not a ref and the regular 'verify' command can be
used in such situations.

Add required tests for symref support in 'verify'. Since we're here,
also add reflog checks for the pre-existing 'verify' tests, there is no
divergence from behavior, but we never tested to ensure that reflog
wasn't affected by the 'verify' command.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:44 -07:00
Junio C Hamano df5c2c4962 Merge branch 'rs/difftool-env-simplify'
Code simplification.

* rs/difftool-env-simplify:
  difftool: add env vars directly in run_file_diff()
2024-06-06 12:49:24 -07:00
Junio C Hamano cf792653ad Merge branch 'ps/leakfixes'
Leakfixes.

* ps/leakfixes:
  builtin/mv: fix leaks for submodule gitfile paths
  builtin/mv: refactor to use `struct strvec`
  builtin/mv duplicate string list memory
  builtin/mv: refactor `add_slash()` to always return allocated strings
  strvec: add functions to replace and remove strings
  submodule: fix leaking memory for submodule entries
  commit-reach: fix memory leak in `ahead_behind()`
  builtin/credential: clear credential before exit
  config: plug various memory leaks
  config: clarify memory ownership in `git_config_string()`
  builtin/log: stop using globals for format config
  builtin/log: stop using globals for log config
  convert: refactor code to clarify ownership of check_roundtrip_encoding
  diff: refactor code to clarify memory ownership of prefixes
  config: clarify memory ownership in `git_config_pathname()`
  http: refactor code to clarify memory ownership
  checkout: clarify memory ownership in `unique_tracking_name()`
  strbuf: fix leak when `appendwholeline()` fails with EOF
  transport-helper: fix leaking helper name
2024-06-06 12:49:23 -07:00
Jeff King 53ce2e3f0a am: add explicit "--retry" option
After a patch fails, you can ask "git am" to try applying it again with
new options by running without any of the resume options. E.g.:

  git am <patch
  # oops, it failed; let's try again
  git am --3way

But since this second command has no explicit resume option (like
"--continue"), it looks just like an invocation to read a fresh patch
from stdin. To avoid confusing the two cases, there are some heuristics,
courtesy of 8d18550318 (builtin-am: reject patches when there's a
session in progress, 2015-08-04):

	if (in_progress) {
		/*
		 * Catch user error to feed us patches when there is a session
		 * in progress:
		 *
		 * 1. mbox path(s) are provided on the command-line.
		 * 2. stdin is not a tty: the user is trying to feed us a patch
		 *    from standard input. This is somewhat unreliable -- stdin
		 *    could be /dev/null for example and the caller did not
		 *    intend to feed us a patch but wanted to continue
		 *    unattended.
		 */
		if (argc || (resume_mode == RESUME_FALSE && !isatty(0)))
			die(_("previous rebase directory %s still exists but mbox given."),
				state.dir);

		if (resume_mode == RESUME_FALSE)
			resume_mode = RESUME_APPLY;
		[...]

So if no resume command is given, then we require that stdin be a tty,
and otherwise complain about (potentially) receiving an mbox on stdin.
But of course you might not actually have a terminal available! And
sadly there is no explicit way to hit this same code path; this is the
only place that sets RESUME_APPLY. So you're stuck, and scripts like our
test suite have to bend over backwards to create a pseudo-tty.

Let's provide an explicit option to trigger this mode. The code turns
out to be quite simple; just setting "resume_mode" to RESUME_FALSE is
enough to dodge the tty check, and then our state is the same as it
would be with the heuristic case (which we'll continue to allow).

When we don't have a session in progress, there's already code to
complain when resume_mode is set (but we'll add a new test to cover
that).

To test the new option, we'll convert the existing tests that rely on
the fake stdin tty. That lets us test them on more platforms, and will
let us simplify test_terminal a bit in a future patch.

It does, however, mean we're not testing the tty heuristic at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 10:07:41 -07:00
Patrick Steinhardt 25a0023f28 builtin/refs: new command to migrate ref storage formats
Introduce a new command that allows the user to migrate a repository
between ref storage formats. This new command is implemented as part of
a new git-refs(1) executable. This is due to two reasons:

  - There is no good place to put the migration logic in existing
    commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is
    not the correct place to put it, either.

  - I had it in my mind to create a new low-level command for accessing
    refs for quite a while already. git-refs(1) is that command and can
    over time grow more functionality relating to refs. This should help
    discoverability by consolidating low-level access to refs into a
    single executable.

As mentioned in the preceding commit that introduces the ref storage
format migration logic, the new `git refs migrate` command still has a
bunch of restrictions. These restrictions are documented accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:34 -07:00
Patrick Steinhardt 318efb966b refs: convert ref storage format to an enum
The ref storage format is tracked as a simple unsigned integer, which
makes it harder than necessary to discover what that integer actually is
or where its values are defined.

Convert the ref storage format to instead be an enum.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:31 -07:00
Junio C Hamano a74c0686fa add-i: finally retire add.interactive.useBuiltin
The configuration variable stopped doing anything (other than
announcing itself as a variable that does not do anything useful,
when it is used) in Git 2.40.

At this point, it is not even worth giving the warning, which was
meant to be a way to help users notice they are carrying unused
cruft in their configuration files and give them a chance to
clean-up.

Let's remove the warning and documentation for it, and truly stop
paying attention to it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
               ---
 Documentation/config/add.txt |  6 ------
 builtin/add.c                |  6 +-----
 t/t3701-add-interactive.sh   | 15 ---------------
 3 files changed, 1 insertion(+), 26 deletions(-)
2024-06-05 14:53:26 -07:00
Jeff King 6d107751b2 sparse-checkout: free duplicate hashmap entries
In insert_recursive_pattern(), we create a new pattern_entry to insert
into the parent_hashmap. If we find that the same entry already exists
in the hashmap, we skip adding the new one. But we forget to free the new
one, creating a leak.

We can fix it by cleaning up the discarded entry. It would probably be
possible to avoid creating it in the first place, but it's non-trivial.
We'd have to define a "keydata" struct that lets us compare the existing
entries to the broken-out fields. It's probably not worth the
complexity, so we'll punt on that for now.

There is one subtlety here: our insertion is happening in a loop, with
each iteration looking at the pattern we just inserted (hence the
"recursive" in the name). So if we skip insertion, what do we look at?

The obvious answer is that we should remember the existing duplicate we
found and use that. But I _think_ in that case, we probably already have
all of the recursive bits already (from when the original entry was
added). And so just breaking out of the loop would be correct. But I'm
not 100% sure on that; after all, the original leaky code could have
done the same break, but it didn't.

So I went with the "obvious answer" above, which has no chance of
changing the behavior aside from fixing the leak.

With this patch, t1091 can now be marked leak-free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
Jeff King a544b7da2c sparse-checkout: free string list after displaying
In sparse_checkout_list(), we put the hashmap entries into a string_list
so we can sort them. But after printing, we forget to free the list.

This patch drops 5 leaks from t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
Jeff King 521e04e6e8 sparse-checkout: free pattern list in sparse_checkout_list()
In sparse_checkout_list(), we create a pattern_list that needs to
eventually be cleared. We remember to do so in the regular code path,
but the cone-mode path does an early return, and forgets to clean up.

We could fix the leak by adding a new call to clear_pattern_list(). But
we can simplify even further by just skipping the early return, pushing
the other code path (which consists now of only one line!) into an else
block. That also matches the same cone/non-cone if/else used in some
other functions.

This fixes 15 leaks found in t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
Jeff King 008f59d2d6 sparse-checkout: free sparse_filename after use
We allocate a heap buffer via get_sparse_checkout_filename(). Most calls
remember to free it, but sparse_checkout_init() forgets to, causing a
leak. Ironically, it remembers to do so in the error return paths, but
not in the path that makes it all the way to the function end!

Fixing this clears up 6 leaks from t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
Jeff King a14d49ca84 sparse-checkout: refactor temporary sparse_checkout_patterns
In update_working_directory(), we take in a pattern_list, attach it to
the repository index by assigning it to index->sparse_checkout_patterns,
and then call unpack_trees. Afterwards, we remove it by setting
index->sparse_checkout_patterns back to NULL.

But there are two possible leaks here:

  1. If the index already had a populated sparse_checkout_patterns,
     we've obliterated it. We can fix this by saving and restoring it,
     rather than always setting it back to NULL.

  2. We may call the function with a NULL pattern_list, expecting it to
     use the on-disk sparse file. In that case, the index routines will
     lazy-load the sparse patterns automatically. But now at the end of
     the function when we restore the patterns, we'll leak those
     lazy-loaded ones!

     We can fix this by freeing the pattern list before overwriting its
     pointer whenever it does not match what was passed in (in practice
     this should only happen when the passed-in list is NULL, but this
     is erring on the defensive side).

Together these remove 48 indirect leaks found in t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
Jeff King d765fa0331 sparse-checkout: always free "line" strbuf after reading input
In add_patterns_from_input(), we may read lines from a file with a loop
like this:

  while (!strbuf_getline(&line, file)) {
	...
	strbuf_to_cone_pattern(&line, pl);
  }
  /* we don't strbuf_release(&line) here! */

This generally is OK because strbuf_to_cone_pattern() consumes the
buffer via strbuf_detach(). But we can leak in a few cases:

  1. We don't always consume the buffer! If the line ends up empty after
     trimming, we leave strbuf_to_cone_pattern() without detaching. In
     most cases this is OK, because a subsequent getline() call will use
     the same buffer. But if you had an empty line at the end of file,
     for example, it would leak.

  2. Even if strbuf_to_cone_pattern() always consumed the buffer,
     there's a subtle issue with strbuf_getline(). As we saw in
     94e2aa555e (strbuf: fix leak when `appendwholeline()` fails with
     EOF, 2024-05-27), it's possible for it to return EOF with an
     allocated buffer (e.g., if the underlying getdelim() call saw an
     error). So we should always strbuf_release() after finishing a read
     loop like this.

Note that even the code to read patterns from argv has the same problem.
Because that also uses strbuf_to_cone_pattern(), we stuff each argv
entry into a strbuf. It uses the same "line" strbuf as the getline code,
but we should position the strbuf_release() to cover both code paths.

This fixes at least 9 leaks found in t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
Jeff King c3324649ed sparse-checkout: reuse --stdin buffer when reading patterns
When we read patterns from --stdin, we loop on strbuf_getline(), and
detach each line we read to pass into add_pattern(). This used to be
necessary because add_pattern() required that the pattern strings remain
valid while the pattern_list was in use. But it also created a leak,
since we didn't record the detached buffers anywhere else.

Now that add_pattern() has been modified to make its own copy of the
strings, we can stop detaching and fix the leak. This fixes 4 leaks
detected in t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:42 -07:00
Junio C Hamano 607c3d372e show-ref: introduce --branches and deprecate --heads
We call the tips of branches "heads", but this command calls the
option to show only branches "--heads", which confuses the branches
themselves and the tips of branches.

Straighten the terminology by introducing "--branches" option that
limits the output to branches, and deprecate "--heads" option used
that way.

We do not plan to remove "--heads" or "-h" yet; we may want to do so
at Git 3.0, in which case, we may need to start advertising upcoming
removal with an extra warning when they are used.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 15:07:08 -07:00
Junio C Hamano b773fb8822 ls-remote: introduce --branches and deprecate --heads
We call the tips of branches "heads", but this command calls the
option to show only branches "--heads", which confuses the branches
themselves and the tips of branches.

Straighten the terminology by introducing "--branches" option that
limits the output to branches, and deprecate "--heads" option used
that way.

We do not plan to remove "--heads" or "-h" yet; we may want to do so
at Git 3.0, in which case, we may need to start advertising upcoming
removal with an extra warning when they are used.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 15:07:08 -07:00
Junio C Hamano a096e70c78 refs: call branches branches
These things in refs/heads/ hierarchy are called "branches" in human
parlance.  Replace REF_HEADS with REF_BRANCHES to make it clearer.

No end-user visible change intended at this step.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 15:07:08 -07:00
Jeff King db83b64cda sparse-checkout: clear patterns when init() sees existing sparse file
In sparse_checkout_init(), we first try to load patterns from an
existing file. If we found any, we return immediately, but end up
leaking the patterns we parsed. Fixing this reduces the number of leaks
in t7002 from 9 down to 5.

Note that there are two other exits from the function, but they don't
need the same treatment:

  - if we can't resolve HEAD, we write out a hard-coded sparse file and
    return. But we know the pattern list is empty there, since we didn't
    find any in the on-disk file and we haven't yet added any of our
    own.

  - otherwise, we do populate the list and then tail-call into
    write_patterns_and_update(). But that function frees the
    pattern_list itself, so we don't need to.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 10:38:23 -07:00
Jeff King 4d7f95ed1f sparse-checkout: pass string literals directly to add_pattern()
The add_pattern() function takes a pattern string, but neither makes a
copy of it nor takes ownership of the memory. So it is the caller's
responsibility to make sure the string hangs around as long as the
pattern_list which references it.

There are a few cases in sparse-checkout where we use string literal
patterns by stuffing them into a strbuf, detaching the buffer, and then
passing the result into add_pattern(). This creates a leak when the
pattern_list is eventually cleared, since we don't retain a copy of the
detached buffer to free.

But we can observe that the whole strbuf dance is unnecessary. The point
was presumably[1] to satisfy the lifetime requirement of the string. But
string literals have static duration; we can count on them lasting for
the whole program.

So we can fix the leak by just passing them directly. And as a bonus,
that simplifies the code. The leaks can be seen in t7002, which drops
from 25 leaks to 22 with this patch. It also makes t3602 and t1090
leak-free.

In the long run, we will also want to clean up this (undocumented!)
memory lifetime requirement of add_pattern(). But that can come in a
later patch; passing the string literals directly will be the right
thing either way.

[1] The code in question comes from 416adc8711 (sparse-checkout: update
    working directory in-process for 'init', 2019-11-21) and 99dfa6f970
    (sparse-checkout: use in-process update for disable subcommand,
    2019-11-21), but I didn't see anything in their commit messages or
    on the list explaining the strbufs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 10:38:23 -07:00
Jeff King 2181fe6e46 sparse-checkout: free string list in write_cone_to_file()
We use a string list to hold sorted and de-duped patterns, but don't
free it before leaving the function, causing a leak.

This drops the number of leaks found in t7002 from 27 to 25.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 10:38:22 -07:00
Junio C Hamano 03b0e7d3a7 Merge branch 'ps/leakfixes' into ps/leakfixes-more
* ps/leakfixes:
  builtin/mv: fix leaks for submodule gitfile paths
  builtin/mv: refactor to use `struct strvec`
  builtin/mv duplicate string list memory
  builtin/mv: refactor `add_slash()` to always return allocated strings
  strvec: add functions to replace and remove strings
  submodule: fix leaking memory for submodule entries
  commit-reach: fix memory leak in `ahead_behind()`
  builtin/credential: clear credential before exit
  config: plug various memory leaks
  config: clarify memory ownership in `git_config_string()`
  builtin/log: stop using globals for format config
  builtin/log: stop using globals for log config
  convert: refactor code to clarify ownership of check_roundtrip_encoding
  diff: refactor code to clarify memory ownership of prefixes
  config: clarify memory ownership in `git_config_pathname()`
  http: refactor code to clarify memory ownership
  checkout: clarify memory ownership in `unique_tracking_name()`
  strbuf: fix leak when `appendwholeline()` fails with EOF
  transport-helper: fix leaking helper name
2024-06-03 13:08:33 -07:00
Junio C Hamano f8da12adcf Merge branch 'jc/fix-2.45.1-and-friends-for-maint'
Adjust jc/fix-2.45.1-and-friends-for-2.39 for more recent
maintenance track.

* jc/fix-2.45.1-and-friends-for-maint:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-30 14:15:17 -07:00
Junio C Hamano 6c5be97e4e Merge branch 'jc/undecided-is-not-necessarily-sha1-fix'
The base topic started to make it an error for a command to leave
the hash algorithm unspecified, which revealed a few commands that
were not ready for the change.  Give users a knob to revert back to
the "default is sha-1" behaviour as an escape hatch, and start
fixing these breakages.

* jc/undecided-is-not-necessarily-sha1-fix:
  apply: fix uninitialized hash function
  builtin/hash-object: fix uninitialized hash function
  builtin/patch-id: fix uninitialized hash function
  t1517: test commands that are designed to be run outside repository
  setup: add an escape hatch for "no more default hash algorithm" change
2024-05-30 14:15:14 -07:00
Junio C Hamano 988499e295 Merge branch 'ps/refs-without-the-repository-updates'
Further clean-up the refs subsystem to stop relying on
the_repository, and instead use the repository associated to the
ref_store object.

* ps/refs-without-the-repository-updates:
  refs/packed: remove references to `the_hash_algo`
  refs/files: remove references to `the_hash_algo`
  refs/files: use correct repository
  refs: remove `dwim_log()`
  refs: drop `git_default_branch_name()`
  refs: pass repo when peeling objects
  refs: move object peeling into "object.c"
  refs: pass ref store when detecting dangling symrefs
  refs: convert iteration over replace refs to accept ref store
  refs: retrieve worktree ref stores via associated repository
  refs: refactor `resolve_gitlink_ref()` to accept a repository
  refs: pass repo when retrieving submodule ref store
  refs: track ref stores via strmap
  refs: implement releasing ref storages
  refs: rename `init_db` callback to avoid confusion
  refs: adjust names for `init` and `init_db` callbacks
2024-05-30 14:15:13 -07:00
Junio C Hamano a60c21b720 Merge branch 'ps/undecided-is-not-necessarily-sha1'
Before discovering the repository details, We used to assume SHA-1
as the "default" hash function, which has been corrected. Hopefully
this will smoke out codepaths that rely on such an unwarranted
assumptions.

* ps/undecided-is-not-necessarily-sha1:
  repository: stop setting SHA1 as the default object hash
  oss-fuzz/commit-graph: set up hash algorithm
  builtin/shortlog: don't set up revisions without repo
  builtin/diff: explicitly set hash algo when there is no repo
  builtin/bundle: abort "verify" early when there is no repository
  builtin/blame: don't access potentially unitialized `the_hash_algo`
  builtin/rev-parse: allow shortening to more than 40 hex characters
  remote-curl: fix parsing of detached SHA256 heads
  attr: fix BUG() when parsing attrs outside of repo
  attr: don't recompute default attribute source
  parse-options-cb: only abbreviate hashes when hash algo is known
  path: move `validate_headref()` to its only user
  path: harden validation of HEAD with non-standard hashes
2024-05-30 14:15:11 -07:00
Phillip Wood 0c26738aa4 rebase -i: pass struct replay_opts to parse_insn_line()
This new parameter will be used in the next commit. As adding the
parameter requires quite a few changes to plumb it through the call
chain these are separated into their own commit to avoid cluttering up
the next commit with incidental changes.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 10:02:56 -07:00
Jeff King 64f8502b40 mv: replace src_dir with a strvec
We manually manage the src_dir array with ALLOC_GROW. Using a strvec is
a little more ergonomic, and makes the memory ownership more clear. It
does mean that we copy the strings (which were otherwise just pointers
into the "sources" strvec), but using the same rationale as 9fcd9e4e72
(builtin/mv duplicate string list memory, 2024-05-27), it's just not
enough to be worth worrying about here.

As a bonus, this gets rid of some "int"s used for allocation management
(though in practice these were limited to command-line sizes and thus
not overflowable).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 08:55:29 -07:00
Jeff King d58a687705 mv: factor out empty src_dir removal
This pulls the loop added by b6f51e3db9 (mv: cleanup empty
WORKING_DIRECTORY, 2022-08-09) into a sub-function. That reduces clutter
in cmd_mv() and makes it easier to see that the lifetime of the
a_src_dir strbuf is limited to this code (and thus its cleanup doesn't
need to go after the "out" label).

Another option would be to just declare the strbuf inside the loop,
since it is only used there. But this refactor retains the existing
property that we can reuse the allocated buffer for each iteration of
the loop. That optimization is probably overkill, but I think the
sub-function is more readable anyway, and then keeping the optimization
is basically free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 08:55:29 -07:00
Jeff King cc65e085e4 mv: move src_dir cleanup to end of cmd_mv()
Commit b6f51e3db9 (mv: cleanup empty WORKING_DIRECTORY, 2022-08-09)
added an auxiliary array where we store directory arguments that we see
while processing the incoming arguments. After actually moving things,
we then use that array to remove now-empty directories, and then
immediately free the array.

But if the actual move queues any errors in only_match_skip_worktree,
that can cause us to jump straight to the "out" label to clean up,
skipping the free() and leaking the array.

Let's push the free() down past the "out" label so that we always clean
up (the array is initialized to NULL, so this is always safe). We'll
hold on to the memory a little longer than necessary, but clarity is
more important than micro-optimizing here.

Note that the adjacent "a_src_dir" strbuf does not suffer the same
problem; it is only allocated during the removal step.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 08:55:29 -07:00
Junio C Hamano a3f0e2a064 Merge branch 'ps/leakfixes' into jk/leakfixes
* ps/leakfixes:
  builtin/mv: fix leaks for submodule gitfile paths
  builtin/mv: refactor to use `struct strvec`
  builtin/mv duplicate string list memory
  builtin/mv: refactor `add_slash()` to always return allocated strings
  strvec: add functions to replace and remove strings
  submodule: fix leaking memory for submodule entries
  commit-reach: fix memory leak in `ahead_behind()`
  builtin/credential: clear credential before exit
  config: plug various memory leaks
  config: clarify memory ownership in `git_config_string()`
  builtin/log: stop using globals for format config
  builtin/log: stop using globals for log config
  convert: refactor code to clarify ownership of check_roundtrip_encoding
  diff: refactor code to clarify memory ownership of prefixes
  config: clarify memory ownership in `git_config_pathname()`
  http: refactor code to clarify memory ownership
  checkout: clarify memory ownership in `unique_tracking_name()`
  strbuf: fix leak when `appendwholeline()` fails with EOF
  transport-helper: fix leaking helper name
2024-05-30 08:54:58 -07:00
Junio C Hamano 5529cba09f Merge branch 'ps/leakfixes' into ps/no-writable-strings
* ps/leakfixes:
  builtin/mv: fix leaks for submodule gitfile paths
  builtin/mv: refactor to use `struct strvec`
  builtin/mv duplicate string list memory
  builtin/mv: refactor `add_slash()` to always return allocated strings
  strvec: add functions to replace and remove strings
  submodule: fix leaking memory for submodule entries
  commit-reach: fix memory leak in `ahead_behind()`
  builtin/credential: clear credential before exit
  config: plug various memory leaks
  config: clarify memory ownership in `git_config_string()`
  builtin/log: stop using globals for format config
  builtin/log: stop using globals for log config
  convert: refactor code to clarify ownership of check_roundtrip_encoding
  diff: refactor code to clarify memory ownership of prefixes
  config: clarify memory ownership in `git_config_pathname()`
  http: refactor code to clarify memory ownership
  checkout: clarify memory ownership in `unique_tracking_name()`
  strbuf: fix leak when `appendwholeline()` fails with EOF
  transport-helper: fix leaking helper name
2024-05-29 09:32:24 -07:00
Ghanshyam Thakkar a70f8f19ad strbuf: introduce strbuf_addstrings() to repeatedly add a string
In a following commit we are going to port code from
"t/helper/test-sha256.c", t/helper/test-hash.c and "t/t0015-hash.sh" to
a new "t/unit-tests/t-hash.c" file using the recently added unit test
framework.

To port code like: perl -e "$| = 1; print q{aaaaaaaaaa} for 1..100000;"
we are going to need a new strbuf_addstrings() function that repeatedly
adds the same string a number of times to a buffer.

Such a strbuf_addstrings() function would already be useful in
"json-writer.c" and "builtin/submodule-helper.c" as both of these files
already have code that repeatedly adds the same string. So let's
introduce such a strbuf_addstrings() function in "strbuf.{c,h}" and use
it in both "json-writer.c" and "builtin/submodule-helper.c".

We use the "strbuf_addstrings" name as this way strbuf_addstr() and
strbuf_addstrings() would be similar for strings as strbuf_addch() and
strbuf_addchars() for characters.

Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Co-authored-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-29 09:09:39 -07:00
Junio C Hamano b32f298264 Merge branch 'jc/format-patch-more-aggressive-range-diff'
The default "creation-factor" used by "git format-patch" has been
raised to make it more aggressively find matching commits.

* jc/format-patch-more-aggressive-range-diff:
  format-patch: run range-diff with larger creation-factor
2024-05-28 11:17:10 -07:00
Junio C Hamano ee8537ebc9 Merge branch 'tb/pack-bitmap-write-cleanups'
The pack bitmap code saw some clean-up to prepare for a follow-up topic.

* tb/pack-bitmap-write-cleanups:
  pack-bitmap: introduce `bitmap_writer_free()`
  pack-bitmap-write.c: avoid uninitialized 'write_as' field
  pack-bitmap: drop unused `max_bitmaps` parameter
  pack-bitmap: avoid use of static `bitmap_writer`
  pack-bitmap-write.c: move commit_positions into commit_pos fields
  object.h: add flags allocated by pack-bitmap.h
2024-05-28 11:17:07 -07:00
Junio C Hamano 00ffa1cb1c Merge branch 'ps/builtin-config-cleanup'
Code clean-up to reduce inter-function communication inside
builtin/config.c done via the use of global variables.

* ps/builtin-config-cleanup: (21 commits)
  builtin/config: pass data between callbacks via local variables
  builtin/config: convert flags to a local variable
  builtin/config: track "fixed value" option via flags only
  builtin/config: convert `key` to a local variable
  builtin/config: convert `key_regexp` to a local variable
  builtin/config: convert `regexp` to a local variable
  builtin/config: convert `value_pattern` to a local variable
  builtin/config: convert `do_not_match` to a local variable
  builtin/config: move `respect_includes_opt` into location options
  builtin/config: move default value into display options
  builtin/config: move type options into display options
  builtin/config: move display options into local variables
  builtin/config: move location options into local variables
  builtin/config: refactor functions to have common exit paths
  config: make the config source const
  builtin/config: check for writeability after source is set up
  builtin/config: move actions into `cmd_config_actions()`
  builtin/config: move legacy options into `cmd_config()`
  builtin/config: move subcommand options into `cmd_config()`
  builtin/config: move legacy mode into its own function
  ...
2024-05-28 11:17:07 -07:00
Junio C Hamano 16a592f132 Merge branch 'ps/pseudo-ref-terminology'
Terminology to call various ref-like things are getting
straightened out.

* ps/pseudo-ref-terminology:
  refs: refuse to write pseudorefs
  ref-filter: properly distinuish pseudo and root refs
  refs: pseudorefs are no refs
  refs: classify HEAD as a root ref
  refs: do not check ref existence in `is_root_ref()`
  refs: rename `is_special_ref()` to `is_pseudo_ref()`
  refs: rename `is_pseudoref()` to `is_root_ref()`
  Documentation/glossary: define root refs as refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: redefine pseudorefs as special refs
2024-05-28 11:17:06 -07:00
Patrick Steinhardt ebdbefa4fe builtin/mv: fix leaks for submodule gitfile paths
Similar to the preceding commit, we have effectively given tracking
memory ownership of submodule gitfile paths. Refactor the code to start
tracking allocated strings in a separate `struct strvec` such that we
can easily plug those leaks. Mark now-passing tests as leak free.

Note that ideally, we wouldn't require two separate data structures to
track those paths. But we do need to store `NULL` pointers for the
gitfile paths such that we can indicate that its corresponding entries
in the other arrays do not have such a path at all. And given that
`struct strvec`s cannot store `NULL` pointers we cannot use them to
store this information.

There is another small gotcha that is easy to miss: you may be wondering
why we don't want to store `SUBMODULE_WITH_GITDIR` in the strvec. This
is because this is a mere sentinel value and not actually a string at
all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:03 -07:00
Patrick Steinhardt 52a7dab439 builtin/mv: refactor to use `struct strvec`
Memory allocation patterns in git-mv(1) are extremely hard to follow:
We copy around string pointers into manually-managed arrays, some of
which alias each other, but only sometimes, while we also drop some of
those strings at other times without ever daring to free them.

While this may be my own subjective feeling, it seems like others have
given up as the code has multiple calls to `UNLEAK()`. These are not
sufficient though, and git-mv(1) is still leaking all over the place
even with them.

Refactor the code to instead track strings in `struct strvec`. While
this has the effect of effectively duplicating some of the strings
without an actual need, it is way easier to reason about and fixes all
of the aliasing of memory that has been going on. It allows us to get
rid of the `UNLEAK()` calls and also fixes leaks that those calls did
not paper over.

Mark tests which are now leak-free accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:02 -07:00
Patrick Steinhardt 9fcd9e4e72 builtin/mv duplicate string list memory
makes the next patch easier, where we will migrate to the paths being
owned by a strvec. given that we are talking about command line
parameters here it's also not like we have tons of allocations that this
would save

while at it, fix a memory leak

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:02 -07:00
Patrick Steinhardt 3d231f7b82 builtin/mv: refactor `add_slash()` to always return allocated strings
The `add_slash()` function will only conditionally return an allocated
string when the passed-in string did not yet have a trailing slash. This
makes the memory ownership harder to track than really necessary.

It's dubious whether this optimization really buys us all that much. The
number of times we execute this function is bounded by the number of
arguments to git-mv(1), so in the typical case we may end up saving an
allocation or two.

Simplify the code to unconditionally return allocated strings.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:02 -07:00
Patrick Steinhardt 96c1655095 builtin/credential: clear credential before exit
We never release memory associated with `struct credential`. Fix this
and mark the corresponding test as leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:01 -07:00
Patrick Steinhardt 1b261c20ed config: clarify memory ownership in `git_config_string()`
The out parameter of `git_config_string()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:00 -07:00
Patrick Steinhardt 83024d98f7 builtin/log: stop using globals for format config
This commit does the exact same as the preceding commit, only for the
format configuration instead of the log configuration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:00 -07:00
Patrick Steinhardt 106a54aecb builtin/log: stop using globals for log config
We're using global variables to store the log configuration. Many of
these can be set both via the command line and via the config, and
depending on how they are being set, they may contain allocated strings.
This leads to hard-to-track memory ownership and memory leaks.

Refactor the code to instead use a `struct log_config` that is being
allocated on the stack. This allows us to more clearly scope the
variables, track memory ownership and ultimately release the memory.

This also prepares us for a change to `git_config_string()`, which will
be adapted to have a `char **` out parameter instead of `const char **`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:59 -07:00
Patrick Steinhardt 6073b3b5c3 config: clarify memory ownership in `git_config_pathname()`
The out parameter of `git_config_pathname()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:59 -07:00
Patrick Steinhardt cc395d6b47 checkout: clarify memory ownership in `unique_tracking_name()`
The function `unique_tracking_name()` returns an allocated string, but
does not clearly indicate this because its return type is `const char *`
instead of `char *`. This has led to various callsites where we never
free its returned memory at all, which causes memory leaks.

Plug those leaks and mark now-passing tests as leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:58 -07:00
René Scharfe 36d900d2b0 difftool: add env vars directly in run_file_diff()
Add the environment variables of the child process directly using
strvec_push() instead of building an array out of them and then adding
that using strvec_pushv().  The new code is shorter and avoids magic
array index values and fragile array padding.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 08:55:59 -07:00
Junio C Hamano d36cc0d5a4 Merge branch 'fixes/2.45.1/2.44' into jc/fix-2.45.1-and-friends-for-maint
* fixes/2.45.1/2.44:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:59:12 -07:00
Junio C Hamano 863c0ed71e Merge branch 'fixes/2.45.1/2.43' into fixes/2.45.1/2.44
* fixes/2.45.1/2.43:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:58:35 -07:00
Junio C Hamano 3c562ef2e6 Merge branch 'fixes/2.45.1/2.42' into fixes/2.45.1/2.43
* fixes/2.45.1/2.42:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:58:11 -07:00
Junio C Hamano 73339e4dc2 Merge branch 'fixes/2.45.1/2.41' into fixes/2.45.1/2.42
* fixes/2.45.1/2.41:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:57:43 -07:00
Junio C Hamano 4f215d214f Merge branch 'fixes/2.45.1/2.40' into fixes/2.45.1/2.41
* fixes/2.45.1/2.40:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:57:02 -07:00
Junio C Hamano 48440f60a7 Merge branch 'jc/fix-2.45.1-and-friends-for-2.39' into fixes/2.45.1/2.40
Revert overly aggressive "layered defence" that went into 2.45.1
and friends, which broke "git-lfs", "git-annex", and other use
cases, so that we can rebuild necessary counterparts in the open.

* jc/fix-2.45.1-and-friends-for-2.39:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 12:29:36 -07:00
Taylor Blau 4722e06edc pack-bitmap: move some initialization to `bitmap_writer_init()`
The pack-bitmap-writer machinery uses a oidmap (backed by khash.h) to
map from commits selected for bitmaps (by OID) to a bitmapped_commit
structure (containing the bitmap itself, among other things like its XOR
offset, etc.)

This map was initialized at the end of `bitmap_writer_build()`. New
entries are added in `pack-bitmap-write.c::store_selected()`, which is
called by the bitmap_builder machinery (which is responsible for
traversing history and generating the actual bitmaps).

Reorganize when this field is initialized and when entries are added to
it so that we can quickly determine whether a commit is a candidate for
pseudo-merge selection, or not (since it was already selected to receive
a bitmap, and thus storing it in a pseudo-merge would be redundant).

The changes are as follows:

  - Introduce a new `bitmap_writer_init()` function which initializes
    the `writer.bitmaps` field (instead of waiting until the end of
    `bitmap_writer_build()`).

  - Add map entries in `push_bitmapped_commit()` (which is called via
    `bitmap_writer_select_commits()`) with OID keys and NULL values to
    track whether or not we *expect* to write a bitmap for some given
    commit.

  - Validate that a NULL entry is found matching the given key when we
    store a selected bitmap.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:41 -07:00
Junio C Hamano 7593d66928 Merge branch 'la/hide-trailer-info'
The trailer API has been reshuffled a bit.

* la/hide-trailer-info:
  trailer unit tests: inspect iterator contents
  trailer: document parse_trailers() usage
  trailer: retire trailer_info_get() from API
  trailer: make trailer_info struct private
  trailer: make parse_trailers() return trailer_info pointer
  interpret-trailers: access trailer_info with new helpers
  sequencer: use the trailer iterator
  trailer: teach iterator about non-trailer lines
  trailer: add unit tests for trailer iterator
  Makefile: sort UNIT_TEST_PROGRAMS
2024-05-23 11:04:27 -07:00
Junio C Hamano 939d49e9bd Merge branch 'kn/ref-transaction-symref' into kn/update-ref-symref
* kn/ref-transaction-symref:
  refs: remove `create_symref` and associated dead code
  refs: rename `refs_create_symref()` to `refs_update_symref()`
  refs: use transaction in `refs_create_symref()`
  refs: add support for transactional symref updates
  refs: move `original_update_refname` to 'refs.c'
  refs: support symrefs in 'reference-transaction' hook
  files-backend: extract out `create_symref_lock()`
  refs: accept symref values in `ref_transaction_update()`
2024-05-23 09:38:59 -07:00
Junio C Hamano 0ff6d23a0f Merge branch 'ps/pseudo-ref-terminology' into ps/ref-storage-migration
* ps/pseudo-ref-terminology:
  refs: refuse to write pseudorefs
  ref-filter: properly distinuish pseudo and root refs
  refs: pseudorefs are no refs
  refs: classify HEAD as a root ref
  refs: do not check ref existence in `is_root_ref()`
  refs: rename `is_special_ref()` to `is_pseudo_ref()`
  refs: rename `is_pseudoref()` to `is_root_ref()`
  Documentation/glossary: define root refs as refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: redefine pseudorefs as special refs
2024-05-23 09:14:32 -07:00
Junio C Hamano e55f364398 Merge branch 'ps/refs-without-the-repository-updates' into ps/ref-storage-migration
* ps/refs-without-the-repository-updates:
  refs/packed: remove references to `the_hash_algo`
  refs/files: remove references to `the_hash_algo`
  refs/files: use correct repository
  refs: remove `dwim_log()`
  refs: drop `git_default_branch_name()`
  refs: pass repo when peeling objects
  refs: move object peeling into "object.c"
  refs: pass ref store when detecting dangling symrefs
  refs: convert iteration over replace refs to accept ref store
  refs: retrieve worktree ref stores via associated repository
  refs: refactor `resolve_gitlink_ref()` to accept a repository
  refs: pass repo when retrieving submodule ref store
  refs: track ref stores via strmap
  refs: implement releasing ref storages
  refs: rename `init_db` callback to avoid confusion
  refs: adjust names for `init` and `init_db` callbacks
2024-05-23 09:14:08 -07:00
Johannes Schindelin 873a466ea3 clone: drop the protections where hooks aren't run
As part of the security bug-fix releases v2.39.4, ..., v2.45.1, I
introduced logic to safeguard `git clone` from running hooks that were
installed _during_ the clone operation.

The rationale was that Git's CVE-2024-32002, CVE-2021-21300,
CVE-2019-1354, CVE-2019-1353, CVE-2019-1352, and CVE-2019-1349 should
have been low-severity vulnerabilities but were elevated to
critical/high severity by the attack vector that allows a weakness where
files inside `.git/` can be inadvertently written during a `git clone`
to escalate to a Remote Code Execution attack by virtue of installing a
malicious `post-checkout` hook that Git will then run at the end of the
operation without giving the user a chance to see what code is executed.

Unfortunately, Git LFS uses a similar strategy to install its own
`post-checkout` hook during a `git clone`; In fact, Git LFS is
installing four separate hooks while running the `smudge` filter.

While this pattern is probably in want of being improved by introducing
better support in Git for Git LFS and other tools wishing to register
hooks to be run at various stages of Git's commands, let's undo the
clone protections to unbreak Git LFS-enabled clones.

This reverts commit 8db1e8743c (clone: prevent hooks from running
during a clone, 2024-03-28).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:08 -07:00
Junio C Hamano 4674ab682d apply: fix uninitialized hash function
"git apply" can work outside a repository as a better "GNU patch",
but when it does so, it still assumed that it can access
the_hash_algo, which is no longer true in the new world order.

Make sure we explicitly fall back to SHA-1 algorithm for backward
compatibility.

It is of dubious value to make this configurable to other hash
algorithms, as the code does not use the_hash_algo for hashing
purposes when working outside a repository (which is how
the_hash_algo is left to NULL)---it is only used to learn the max
length of the hash when parsing the object names on the "index"
line, but failing to parse the "index" line is not a hard failure,
and the program does not support operations like applying binary
patches and --3way fallback that requires object access outside a
repository.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 09:07:48 -07:00
Patrick Steinhardt 8d058b8024 builtin/hash-object: fix uninitialized hash function
The git-hash-object(1) command allows users to hash an object even
without a repository. Starting with c8aed5e8da (repository: stop setting
SHA1 as the default object hash, 2024-05-07), this will make us hit an
uninitialized hash function, which subsequently leads to a segfault.

Fix this by falling back to SHA-1 explicitly when running outside of
a Git repository. Users can use GIT_DEFAULT_HASH environment to
specify what hash algorithm they want, so arguably this code should
not be needed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 09:05:13 -07:00
Patrick Steinhardt 4a1c95931f builtin/patch-id: fix uninitialized hash function
In c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), we have adapted `initialize_repository()` to no longer set
up a default hash function. As this function is also used to set up
`the_repository`, the consequence is that `the_hash_algo` will now by
default be a `NULL` pointer unless the hash algorithm was configured
properly. This is done as a mechanism to detect cases where we may be
using the wrong hash function by accident.

This change now causes git-patch-id(1) to segfault when it's run outside
of a repository. As this command can read diffs from stdin, it does not
necessarily need a repository, but then relies on `the_hash_algo` to
compute the patch ID itself.

It is somewhat dubious that git-patch-id(1) relies on `the_hash_algo` in
the first place. Quoting its manpage:

    A "patch ID" is nothing but a sum of SHA-1 of the file diffs
    associated with a patch, with line numbers ignored. As such, it’s
    "reasonably stable", but at the same time also reasonably unique,
    i.e., two patches that have the same "patch ID" are almost
    guaranteed to be the same thing.

We explicitly document patch IDs to be using SHA-1. Furthermore, patch
IDs are supposed to be stable for most of the part. But even with the
same input, the patch IDs will now be different depending on the repo's
configured object hash.

Work around the issue by setting up SHA-1 when there was no startup
repository for now. This is arguably not the correct fix, but for now we
rather want to focus on getting the segfault fixed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 09:05:13 -07:00
Junio C Hamano 4beb7a3b06 Merge branch 'kn/ref-transaction-symref'
Updates to symbolic refs can now be made as a part of ref
transaction.

* kn/ref-transaction-symref:
  refs: remove `create_symref` and associated dead code
  refs: rename `refs_create_symref()` to `refs_update_symref()`
  refs: use transaction in `refs_create_symref()`
  refs: add support for transactional symref updates
  refs: move `original_update_refname` to 'refs.c'
  refs: support symrefs in 'reference-transaction' hook
  files-backend: extract out `create_symref_lock()`
  refs: accept symref values in `ref_transaction_update()`
2024-05-20 11:20:04 -07:00
Patrick Steinhardt 2bb444b196 refs: remove `dwim_log()`
Remove `dwim_log()` in favor of `repo_dwim_log()` so that we can get rid
of one more dependency on `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:39 -07:00
Patrick Steinhardt 97abaab5f6 refs: drop `git_default_branch_name()`
The `git_default_branch_name()` function is a thin wrapper around
`repo_default_branch_name()` with two differences:

  - We implicitly rely on `the_repository`.

  - We cache the default branch name.

None of the callsites of `git_default_branch_name()` are hot code paths
though, so the caching of the branch name is not really required.

Refactor the callsites to use `repo_default_branch_name()` instead and
drop `git_default_branch_name()`, thus getting rid of one more case
where we rely on `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:39 -07:00
Patrick Steinhardt 30aaff437f refs: pass repo when peeling objects
Both `peel_object()` and `peel_iterated_oid()` implicitly rely on
`the_repository` to look up objects. Despite the fact that we want to
get rid of `the_repository`, it also leads to some restrictions in our
ref iterators when trying to retrieve the peeled value for a repository
other than `the_repository`.

Refactor these functions such that both take a repository as argument
and remove the now-unnecessary restrictions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:39 -07:00
Patrick Steinhardt 330a2ae60b refs: pass ref store when detecting dangling symrefs
Both `warn_dangling_symref()` and `warn_dangling_symrefs()` derive the
ref store via `the_repository`. Adapt them to instead take in the ref
store as a parameter. While at it, rename the functions to have a `ref_`
prefix to align them with other functions that take a ref store.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:38 -07:00
Patrick Steinhardt 8378c9d27b refs: convert iteration over replace refs to accept ref store
The function `for_each_replace_ref()` is a bit of an oddball across the
refs interfaces as it accepts a pointer to the repository instead of a
pointer to the ref store. The only reason for us to accept a repository
is so that we can eventually pass it back to the callback function that
the caller has provided. This is somewhat arbitrary though, as callers
that need the repository can instead make it accessible via the callback
payload.

Refactor the function to instead accept the ref store and adjust callers
accordingly. This allows us to get rid of some of the boilerplate that
we had to carry to pass along the repository and brings us in line with
the other functions that iterate through refs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:38 -07:00
Patrick Steinhardt e19488a60a refs: refactor `resolve_gitlink_ref()` to accept a repository
In `resolve_gitlink_ref()` we implicitly rely on `the_repository` to
look up the submodule ref store. Now that we can look up submodule ref
stores for arbitrary repositories we can improve this function to
instead accept a repository as parameter for which we want to resolve
the gitlink.

Do so and adjust callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:38 -07:00
Patrick Steinhardt 965f8991e5 refs: pass repo when retrieving submodule ref store
Looking up submodule ref stores has two deficiencies:

  - The initialized subrepo will be attributed to `the_repository`.

  - The submodule ref store will be tracked in a global map.

This makes it impossible to have submodule ref stores for a repository
other than `the_repository`.

Modify the function to accept the parent repository as parameter and
move the global map into `struct repository`. Like this it becomes
possible to look up submodule ref stores for arbitrary repositories.

Note that this also adds a new reference to `the_repository` in
`resolve_gitlink_ref()`, which is part of the refs interfaces. This will
get adjusted in the next patch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:37 -07:00
Patrick Steinhardt ed93ea1602 refs: rename `init_db` callback to avoid confusion
Reference backends have two callbacks `init` and `init_db`. The
similarity of these two callbacks has repeatedly confused me whenever I
was looking at them, where I always had to look up which of them does
what.

Rename the `init_db` callback to `create_on_disk`, which should
hopefully be clearer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:36 -07:00
Junio C Hamano bca900904d Merge branch 'ps/refs-without-the-repository'
The refs API lost functions that implicitly assumes to work on the
primary ref_store by forcing the callers to pass a ref_store as an
argument.

* ps/refs-without-the-repository:
  refs: remove functions without ref store
  cocci: apply rules to rewrite callers of "refs" interfaces
  cocci: introduce rules to transform "refs" to pass ref store
  refs: add `exclude_patterns` parameter to `for_each_fullref_in()`
  refs: introduce missing functions that accept a `struct ref_store`
2024-05-16 10:10:14 -07:00
Junio C Hamano 46536278a8 Merge branch 'ps/refs-without-the-repository' into ps/refs-without-the-repository-updates
* ps/refs-without-the-repository:
  refs: remove functions without ref store
  cocci: apply rules to rewrite callers of "refs" interfaces
  cocci: introduce rules to transform "refs" to pass ref store
  refs: add `exclude_patterns` parameter to `for_each_fullref_in()`
  refs: introduce missing functions that accept a `struct ref_store`
2024-05-16 09:48:46 -07:00
Junio C Hamano f9d4eaf86c Merge branch 'jp/tag-trailer'
"git tag" learned the "--trailer" option to futz with the trailers
in the same way as "git commit" does.

* jp/tag-trailer:
  builtin/tag: add --trailer option
  builtin/commit: refactor --trailer logic
  builtin/commit: use ARGV macro to collect trailers
2024-05-15 09:52:53 -07:00
Junio C Hamano fe3ccc7aab Merge branch 'ps/config-subcommands'
The operation mode options (like "--get") the "git config" command
uses have been deprecated and replaced with subcommands (like "git
config get").

* ps/config-subcommands:
  builtin/config: display subcommand help
  builtin/config: introduce "edit" subcommand
  builtin/config: introduce "remove-section" subcommand
  builtin/config: introduce "rename-section" subcommand
  builtin/config: introduce "unset" subcommand
  builtin/config: introduce "set" subcommand
  builtin/config: introduce "get" subcommand
  builtin/config: introduce "list" subcommand
  builtin/config: pull out function to handle `--null`
  builtin/config: pull out function to handle config location
  builtin/config: use `OPT_CMDMODE()` to specify modes
  builtin/config: move "fixed-value" option to correct group
  builtin/config: move option array around
  config: clarify memory ownership when preparing comment strings
2024-05-15 09:52:53 -07:00
Patrick Steinhardt f1701f279a ref-filter: properly distinuish pseudo and root refs
The ref-filter interfaces currently define root refs as either a
detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so
let's properly distinguish those ref types.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:52 -07:00
Patrick Steinhardt 9c62534377 builtin/config: pass data between callbacks via local variables
We use several global variables to pass data between callers and
callbacks in `get_color()` and `get_colorbool()`. Convert those to use
callback data structures instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:56 -07:00
Patrick Steinhardt 35a7cfda56 builtin/config: convert flags to a local variable
Both the `do_all` and `use_key_regexp` bits essentially act like flags
to `get_value()`. Let's convert them to actual flags so that we can get
rid of the last two remaining global variables that track options.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:56 -07:00
Patrick Steinhardt ab8bac8bb6 builtin/config: track "fixed value" option via flags only
We track the "fixed value" option via two separate bits: once via the
global variable `fixed_value`, and once via the CONFIG_FLAGS_FIXED_VALUE
bit in `flags`. This is confusing and may easily lead to issues when one
is not aware that this is tracked via two separate mechanisms.

Refactor the code to use the flag exclusively. We already pass it to all
the required callsites anyway, except for `collect_config()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:56 -07:00
Patrick Steinhardt 040b141df3 builtin/config: convert `key` to a local variable
The `key` variable is used by the `get_value()` function for two
purposes:

  - It is used to store the result of `git_config_parse_key()`, which is
    then passed on to `collect_config()`.

  - It is used as a store to convert the provided key to an
    all-lowercase key when `use_key_regexp` is set.

Neither of these cases warrant a global variable at all. In the former
case we can pass the key via `struct collect_config_data`. And in the
latter case we really only want to have it as a temporary local variable
such that we can free associated memory.

Refactor the code accordingly to reduce our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:56 -07:00
Patrick Steinhardt fdfaaa1b68 builtin/config: convert `key_regexp` to a local variable
The `key_regexp` variable is used by the `format_config()` callback when
`use_key_regexp` is set. It is only ever set up by its only caller,
`collect_config()` and can thus easily be moved into the
`collect_config_data` structure.

Do so to remove our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:55 -07:00
Patrick Steinhardt 4ff8feb307 builtin/config: convert `regexp` to a local variable
The `regexp` variable is used by the `format_config()` callback when
`CONFIG_FLAGS_FIXED_VALUE` is not set. It is only ever set up by its
only caller, `collect_config()` and can thus easily be moved into the
`collect_config_data` structure.

Do so to remove our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:55 -07:00
Patrick Steinhardt bfe45f83e7 builtin/config: convert `value_pattern` to a local variable
The `value_pattern` variable is used by the `format_config()` callback
when `CONFIG_FLAGS_FIXED_VALUE` is used. It is only ever set up by its
only caller, `collect_config()` and can thus easily be moved into the
`collect_config_data` structure.

Do so to remove our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:55 -07:00
Patrick Steinhardt 65d197cffc builtin/config: convert `do_not_match` to a local variable
The `do_not_match` variable is used by the `format_config()` callback as
an indicator whether or not the passed regular expression is negated. It
is only ever set up by its only caller, `collect_config()` and can thus
easily be moved into the `collect_config_data` structure.

Do so to remove our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:55 -07:00
Patrick Steinhardt 8c86981228 builtin/config: move `respect_includes_opt` into location options
The variable tracking whether or not we want to honor includes is
tracked via a global variable. Move it into the location options
instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:54 -07:00
Patrick Steinhardt 4090a9c948 builtin/config: move default value into display options
The default value is tracked via a global variable. Move it into the
display options instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:54 -07:00
Patrick Steinhardt 94c4693079 builtin/config: move type options into display options
The type options are tracked via a global variable. Move it into the
display options instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:54 -07:00
Patrick Steinhardt c0c1e26326 builtin/config: move display options into local variables
The display options are tracked via a set of global variables. Move
them into a self-contained structure so that we can easily parse all
relevant options and hand them over to the various functions that
require them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:54 -07:00
Patrick Steinhardt ddb103c2c7 builtin/config: move location options into local variables
The location options are tracked via a set of global variables. Move
them into a self-contained structure so that we can easily parse all
relevant options and hand them over to the various functions that
require them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:53 -07:00
Patrick Steinhardt 999425cb12 builtin/config: refactor functions to have common exit paths
Refactor functions to have a single exit path. This will make it easier
in subsequent commits to add common cleanup code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:53 -07:00
Patrick Steinhardt e44b018c52 builtin/config: check for writeability after source is set up
The `check_write()` function verifies that we do not try to write to a
config source that cannot be written to, like for example stdin. But
while the new subcommands do call this function, they do so before
calling `handle_config_location()`. Consequently, we only end up
checking the default config location for writeability, not the location
that was actually specified by the caller of git-config(1).

Fix this by calling `check_write()` after `handle_config_location()`. We
will further clarify the relationship between those two functions in a
subsequent commit where we remove the global state that both implicitly
rely on.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:52 -07:00
Patrick Steinhardt 9cab5e8078 builtin/config: move actions into `cmd_config_actions()`
We only use actions in the legacy mode. Convert them to an enum and move
them into `cmd_config_actions()` to clearly demonstrate that they are
not used anywhere else.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:52 -07:00
Patrick Steinhardt 7d5387e263 builtin/config: move legacy options into `cmd_config()`
Move the legacy options as well some of the variables it references into
`cmd_config_action()`. This reduces our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:52 -07:00
Patrick Steinhardt 8b908f9dcf builtin/config: move subcommand options into `cmd_config()`
Move the subcommand options as well as the `subcommand` variable into
`cmd_config()`. This reduces our reliance on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:52 -07:00
Patrick Steinhardt 0336d0055c builtin/config: move legacy mode into its own function
In `cmd_config()` we first try to parse the provided arguments as
subcommands and, if this is successful, call the respective functions
of that subcommand. Otherwise we continue with the "legacy" mode that
uses implicit actions and/or flags.

Disentangle this by moving the legacy mode into its own function. This
allows us to move the options into the respective functions and clearly
separates concerns.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:52 -07:00
Patrick Steinhardt a577d2f1a9 builtin/config: stop printing full usage on misuse
When invoking git-config(1) with a wrong set of arguments we end up
calling `usage_builtin_config()` after printing an error message that
says what was wrong. As that function ends up printing the full list of
options, which is quite long, the actual error message will be buried by
a wall of text. This makes it really hard to figure out what exactly
caused the error.

Furthermore, now that we have recently introduced subcommands, the usage
information may actually be misleading as we unconditionally print
options of the subcommand-less mode.

Fix both of these issues by just not printing the options at all
anymore. Instead, we call `usage()` that makes us report in a single
line what has gone wrong. This should be way more discoverable for our
users and addresses the inconsistency.

Furthermore, this change allow us to inline the options into the
respective functions that use them to parse the command line.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:17:51 -07:00
Taylor Blau 85f360fee5 pack-bitmap: introduce `bitmap_writer_free()`
Now that there is clearer memory ownership around the bitmap_writer
structure, introduce a bitmap_writer_free() function that callers may
use to free any memory associated with their instance of the
bitmap_writer structure.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 06:53:46 -07:00
Taylor Blau 9675b06917 pack-bitmap: drop unused `max_bitmaps` parameter
The `max_bitmaps` parameter in `bitmap_writer_select_commits()` was
introduced back in 7cc8f97108 (pack-objects: implement bitmap writing,
2013-12-21), making it original to the bitmap implementation in Git
itself.

When that patch was merged via 0f9e62e084 (Merge branch
'jk/pack-bitmap', 2014-02-27), its sole caller in builtin/pack-objects.c
passed a value of "-1" for `max_bitmaps`, indicating no limit.

Since then, the only other caller (in midx.c, added via c528e17966
(pack-bitmap: write multi-pack bitmaps, 2021-08-31)) also uses a value
of "-1" for `max_bitmaps`.

Since no callers have needed a finite limit for the `max_bitmaps`
parameter in the nearly decade that has passed since 0f9e62e084, let's
remove the parameter and any dead pieces of code connected to it.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 06:52:32 -07:00
Taylor Blau 07647c92ff pack-bitmap: avoid use of static `bitmap_writer`
The pack-bitmap machinery uses a structure called 'bitmap_writer' to
collect the data necessary to write out .bitmap files. Since its
introduction in 7cc8f97108 (pack-objects: implement bitmap writing,
2013-12-21), there has been a single static bitmap_writer structure,
which is responsible for all bitmap writing-related operations.

In practice, this is OK, since we are only ever writing a single .bitmap
file in a single process (e.g., `git multi-pack-index write --bitmap`,
`git pack-objects --write-bitmap-index`, `git repack -b`, etc.).

However, having a single static variable makes issues like data
ownership unclear, when to free variables, what has/hasn't been
initialized unclear.

Refactor this code to be written in terms of a given bitmap_writer
structure instead of relying on a static global.

Note that this exposes the structure definition of the bitmap_writer at
the pack-bitmap.h level. We could work around this by, e.g., forcing
callers to declare their writers as:

    struct bitmap_writer *writer;
    bitmap_writer_init(&bitmap_writer);

and then declaring `bitmap_writer_init()` as taking in a double-pointer
like so:

    void bitmap_writer_init(struct bitmap_writer **writer);

which would avoid us having to expose the definition of the structure
itself. This patch takes a different approach, since future patches
(like for the ongoing pseudo-merge bitmaps work) will want to modify the
innards of this structure (in the previous example, via pseudo-merge.c).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 06:52:32 -07:00
Junio C Hamano 83f1add914 Git 2.45.1
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmYxBJ0ACgkQsLXohpav
 5suE6A//RTmt/rsMCDvpHEYSvox0ln5oMWyXrqKiHLxesMc0uLWRHAUDrHGCg7JP
 OoZkf1cV2yOcD4lhO4YrlcHR3n1xdAyGrhc5vyLI4DFAAxdOLl4VDHRazXm51u+p
 8GLxQY/1xu9bvde1PDYL2qtjDMskMgqb2Rfvv6ULpfICJrioy+CO5wud7BYIX4qB
 oFZQnFLrQnSW9XT3r2+hKJKP4cHXQX5tYY0mkiy3bjbscNGyjdrkqMjJ2QEIWqhj
 SUCujS5Clx6WKr0uLxoKs1IemdV0lkg2IbsxMZ5yYxLH2P9O7jQHvjgOx5NgfRlu
 NtYMWsrkYhylWUxLiTFgLbJ8DE6sjN+emYOqCDRlr7XPvsvVX6eucX9YRxS4C/XP
 izoOhAHJOFRaI/nMuG7iOOmnobKJKy0PbVFgA4W8MtNKZ+4taKF24aSK3TZpArhX
 Z3gMQwSWoO6KVPJ7+Et2x/WV5BmVAbpMMufX2ErwOhMDMO9jlvYy0q2OeCaiMg1c
 xZGGxC441IsYPVwSrJFU/U+Pl190PEazgmclkaqdothbjeMPb/gBV4j46Rznjld4
 68n3h1rW2S5AQbMKie+/Yygi0O087VAvTMsYPxDKsDmbeUHvCEd148dKgdeU59ct
 IXkrf2UW7dUWwZv2lv8NMdLue2M5bB9Yeufg3GJkfOaTy+1S5TM=
 =g/43
 -----END PGP SIGNATURE-----

Sync with Git 2.45.1

* tag 'v2.45.1': (42 commits)
  Git 2.45.1
  Git 2.44.1
  Git 2.43.4
  Git 2.42.2
  Git 2.41.1
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  ...
2024-05-13 18:29:15 -07:00
Junio C Hamano 17bc3a4767 Merge branch 'ps/undecided-is-not-necessarily-sha1' into jc/undecided-is-not-necessarily-sha1-fix
* ps/undecided-is-not-necessarily-sha1:
  repository: stop setting SHA1 as the default object hash
  oss-fuzz/commit-graph: set up hash algorithm
  builtin/shortlog: don't set up revisions without repo
  builtin/diff: explicitly set hash algo when there is no repo
  builtin/bundle: abort "verify" early when there is no repository
  builtin/blame: don't access potentially unitialized `the_hash_algo`
  builtin/rev-parse: allow shortening to more than 40 hex characters
  remote-curl: fix parsing of detached SHA256 heads
  attr: fix BUG() when parsing attrs outside of repo
  attr: don't recompute default attribute source
  parse-options-cb: only abbreviate hashes when hash algo is known
  path: move `validate_headref()` to its only user
  path: harden validation of HEAD with non-standard hashes
2024-05-13 12:24:54 -07:00
Junio C Hamano 9422e7169e Merge branch 'ps/config-subcommands' into ps/builtin-config-cleanup
* ps/config-subcommands:
  builtin/config: display subcommand help
  builtin/config: introduce "edit" subcommand
  builtin/config: introduce "remove-section" subcommand
  builtin/config: introduce "rename-section" subcommand
  builtin/config: introduce "unset" subcommand
  builtin/config: introduce "set" subcommand
  builtin/config: introduce "get" subcommand
  builtin/config: introduce "list" subcommand
  builtin/config: pull out function to handle `--null`
  builtin/config: pull out function to handle config location
  builtin/config: use `OPT_CMDMODE()` to specify modes
  builtin/config: move "fixed-value" option to correct group
  builtin/config: move option array around
  config: clarify memory ownership when preparing comment strings
2024-05-10 10:32:06 -07:00
Junio C Hamano f526a4f314 Merge branch 'ps/the-index-is-no-more'
The singleton index_state instance "the_index" has been eliminated
by always instantiating "the_repository" and replacing references
to "the_index"  with references to its .index member.

* ps/the-index-is-no-more:
  repository: drop `initialize_the_repository()`
  repository: drop `the_index` variable
  builtin/clone: stop using `the_index`
  repository: initialize index in `repo_init()`
  builtin: stop using `the_index`
  t/helper: stop using `the_index`
2024-05-08 10:18:44 -07:00
Junio C Hamano c5c9acf77d Merge branch 'bc/credential-scheme-enhancement'
The credential helper protocol, together with the HTTP layer, have
been enhanced to support authentication schemes different from
username & password pair, like Bearer and NTLM.

* bc/credential-scheme-enhancement:
  credential: add method for querying capabilities
  credential-cache: implement authtype capability
  t: add credential tests for authtype
  credential: add support for multistage credential rounds
  t5563: refactor for multi-stage authentication
  docs: set a limit on credential line length
  credential: enable state capability
  credential: add an argument to keep state
  http: add support for authtype and credential
  docs: indicate new credential protocol fields
  credential: add a field called "ephemeral"
  credential: gate new fields on capability
  credential: add a field for pre-encoded credentials
  http: use new headers for each object request
  remote-curl: reset headers on new request
  credential: add an authtype field
2024-05-08 10:18:44 -07:00
Patrick Steinhardt 2e5c4758b7 cocci: apply rules to rewrite callers of "refs" interfaces
Apply the rules that rewrite callers of "refs" interfaces to explicitly
pass `struct ref_store`. The resulting patch has been applied with the
`--whitespace=fix` option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:59 -07:00
Patrick Steinhardt 54876c6dfb refs: add `exclude_patterns` parameter to `for_each_fullref_in()`
The `for_each_fullref_in()` function is supposedly the ref-store-less
equivalent of `refs_for_each_fullref_in()`, but the latter has gained a
new parameter `exclude_patterns` over time. Bring these two functions
back in sync again by adding the parameter to the former function, as
well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:59 -07:00
John Passaro 066cef7707 builtin/tag: add --trailer option
git-tag supports interpreting trailers from an annotated tag message,
using --list --format="%(trailers)". However, the available methods to
add a trailer to a tag message (namely -F or --editor) are not as
ergonomic.

In a previous patch, we moved git-commit's implementation of its
--trailer option to the trailer.h API. Let's use that new function to
teach git-tag the same --trailer option, emulating as much of
git-commit's behavior as much as possible.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: John Passaro <john.a.passaro@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:03 -07:00
John Passaro 4a8618785e builtin/commit: refactor --trailer logic
git-commit adds user trailers to the commit message by passing its
`--trailer` arguments to a child process running `git-interpret-trailers
--in-place`. This logic is broadly useful, not just for git-commit but
for other commands constructing message bodies (e.g. git-tag).

Let's move this logic from git-commit to a new function in the trailer
API, so that it can be re-used in other commands.

Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: John Passaro <john.a.passaro@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:03 -07:00
John Passaro 56740f9910 builtin/commit: use ARGV macro to collect trailers
Replace git-commit's callback for --trailer with the standard
OPT_PASSTHRU_ARGV macro. The callback only adds its values to a strvec
and sanity-checks that `unset` is always false; both of these are
already implemented in the parse-option API.

Signed-off-by: John Passaro <john.a.passaro@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:05:41 -07:00
Karthik Nayak f151dfe3c9 refs: rename `refs_create_symref()` to `refs_update_symref()`
The `refs_create_symref()` function is used to update/create a symref.
But it doesn't check the old target of the symref, if existing. It force
updates the symref. In this regard, the name `refs_create_symref()` is a
bit misleading. So let's rename it to `refs_update_symref()`. This is
akin to how 'git-update-ref(1)' also allows us to create apart from
update.

While we're here, rename the arguments in the function to clarify what
they actually signify and reduce confusion.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 08:51:50 -07:00
Karthik Nayak 1bc4cc3fc4 refs: accept symref values in `ref_transaction_update()`
The function `ref_transaction_update()` obtains ref information and
flags to create a `ref_update` and add them to the transaction at hand.

To extend symref support in transactions, we need to also accept the
old and new ref targets and process it. This commit adds the required
parameters to the function and modifies all call sites.

The two parameters added are `new_target` and `old_target`. The
`new_target` is used to denote what the reference should point to when
the transaction is applied. Some functions allow this parameter to be
NULL, meaning that the reference is not changed.

The `old_target` denotes the value the reference must have before the
update. Some functions allow this parameter to be NULL, meaning that the
old value of the reference is not checked.

We also update the internal function `ref_transaction_add_update()`
similarly to take the two new parameters.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 08:51:49 -07:00
Patrick Steinhardt 373bfa6077 builtin/shortlog: don't set up revisions without repo
It is possible to run git-shortlog(1) outside of a repository by passing
it output from git-log(1) via standard input. Obviously, as there is no
repository in that context, it is thus unsupported to pass any revisions
as arguments.

Regardless of that we still end up calling `setup_revisions()`. While
that works alright, it is somewhat strange. Furthermore, this is about
to cause problems when we unset the default object hash.

Refactor the code to only call `setup_revisions()` when we have a
repository. This is safe to do as we already verify that there are no
arguments when running outside of a repository anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:50 -07:00
Patrick Steinhardt ab274909d4 builtin/diff: explicitly set hash algo when there is no repo
The git-diff(1) command can be used outside repositories to diff two
files with each other. But even if there is no repository we will end up
hashing the files that we are diffing so that we can print the "index"
line:

    ```
    diff --git a/a b/b
    index 7898192..6178079 100644
    --- a/a
    +++ b/b
    @@ -1 +1 @@
    -a
    +b
    ```

We implicitly use SHA1 to calculate the hash here, which is because
`the_repository` gets initialized with SHA1 during the startup routine.
We are about to stop doing this though such that `the_repository` only
ever has a hash function when it was properly initialized via a repo's
configuration.

To give full control to our users, we would ideally add a new switch to
git-diff(1) that allows them to specify the hash function when executed
outside of a repository. But for now, we only convert the code to make
this explicit such that we can stop setting the default hash algorithm
for `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
Patrick Steinhardt 332b56b762 builtin/bundle: abort "verify" early when there is no repository
Verifying a bundle requires us to have a repository. This is encoded in
`verify_bundle()`, which will return an error if there is no repository.
We call `open_bundle()` before we call `verify_bundle()` though, which
already performs some verifications even though we may ultimately abort
due to a missing repository.

This is problematic because `open_bundle()` already reads the bundle
header and verifies that it contains a properly formatted hash. When
there is no repository we have no clue what hash function to expect
though, so we always end up assuming SHA1 here, which may or may not be
correct. Furthermore, we are about to stop initializing `the_hash_algo`
when there is no repository, which will lead to segfaults.

Check early on whether we have a repository to fix this issue.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
Patrick Steinhardt ce992ce29a builtin/blame: don't access potentially unitialized `the_hash_algo`
We access `the_hash_algo` in git-blame(1) before we have executed
`parse_options_start()`, which may not be properly set up in case we
have no repository. This is fine for most of the part because all the
call paths that lead to it (git-blame(1), git-annotate(1) as well as
git-pick-axe(1)) specify `RUN_SETUP` and thus require a repository.

There is one exception though, namely when passing `-h` to print the
help. Here we will access `the_hash_algo` even if there is no repo.
This works fine right now because `the_hash_algo` gets sets up to point
to the SHA1 algorithm via `initialize_repository()`. But we're about to
stop doing this, and thus the code would lead to a `NULL` pointer
exception.

Prepare the code for this and only access `the_hash_algo` after we are
sure that there is a proper repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
Patrick Steinhardt 07658e9ce5 builtin/rev-parse: allow shortening to more than 40 hex characters
The `--short=` option for git-rev-parse(1) allows the user to specify
to how many characters object IDs should be shortened to. The option is
broken though for SHA256 repositories because we set the maximum allowed
hash size to `the_hash_algo->hexsz` before we have even set up the repo.
Consequently, `the_hash_algo` will always be SHA1 and thus we truncate
every hash after at most 40 characters.

Fix this by accessing `the_hash_algo` only after we have set up the
repo.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
Junio C Hamano 3452c8ab8a Merge branch 'ps/the-index-is-no-more' into ps/undecided-is-not-necessarily-sha1
* ps/the-index-is-no-more:
  repository: drop `initialize_the_repository()`
  repository: drop `the_index` variable
  builtin/clone: stop using `the_index`
  repository: initialize index in `repo_init()`
  builtin: stop using `the_index`
  t/helper: stop using `the_index`
2024-05-06 22:50:29 -07:00
Junio C Hamano c22d41d641 format-patch: run range-diff with larger creation-factor
We see too often that a range-diff added to format-patch output
shows too many "unmatched" patches.  This is because the default
value for creation-factor is set to a relatively low value.

It may be justified for other uses (like you have a yet-to-be-sent
new iteration of your series, and compare it against the 'seen'
branch that has an older iteration, probably with the '--left-only'
option, to pick out only your patches while ignoring the others) of
"range-diff" command, but when the command is run as part of the
format-patch, the user _knows_ and expects that the patches in the
old and the new iterations roughly correspond to each other, so we
can and should use a much higher default.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:57:22 -07:00
Patrick Steinhardt 7b91d310ce builtin/config: display subcommand help
Until now, `git config -h` would have printed help for the old-style
syntax. Now that all modes have proper subcommands though it is
preferable to instead display the subcommand help.

Drop the `NO_INTERNAL_HELP` flag to do so. While at it, drop the help
mismatch in t0450 and add the `--get-colorbool` option to the usage such
that git-config(1)'s synopsis and `git config -h` match.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:10 -07:00
Patrick Steinhardt 3cbace5ee0 builtin/config: introduce "edit" subcommand
Introduce a new "edit" subcommand to git-config(1). Please refer to
preceding commits regarding the motivation behind this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:10 -07:00
Patrick Steinhardt 15dad20c3f builtin/config: introduce "remove-section" subcommand
Introduce a new "remove-section" subcommand to git-config(1). Please
refer to preceding commits regarding the motivation behind this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:10 -07:00
Patrick Steinhardt 3418e96f37 builtin/config: introduce "rename-section" subcommand
Introduce a new "rename-section" subcommand to git-config(1). Please
refer to preceding commits regarding the motivation behind this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:09 -07:00
Patrick Steinhardt 95ea69c67b builtin/config: introduce "unset" subcommand
Introduce a new "unset" subcommand to git-config(1). Please refer to
preceding commits regarding the motivation behind this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:09 -07:00
Patrick Steinhardt 00bbdde141 builtin/config: introduce "set" subcommand
Introduce a new "set" subcommand to git-config(1). Please refer to
preceding commits regarding the motivation behind this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:09 -07:00
Patrick Steinhardt 4e51389000 builtin/config: introduce "get" subcommand
Introduce a new "get" subcommand to git-config(1). Please refer to
preceding commits regarding the motivation behind this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:09 -07:00
Patrick Steinhardt 14970509c6 builtin/config: introduce "list" subcommand
While git-config(1) has several modes, those modes are not exposed with
subcommands but instead by specifying action flags like `--unset` or
`--list`. This user interface is not really in line with how our more
modern commands work, where it is a lot more customary to say e.g. `git
remote list`. Furthermore, to add to the confusion, git-config(1) also
allows the user to request modes implicitly by just specifying the
correct number of arguments. Thus, `git config foo.bar` will retrieve
the value of "foo.bar" while `git config foo.bar baz` will set it to
"baz".

Overall, this makes for a confusing interface that could really use a
makeover. It hurts discoverability of what you can do with git-config(1)
and is comparatively easy to get wrong. Converting the command to have
subcommands instead would go a long way to help address these issues.

One concern in this context is backwards compatibility. Luckily, we can
introduce subcommands without breaking backwards compatibility at all.
This is because all the implicit modes of git-config(1) require that the
first argument is a properly formatted config key. And as config keys
_must_ have a dot in their name, any value without a dot would have been
discarded by git-config(1) previous to this change. Thus, given that
none of the subcommands do have a dot, they are unambiguous.

Introduce the first such new subcommand, which is "git config list". To
retain backwards compatibility we only conditionally use subcommands and
will fall back to the old syntax in case no subcommand was detected.
This should help to transition to the new-style syntax until we
eventually deprecate and remove the old-style syntax.

Note that the way we handle this we're duplicating some functionality
across old and new syntax. While this isn't pretty, it helps us to
ensure that there really is no change in behaviour for the old syntax.

Amend tests such that we run them both with old and new style syntax.
As tests are now run twice, state from the first run may be still be
around in the second run and thus cause tests to fail. Add cleanup logic
as required to fix such tests.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:08 -07:00
Patrick Steinhardt fee3796616 builtin/config: pull out function to handle `--null`
Pull out function to handle the `--null` option, which we are about to
reuse in subsequent commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:08 -07:00
Patrick Steinhardt 9dda6b72b7 builtin/config: pull out function to handle config location
There's quite a bunch of options to git-config(1) that allow the user to
specify which config location to use when reading or writing config
options. The logic to handle this is thus by necessity also quite
involved.

Pull it out into a separate function so that we can reuse it in
subsequent commits which introduce proper subcommands.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:08 -07:00
Patrick Steinhardt daa3325024 builtin/config: use `OPT_CMDMODE()` to specify modes
The git-config(1) command has various different modes which are
accessible via e.g. `--get-urlmatch` or `--unset-all`. These modes are
declared with `OPT_BIT()`, which causes two minor issues:

  - The respective modes also have a negated form `--no-get-urlmatch`,
    which is unintended.

  - We have to manually handle exclusiveness of the modes.

Switch these options to instead use `OPT_CMDMODE()`, which is made
exactly for this usecase. Remove the now-unneeded check that only a
single mode is given, which is now handled by the parse-options
interface.

While at it, format optional placeholders for arguments to conform to
our style guidelines by using `[<placeholder>]`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:07 -07:00
Patrick Steinhardt 8415507b32 builtin/config: move "fixed-value" option to correct group
The `--fixed-value` option can be used to alter how the value-pattern
parameter is interpreted for the various actions of git-config(1). But
while it is an option, it is currently listed as part of the actions
group, which is wrong.

Move the option to the "Other" group, which hosts the various options
known to git-config(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:07 -07:00
Patrick Steinhardt 424a29c3a7 builtin/config: move option array around
Move around the option array. This will help us with a follow-up commit
that introduces subcommands to git-config(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:07 -07:00
Patrick Steinhardt a78b462976 config: clarify memory ownership when preparing comment strings
The ownership of memory returned when preparing a comment string is
quite intricate: when the returned value is different than the passed
value, then the caller is responsible to free the memory. This is quite
subtle, and it's even easier to miss because the returned value is in
fact a `const char *`.

Adapt the function to always return either `NULL` or a newly allocated
string. The function is called at most once per git-config(1), so it's
not like this micro-optimization really matters. Thus, callers are now
always responsible for freeing the value.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 11:50:07 -07:00
Linus Arver 24a25c630c trailer: make parse_trailers() return trailer_info pointer
This is the second and final preparatory commit for making the
trailer_info struct private to the trailer implementation.

Make trailer_info_get() do the actual work of allocating a new
trailer_info struct, and return a pointer to it. Because
parse_trailers() wraps around trailer_info_get(), it too can return this
pointer to the caller. From the trailer API user's perspective, the call
to trailer_info_new() can be replaced with parse_trailers(); do so in
interpret-trailers.

Because trailer_info_new() is no longer called by interpret-trailers,
remove this function from the trailer API.

With this change, we no longer allocate trailer_info on the stack ---
all uses of it are via a pointer where the actual data is always
allocated at runtime through trailer_info_new(). Make
trailer_info_release() free this dynamically allocated memory.

Finally, due to the way the function signatures of parse_trailers() and
trailer_info_get() have changed, update the callsites in
format_trailers_from_commit() and trailer_iterator_init() accordingly.

Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Linus Arver <linus@ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:57:08 -07:00
Linus Arver 655eb65d48 interpret-trailers: access trailer_info with new helpers
Instead of directly accessing trailer_info members, access them
indirectly through new helper functions exposed by the trailer API.

This is the first of two preparatory commits which will allow us to
use the so-called "pimpl" (pointer to implementation) idiom for the
trailer API, by making the trailer_info struct private to the trailer
implementation (and thus hidden from the API).

Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Linus Arver <linus@ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02 09:57:08 -07:00
Junio C Hamano 75b182d34e Merge branch 'js/for-each-repo-keep-going'
A scheduled "git maintenance" job is expected to work on all
repositories it knows about, but it stopped at the first one that
errored out.  Now it keeps going.

* js/for-each-repo-keep-going:
  maintenance: running maintenance should not stop on errors
  for-each-repo: optionally keep going on an error
2024-04-30 14:49:45 -07:00
Junio C Hamano 90f6b5a597 Merge branch 'aj/stash-staged-fix'
"git stash -S" did not handle binary files correctly, which has
been corrected.

* aj/stash-staged-fix:
  stash: fix "--staged" with binary files
2024-04-30 14:49:43 -07:00
Junio C Hamano 708e9257f8 Merge branch 'jc/format-patch-rfc-more'
The "--rfc" option of "git format-patch" learned to take an
optional string value to be used in place of "RFC" to tweak the
"[PATCH]" on the subject header.

* jc/format-patch-rfc-more:
  format-patch: "--rfc=-(WIP)" appends to produce [PATCH (WIP)]
  format-patch: allow --rfc to optionally take a value, like --rfc=WIP
2024-04-30 14:49:43 -07:00
Junio C Hamano 07fc8275e1 Merge branch 'ds/format-patch-rfc-and-k'
The "-k" and "--rfc" options of "format-patch" will now error out
when used together, as one tells us not to add anything to the
title of the commit, and the other one tells us to add "RFC" in
addition to "PATCH".

* ds/format-patch-rfc-and-k:
  format-patch: ensure that --rfc and -k are mutually exclusive
2024-04-30 14:49:42 -07:00
Junio C Hamano 55e5548a0f Merge branch 'xx/disable-replace-when-building-midx'
The procedure to build multi-pack-index got confused by the
replace-refs mechanism, which has been corrected by disabling the
latter.

* xx/disable-replace-when-building-midx:
  midx: disable replace objects
2024-04-30 14:49:42 -07:00
Johannes Schindelin 1c00f92eb5 Sync with 2.44.1
* maint-2.44: (41 commits)
  Git 2.44.1
  Git 2.43.4
  Git 2.42.2
  Git 2.41.1
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  ...
2024-04-29 20:42:30 +02:00