Commit Graph

74067 Commits (c3d034df1606beca8a4ba3bc8a8266f4c711f23c)

Author SHA1 Message Date
Junio C Hamano 51ea70c18a Merge branch 'jk/sparse-leakfix'
Many memory leaks in the sparse-checkout code paths have been
plugged.

* jk/sparse-leakfix:
  sparse-checkout: free duplicate hashmap entries
  sparse-checkout: free string list after displaying
  sparse-checkout: free pattern list in sparse_checkout_list()
  sparse-checkout: free sparse_filename after use
  sparse-checkout: refactor temporary sparse_checkout_patterns
  sparse-checkout: always free "line" strbuf after reading input
  sparse-checkout: reuse --stdin buffer when reading patterns
  dir.c: always copy input to add_pattern()
  dir.c: free removed sparse-pattern hashmap entries
  sparse-checkout: clear patterns when init() sees existing sparse file
  dir.c: free strings in sparse cone pattern hashmaps
  sparse-checkout: pass string literals directly to add_pattern()
  sparse-checkout: free string list in write_cone_to_file()
2024-06-12 13:37:17 -07:00
Junio C Hamano c2f79440ac Merge branch 'jk/cap-exclude-file-size'
An overly large ".gitignore" files are now rejected silently.

* jk/cap-exclude-file-size:
  dir.c: reduce max pattern file size to 100MB
  dir.c: skip .gitignore, etc larger than INT_MAX
2024-06-12 13:37:17 -07:00
Junio C Hamano b8bdb2f283 Merge branch 'jc/safe-directory-leading-path'
The safe.directory configuration knob has been updated to
optionally allow leading path matches.

* jc/safe-directory-leading-path:
  safe.directory: allow "lead/ing/path/*" match
2024-06-12 13:37:16 -07:00
Junio C Hamano 22cf18fd9e Merge branch 'gt/t-hash-unit-test'
A pair of test helpers that essentially are unit tests on hash
algorithms have been rewritten using the unit-tests framework.

* gt/t-hash-unit-test:
  t/: migrate helper/test-{sha1, sha256} to unit-tests/t-hash
  strbuf: introduce strbuf_addstrings() to repeatedly add a string
2024-06-12 13:37:15 -07:00
Junio C Hamano 56346ba24e Merge branch 'cp/reftable-unit-test'
Basic unit tests for reftable have been reimplemented under the
unit test framework.

* cp/reftable-unit-test:
  t: improve the test-case for parse_names()
  t: add test for put_be16()
  t: move tests from reftable/record_test.c to the new unit test
  t: move tests from reftable/stack_test.c to the new unit test
  t: move reftable/basics_test.c to the unit testing framework
2024-06-12 13:37:14 -07:00
Junio C Hamano a39e28ace7 Merge branch 'jc/t1517-more'
A new test was added to ensure git commands that are designed to
run outside repositories do work.

* jc/t1517-more:
  imap-send: minimum leakfix
  t1517: more coverage for commands that work without repository
2024-06-12 13:37:14 -07:00
Ghanshyam Thakkar ed54840872 t/: migrate helper/test-oidtree.c to unit-tests/t-oidtree.c
helper/test-oidtree.c along with t0069-oidtree.sh test the oidtree.h
library, which is a wrapper around crit-bit tree. Migrate them to
the unit testing framework for better debugging and runtime
performance. Along with the migration, add an extra check for
oidtree_each() test, which showcases how multiple expected matches can
be given to check_each() helper.

To achieve this, introduce a new library called 'lib-oid.h'
exclusively for the unit tests to use. It currently mainly includes
utility to generate object_id from an arbitrary hex string
(i.e. '12a' -> '12a0000000000000000000000000000000000000'). This also
handles the hash algo selection based on GIT_TEST_DEFAULT_HASH.
This library will also be helpful when we port other unit tests such
as oid-array, oidset etc.

Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
[jc: small fixlets squashed in]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-12 13:33:20 -07:00
Patrick Steinhardt 037df60013 object-name: don't try to abbreviate to lengths greater than hexsz
When given a length that equals the current hash algorithm's hex size,
then `repo_find_unique_abbrev_r()` exits early without trying to find an
abbreviation. This is only sensible because there is nothing to
abbreviate in the first place, so searching through objects to find a
unique prefix would be a waste of compute.

What we don't handle though is the case where the user passes a length
greater than the hash length. This is fine in practice as we still
compute the correct result. But at the very least, this is a waste of
resources as we try to abbreviate a value that cannot be abbreviated,
which causes us to hit the object database.

Start to explicitly handle values larger than hexsz to avoid this
performance penalty, which leads to a measureable speedup. The following
benchmark has been executed in linux.git:

  Benchmark 1: git -c core.abbrev=9000 log --abbrev-commit (revision = HEAD~)
    Time (mean ± σ):     12.812 s ±  0.040 s    [User: 12.225 s, System: 0.554 s]
    Range (min … max):   12.723 s … 12.857 s    10 runs

  Benchmark 2: git -c core.abbrev=9000 log --abbrev-commit (revision = HEAD)
    Time (mean ± σ):     11.095 s ±  0.029 s    [User: 10.546 s, System: 0.521 s]
    Range (min … max):   11.037 s … 11.122 s    10 runs

  Summary
    git -c core.abbrev=9000 log --abbrev-commit HEAD (revision = HEAD) ran
      1.15 ± 0.00 times faster than git -c core.abbrev=9000 log --abbrev-commit HEAD (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-12 12:57:18 -07:00
Patrick Steinhardt 59ff92c516 parse-options-cb: stop clamping "--abbrev=" to hash length
The `OPT__ABBREV()` option allows the user to specify the length that
object hashes shall be abbreviated to. This length needs to be in the
range of `(MIN_ABBREV, the_hash_algo->hexsz)`, which is why we clamp the
value as required. While this makes sense in the case of `MIN_ABBREV`,
it is unnecessary for the upper boundary as the value is eventually
passed down to `repo_find_unnique_abbrev_r()`, which handles values
larger than the current hash length just fine.

In the preceding commit, we have changed parsing of the "core.abbrev"
config to stop clamping to the upper boundary. Let's do the same here so
that the code becomes simpler, we are consistent with how we treat the
"core.abbrev" config and so that we stop depending on `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-12 12:57:18 -07:00
Patrick Steinhardt 524c0183c9 config: fix segfault when parsing "core.abbrev" without repo
The "core.abbrev" config allows the user to specify the minimum length
when abbreviating object hashes. Next to the values "auto" and "no",
this config also accepts a concrete length that needs to be bigger or
equal to the minimum length and smaller or equal to the hash algorithm's
hex length. While the former condition is trivial, the latter depends on
the object format used by the current repository. It is thus a variable
upper boundary that may either be 40 (SHA-1) or 64 (SHA-256).

This has two major downsides. First, the user that specifies this config
must be aware of the object hashes that its repository use. If they want
to configure the value globally, then they cannot pick any value in the
range `[41, 64]` if they have any repository that uses SHA-1. If they
did, Git would error out when parsing the config.

Second, and more importantly, parsing "core.abbrev" crashes when outside
of a Git repository because we dereference `the_hash_algo` to figure out
its hex length. Starting with c8aed5e8da (repository: stop setting SHA1
as the default object hash, 2024-05-07) though, we stopped initializing
`the_hash_algo` outside of Git repositories.

Fix both of these issues by not making it an error anymore when the
given length exceeds the hash length. Instead, leave the abbreviated
length intact. `repo_find_unique_abbrev_r()` handles this just fine
except for a performance penalty which we will fix in a subsequent
commit.

Reported-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-12 12:57:18 -07:00
Patrick Steinhardt 1d969afb78 Makefile: add ability to append to CFLAGS and LDFLAGS
There are some usecases where we may want to append CFLAGS to the
default CFLAGS set by Git. This could for example be to enable or
disable specific compiler warnings or to change the optimization level
that code is compiled with. This cannot be done without overriding the
complete CFLAGS value though and thus requires the user to redeclare the
complete defaults used by Git.

Introduce a new variable `CFLAGS_APPEND` that gets appended to the
default value of `CFLAGS`. As compiler options are last-one-wins, this
fulfills both of the usecases mentioned above. It's also common practice
across many other projects to have such a variable.

While at it, also introduce a matching `LDFLAGS_APPEND` variable. While
there isn't really any need for this variable as there are no default
`LDFLAGS`, users may expect this variable to exist, as well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 16:11:43 -07:00
Taylor Blau e162aed591 pack-revindex.c: guard against out-of-bounds pack lookups
The function midx_key_to_pack_pos() is a helper function used by
midx_to_pack_pos() and midx_pair_to_pack_pos() to translate a (pack,
offset) tuple into a position into the MIDX pseudo-pack order.

Ensure that the pack ID given to midx_pair_to_pack_pos() is bounded by
the number of packs within the MIDX to prevent, for instance,
uninitialized memory from being used as a pack ID.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 16:08:28 -07:00
Taylor Blau ed4a1d6ae1 pack-bitmap.c: avoid uninitialized `pack_int_id` during reuse
When performing multi-pack reuse, reuse_partial_packfile_from_bitmap()
is responsible for generating an array of bitmapped_pack structs from
which to perform reuse.

In the multi-pack case, we loop over the MIDXs packs and copy the result
of calling `nth_bitmapped_pack()` to construct the list of reusable
paths.

But we may also want to do pack-reuse over a single pack, either because
we only had one pack to perform reuse over (in the case of single-pack
bitmaps), or because we explicitly asked to do single pack reuse even
with a MIDX[^1].

When this is the case, the array we generate of reusable packs contains
only a single element, which is either (a) the pack attached to the
single-pack bitmap, or (b) the MIDX's preferred pack.

In 795006fff4 (pack-bitmap: gracefully handle missing BTMP chunks,
2024-04-15), we refactored the reuse_partial_packfile_from_bitmap()
function and stopped assigning the pack_int_id field when reusing only
the MIDX's preferred pack. This results in an uninitialized read down in
try_partial_reuse() like so:

    ==7474==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55c5cd191dde in try_partial_reuse pack-bitmap.c:1887:8
    #1 0x55c5cd191dde in reuse_partial_packfile_from_bitmap_1 pack-bitmap.c:2001:8
    #2 0x55c5cd191dde in reuse_partial_packfile_from_bitmap pack-bitmap.c:2105:3
    #3 0x55c5cce0bd0e in get_object_list_from_bitmap builtin/pack-objects.c:4043:3
    #4 0x55c5cce0bd0e in get_object_list builtin/pack-objects.c:4156:27
    #5 0x55c5cce0bd0e in cmd_pack_objects builtin/pack-objects.c:4596:3
    #6 0x55c5ccc8fac8 in run_builtin git.c:474:11

which happens when try_partial_reuse() tries to call
midx_pair_to_pack_pos() when it tries to reject cross-pack deltas.

Avoid the uninitialized read by ensuring that the pack_int_id field is
set in the single-pack reuse case by setting it to either the MIDX
preferred pack's pack_int_id, or '-1', in the case of single-pack
bitmaps.  In the latter case, we never read the pack_int_id field, so
the choice of '-1' is intentional as a "garbage in, garbage out"
measure.

Guard against further regressions in this area by adding a test which
ensures that we do not throw out deltas from the preferred pack as
"cross-pack" due to an uninitialized pack_int_id.

[^1]: This can happen for a couple of reasons, either because the
  repository is configured with 'pack.allowPackReuse=(true|single)', or
  because the MIDX was generated prior to the introduction of the BTMP
  chunk, which contains information necessary to perform multi-pack
  reuse.

Reported-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 16:08:28 -07:00
Taylor Blau 0c5a62f14b midx-write.c: do not read existing MIDX with `packs_to_include`
Commit d6a8c58675 (midx-write.c: support reading an existing MIDX with
`packs_to_include`, 2024-05-29) changed the MIDX generation machinery to
support reading from an existing MIDX when writing a new one.

Unfortunately, the rest of the MIDX generation machinery is not prepared
to deal with such a change. For instance, the function responsible for
adding to the object ID fanout table from a MIDX source
(midx_fanout_add_midx_fanout()) will gladly add objects from an existing
MIDX for some fanout level regardless of whether or not those objects
came from packs that are to be included in the subsequent MIDX write.

This results in broken pseudo-pack object order (leading to incorrect
object traversal results) and segmentation faults, like so (generated by
running the added test prior to the changes in midx-write.c):

    #0  0x000055ee31393f47 in midx_pack_order (ctx=0x7ffdde205c70) at midx-write.c:590
    #1  0x000055ee31395a69 in write_midx_internal (object_dir=0x55ee32570440 ".git/objects",
        packs_to_include=0x7ffdde205e20, packs_to_drop=0x0, preferred_pack_name=0x0,
        refs_snapshot=0x0, flags=15) at midx-write.c:1171
    #2  0x000055ee31395f38 in write_midx_file_only (object_dir=0x55ee32570440 ".git/objects",
        packs_to_include=0x7ffdde205e20, preferred_pack_name=0x0, refs_snapshot=0x0, flags=15)
        at midx-write.c:1274
    [...]

In stack frame #0, the code on midx-write.c:590 is using the new pack ID
corresponding to some object which was added from the existing MIDX.
Importantly, the pack from which that object was selected in the
existing MIDX does not appear in the new MIDX as it was excluded via
`--stdin-packs`.

In this instance, the pack in question had pack ID "1" in the existing
MIDX, but since it was excluded from the new MIDX, we never filled in
that entry in the pack_perm table, resulting in:

    (gdb) p *ctx->pack_perm@2
    $1 = {0, 1515870810}

Which is what causes the segfault above when we try and read:

    struct pack_info *pack = &ctx->info[ctx->pack_perm[i]];
    if (pack->bitmap_pos == BITMAP_POS_UNKNOWN)
        pack->bitmap_pos = 0;

Fundamentally, we should be able to read information from an existing
MIDX when generating a new one. But in practice the midx-write.c code
assumes that we won't run into issues like the above with incongruent
pack IDs, and often makes those assumptions in extremely subtle and
fragile ways.

Instead, let's avoid reading from an existing MIDX altogether, and stick
with the pre-d6a8c58675 implementation. Harden against any regressions
in this area by adding a test which demonstrates these issues.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 16:08:28 -07:00
Patrick Steinhardt fbf7a46d88 builtin/blame: fix leaking ignore revs files
When parsing the blame configuration we add "blame.ignoreRevsFile"
configs to a string list. This string list is declared as with `NODUP`,
and thus we hand over the allocated string to that list. We eventually
end up calling `string_list_clear()` on that list, but due to it being
declared as `NODUP` we will not release the associated strings and thus
leak memory.

Fix this issue by setting up the list as `DUP` instead and free the
config string after insertion.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
Patrick Steinhardt 3332f35577 builtin/blame: fix leaking prefixed paths
In `cmd_blame()` we compute prefixed paths by calling `add_prefix()`,
which itself calls `prefix_path()`. While `prefix_path()` returns an
allocated string, `add_prefix()` pretends to return a constant string.
Consequently, this path never gets freed.

Fix the return type to be `char *` and free the path to plug the memory
leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
Patrick Steinhardt ee6a998583 blame: fix leaking data for blame scoreboards
There are some memory leaks when cleaning up blame scoreboards. Fix
those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
Patrick Steinhardt 4b4f5a911c line-range: plug leaking find functions
In `parse_range_funcname()` we may end up allocating a "find function",
but never free it. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
Patrick Steinhardt 44ec7c575f merge: fix leaking merge bases
When calling either the recursive or the ORT merge machineries we need
to provide a list of merge bases. The ownership of that parameter is
then implicitly transferred to the callee, which is somewhat fishy.
Furthermore, that list may leak in some cases where the merge machinery
runs into an error, thus causing a memory leak.

Refactor the code such that we stop transferring ownership. Instead, the
merge machinery will now create its own local copies of the passed in
list as required if they need to modify the list. Free the list at the
callsites as required.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:08 -07:00
Patrick Steinhardt 77241a6b5e builtin/merge: fix leaking `struct cmdnames` in `get_strategy()`
In "builtin/merge.c" we use the helper infrastructure to figure out what
merge strategies there are. We never free contents of the `cmdnames`
structures though and thus leak their memory.

Fix this by exposing the already existing `clean_cmdnames()` function to
release their memory. As this name isn't quite idiomatic, rename it to
`cmdnames_release()` while at it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
Patrick Steinhardt 6e95f4ee03 sequencer: fix memory leaks in `make_script_with_merges()`
Fix some trivial memory leaks in `make_script_with_merges()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
Patrick Steinhardt 8909d6e1a1 builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()`
In `wanted_peer_refs()` we first create a copy of the "HEAD" ref. This
copy may not actually be passed back to the caller, but is not getting
freed in this case. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
Patrick Steinhardt 4806c55c86 apply: fix leaking string in `match_fragment()`
Before calling `update_pre_post_images()`, we call `strbuf_detach()` to
put its buffer into a new string variable that we then pass to that
function. Besides being rather pointless, it also causes us to leak
memory of that variable because we never free it.

Get rid of the variable altogether and instead reach into the `strbuf`
directly. While at it, refactor the code to have a common exit path and
mark string that do not contain allocated memory as constant.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
Patrick Steinhardt 1e5c1601f9 sequencer: fix leaking string buffer in `commit_staged_changes()`
We're leaking the `rev` string buffer in various call paths. Refactor
the function to have a common exit path so that we can release its
memory reliably.

This fixes a subset of tests failing with the memory sanitizer in t3404.
But as there are more failures, we cannot yet mark the whole test suite
as passing.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
Patrick Steinhardt 63c9bd372e commit: fix leaking parents when calling `commit_tree_extended()`
When creating commits via `commit_tree_extended()`, the caller passes in
a string list of parents. This call implicitly transfers ownership of
that list to the function, which is quite surprising to begin with. But
to make matters worse, `commit_tree_extended()` doesn't even bother to
free the list of parents in error cases. The result is a memory leak,
and one that the caller cannot fix by themselves because they do not
know whether parts of the string list have already been released.

Refactor the code such that callers can keep ownership of the list of
parents, which is getting indicated by parameter being a constant
pointer now. Free the lists at the calling site and add a common exit
path to those sites as required.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
Patrick Steinhardt c6eb58bfb1 config: fix leaking "core.notesref" variable
The variable used to track the "core.notesref" config is not getting
freed before we assign to it and thus leaks. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:07 -07:00
Patrick Steinhardt f46ede661f rerere: fix various trivial leaks
We leak various different string lists in the rerere code. Free those to
plug them.

Note that the `merge_rr` variable is intentionally being free'd with the
`free_util` parameter set to 1. The `util` field is used there to store
the IDs of every rerere item and thus needs to be freed, as well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
Patrick Steinhardt 748bd0943b builtin/stash: fix leak in `show_stash()`
We leak the `revision_args()` variable. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
Patrick Steinhardt a90a089611 revision: free diff options
There is a todo comment in `release_revisions()` that mentions that we
need to free the diff options, which was added via 54c8a7c379 (revisions
API: add a TODO for diff_free(&revs->diffopt), 2022-04-14). Releasing
the diff options wasn't quite feasible at that time because some call
sites rely on its contents to remain even after the revisions have been
released.

In fact, there really only are a couple of callsites that misbehave
here:

  - `cmd_shortlog()` releases the revisions, but continues to access its
    file pointer.

  - `do_diff_cache()` creates a shallow copy of `struct diff_options`,
    but does not set the `no_free` member. Consequently, we end up
    releasing resources of the caller-provided diff options.

  - `diff_free()` and friends do not play nice when being called
    multiple times as they don't unset data structures that they have
    just released.

Fix all of those cases and enable the call to `diff_free()`, which plugs
a bunch of memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
Patrick Steinhardt a282dbeba7 builtin/log: fix leaking commit list in git-cherry(1)
We're storing the list of commits that git-cherry(1) is about to print
into a temporary list. This list is never getting free'd and thus leaks.
Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
Patrick Steinhardt 8ff6bd4750 merge-recursive: fix memory leak when finalizing merge
We do not free some members of `struct merge_options`' private data.
Fix this to plug those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
Patrick Steinhardt 3199b22e7d builtin/merge-recursive: fix leaking object ID bases
In `cmd_merge_recursive()` we have a static array of object ID bases
that we pass to `merge_recursive_generic()`. This interface is somewhat
weird though because the latter function accepts a pointer to a pointer
of object IDs, which requires us to allocate the object IDs on the heap.
And as we never free those object IDs, the end result is a leak.

While we can easily solve this leak by just freeing the respective
object IDs, the whole calling convention is somewhat weird. Instead,
refactor `merge_recursive_generic()` to accept a plain pointer to object
IDs so that we can avoid allocating them altogether.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
Patrick Steinhardt 9e903a5531 builtin/difftool: plug memory leaks in `run_dir_diff()`
We're leaking a bunch of memory leaks in `run_dir_diff()`. Plug them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:06 -07:00
Patrick Steinhardt f87c55c264 object-name: free leaking object contexts
While it is documented in `struct object_context::path` that this
variable needs to be released by the caller, this fact is rather easy to
miss given that we do not ever provide a function to release the object
context. And of course, while some callers dutifully release the path,
many others don't.

Introduce a new `object_context_release()` function that releases the
path. Convert callsites that used to free the path to use that new
function and add missing calls to callsites that were leaking memory.
Refactor those callsites as required to have a single return path, only.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
Patrick Steinhardt 61f8bb1ec1 builtin/rev-list: fix leaking bitmap index when calculating disk usage
git-rev-list(1) can speed up its object size calculations for reachable
objects via a bitmap walk, if there is any bitmap. This is done in
`try_bitmap_disk_usage()`, which tries to optimistically load the bitmap
and then use it, if available. It never frees it though, leading to a
memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
Patrick Steinhardt f644dc8494 notes: fix memory leak when pruning notes
In `prune_notes()` we first store the notes that are to be deleted in a
local list, and then iterate through that list to delete those notes one
by one. We never free the list though and thus leak its memory. Fix
this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
Patrick Steinhardt 9748537437 revision: fix leaking display notes
We never free the display notes options embedded into `struct revision`.
Implement a new function `release_display_notes()` that we can call in
`release_revisions()` to fix this.

There is another gotcha here though: we play some games with the string
list used to track extra notes refs, where we sometimes set the bit that
indicates that strings should be strdup'd and sometimes unset it. This
dance is done to avoid a copy of an already-allocated string when we
call `enable_ref_display_notes()`. But this dance is rather pointless as
we can instead call `string_list_append_nodup()` to transfer ownership
of the allocated string to the list.

Refactor the code to do so and drop the `strdup_strings` dance.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
Patrick Steinhardt 3d31d38255 merge-recursive: fix leaking rename conflict info
When computing rename conflicts in our recursive merge algorithm we set
up `struct rename_conflict_info`s to track that information. We never
free those data structures though and thus leak memory.

We need to be a bit more careful here though because the same rename
conflict info can be assigned to multiple structures. Accommodate for
this by introducing a `rename_conflict_info_owned` bit that we can use
to steer whether or not the rename conflict info shall be free'd.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
Patrick Steinhardt afb0653d23 biultin/rev-parse: fix memory leaks in `--parseopt` mode
We have a bunch of memory leaks in git-rev-parse(1)'s `--parseopt` mode.
Refactor the code to use `struct strvec`s to make it easier for us to
track the lifecycle of those leaking variables and then free them.

While at it, remove the unneeded static lifetime for some of the
variables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
Patrick Steinhardt 11ee9a75e7 bundle: plug leaks in `create_bundle()`
When creating a bundle, we set up a revision walk, but never release
data associated with it. Furthermore, we create a mostly-shallow copy of
that revision walk where we only adapt its pending objects such that we
can reuse the walk. While that copy must not be released, the pending
objects array need to be.

Plug those memory leaks by releasing the revision walk and the pending
objects of the copied revision walk.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
Patrick Steinhardt bb8c43d5cd notes-utils: free note trees when releasing copied notes
While we clear most of the members of `struct notes_rewrite_cfg` in
`finish_copy_notes_for_rewrite()`, we do not clear the notes tree. Fix
this to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:05 -07:00
Patrick Steinhardt 14da26230a parse-options: fix leaks for users of OPT_FILENAME
The `OPT_FILENAME()` option will, if set, put an allocated string into
the user-provided variable. Consequently, that variable thus needs to be
free'd by the caller of `parse_options()`. Some callsites don't though
and thus leak memory. Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:04 -07:00
Patrick Steinhardt 56931c4d89 revision: fix memory leak when reversing revisions
When reversing revisions in a rev walk, `get_revision()` will allocate a
new commit list and assign it to `revs->commits`. It does not free the
old list though, which makes it leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11 13:15:04 -07:00
Junio C Hamano 8d94cfb545 The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 10:30:39 -07:00
Junio C Hamano 5235e56ea5 Merge branch 'jk/leakfixes'
Memory leaks in "git mv" has been plugged.

* jk/leakfixes:
  mv: replace src_dir with a strvec
  mv: factor out empty src_dir removal
  mv: move src_dir cleanup to end of cmd_mv()
  t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL
  t-strvec: use va_end() to match va_start()
2024-06-10 10:30:39 -07:00
Junio C Hamano 718b50e3bf Merge branch 'iw/trace-argv-on-alias'
The alias-expanded command lines are logged to the trace output.

* iw/trace-argv-on-alias:
  run-command: show prepared command
  Documentation: alias: add notes on shell expansion
  Documentation: alias: rework notes into points
2024-06-10 10:30:38 -07:00
René Scharfe d7b97b7185 diff: let external diffs report that changes are uninteresting
The options --exit-code and --quiet instruct git diff to indicate
whether it found any significant changes by exiting with code 1 if it
did and 0 if there were none.  Currently this doesn't work if external
diff programs are involved, as we have no way to learn what they found.

Add that ability in the form of the new configuration options
diff.trustExitCode and diff.<driver>.trustExitCode and the environment
variable GIT_EXTERNAL_DIFF_TRUST_EXIT_CODE.  They pair with the config
options diff.external and diff.<driver>.command and the environment
variable GIT_EXTERNAL_DIFF, respectively.

The new options are off by default, keeping the old behavior.  Enabling
them indicates that the external diff returns exit code 1 if it finds
significant changes and 0 if it doesn't, like diff(1).

The name of the new options is taken from the git difftool and mergetool
options of similar purpose.  (There they enable passing on the exit code
of a diff tool and to infer whether a merge done by a merge tool is
successful.)

The new feature sets the diff flag diff_from_contents in
diff_setup_done() if we need the exit code and are allowed to call
external diffs.  This disables the optimization that avoids calling the
program with --quiet.  Add it back by skipping the call if the external
diff is not able to report empty diffs.  We can only do that check after
evaluating the file-specific attributes in run_external_diff().

If we do run the external diff with --quiet, send its output to
/dev/null.

I considered checking the output of the external diff to check whether
its empty.  It was added as 11be65cfa4 (diff: fix --exit-code with
external diff, 2024-05-05) and quickly reverted, as it does not work
with external diffs that do not write to stdout.  There's no reason why
a graphical diff tool would even need to write anything there at all.

I also considered using a non-zero exit code for empty diffs, which
could be done without adding new configuration options.  We'd need to
disable the optimization that allows git diff --quiet to skip calling
external diffs, though -- that might be quite surprising if graphical
diff programs are involved.  And assigning the opposite meaning of the
exit codes compared to diff(1) and git diff --exit-code to the external
diff can cause unnecessary confusion.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 09:20:46 -07:00
René Scharfe 54443bbfc3 userdiff: add and use struct external_diff
Wrap the string specifying the external diff command in a new struct to
simplify adding attributes, which the next patch will do.

Make sure external_diff() still returns NULL if neither the environment
variable GIT_EXTERNAL_DIFF nor the configuration option diff.external is
set, to continue allowing its use in a boolean context.

Use a designated initializer for the default builtin userdiff driver to
adjust to the type change of the second struct member.  Spelling out
only the non-zero members improves readability as a nice side-effect.

No functional change intended.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 09:19:20 -07:00
René Scharfe 33be6cf51a t4020: test exit code with external diffs
Add tests to check the exit code of git diff with its options --quiet
and --exit-code when using an external diff program.  Currently we
cannot tell whether it found significant changes or not.

While at it, document briefly that --quiet turns off execution of
external diff programs because that behavior surprised me for a moment
while writing the tests.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 09:19:20 -07:00
Junio C Hamano 99c7de732e __attribute__: add a few missing format attributes
A public function mem_pool_strfmt() takes printf like parameters,
but is not given an attribute as such.  Also a few file-scope static
functions were missing their format attribute.

Add them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10 09:16:30 -07:00