macOS with fsmonitor daemon can hang forever when a submodule is
involved, which has been corrected.
* kn/osx-fsmonitor-with-submodules-fix:
fsmonitor OSX: fix hangs for submodules
fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
to have no trailing '/' or '.'. For a submodule, fsmonitor_run_daemon() sets
the value with trailing "/." (as repo_get_git_dir(the_repository) on
Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
IS_OUTSIDE_CONE.
In this case, fsevent_callback() doesn't update cookie_list so that
fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
not invoked.
As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
that with_lock__mark_cookies_seen() should unlock, the whole daemon
hangs.
Remove trailing "/." from state->path_gitdir_watch.buf for submodules
and add a corresponding test in t7527-builtin-fsmonitor.sh. The test is
disabled for MINGW because hangs treated with this patch occur only for
Darwin and there is no simple way to terminate the win32 fsmonitor
daemon that hangs.
Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
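The normalization described in the fsmonitor fix above can be illustrated
with a small standalone sketch. The helper name strip_trailing_dot_dir()
is hypothetical and this is not the code from the patch; it only shows
the idea of turning a path ending in "/." into one with no trailing '/'
or '.', assuming a plain NUL-terminated buffer rather than a strbuf.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not Git's actual code): normalize a path that
 * ends in "/." so that it has neither a trailing '.' nor a trailing
 * '/'. Works in place and returns the new length. A lone "/" is kept.
 */
size_t strip_trailing_dot_dir(char *path)
{
	size_t len = strlen(path);

	/* "dir/." -> "dir/" */
	if (len >= 2 && path[len - 2] == '/' && path[len - 1] == '.')
		len--;
	/* "dir/" -> "dir", but never shrink "/" to the empty string */
	while (len > 1 && path[len - 1] == '/')
		len--;
	path[len] = '\0';
	return len;
}
```

With this sketch, "/repo/sub/." becomes "/repo/sub", while a path that
is already clean is left untouched.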
Leakfixes.
* jk/http-leakfixes: (28 commits)
http-push: clean up local_refs at exit
http-push: clean up loose request when falling back to packed
http-push: clean up objects list
http-push: free xml_ctx.cdata after use
http-push: free remote_ls_ctx.dentry_name
http-push: free transfer_request strbuf
http-push: free transfer_request dest field
http-push: free curl header lists
http-push: free repo->url string
http-push: clear refspecs before exiting
http-walker: free fake packed_git list
remote-curl: free HEAD ref with free_one_ref()
http: stop leaking buffer in http_get_info_packs()
http: call git_inflate_end() when releasing http_object_request
http: fix leak of http_object_request struct
http: fix leak when redacting cookies from curl trace
transport-helper: fix leak of dummy refs_list
fetch-pack: clear pack lockfiles list
fetch: free "raw" string when shrinking refspec
transport-helper: fix strbuf leak in push_refs_with_push()
...
When "git sparse-checkout disable" turns a sparse checkout into a
regular checkout, the index is fully expanded. This totally
expected behaviour however had an "oops, we are expanding the
index" advice message, which has been corrected.
* ds/sparse-checkout-expansion-advice:
sparse-checkout: disable advice in 'disable'
Background tasks "git maintenance" runs may need to use credential
information when going over the network, but a credential helper
may work only in an interactive environment, and end up blocking a
scheduled task waiting for UI. Credential helpers can now behave
differently when they are not running interactively.
* ds/background-maintenance-with-credential:
scalar: configure maintenance during 'reconfigure'
maintenance: add custom config to background jobs
credential: add new interactive config option
When a subprocess to work in a submodule spawned by "git submodule"
fails with SIGPIPE, the parent Git process caught the death of it,
but gave a generic "failed to work in that submodule", which was
misleading. We now behave as if the parent got SIGPIPE and die.
* pw/submodule-process-sigpipe:
submodule status: propagate SIGPIPE
The push reports, which report failures to the user when pushing a
reference, leak in several places. Plug these leaks by introducing a new
function `ref_push_report_free()` that frees the list of reports and
call it as required. While at it, fix a trivially leaking error string
in the vicinity.
These leaks get hit in t5411, but plugging them does not make the whole
test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the return parameter of `write_rev_file_order()` is a string
constant, the function may indeed return an allocated string when its
first parameter is a `NULL` pointer. This makes for a confusing calling
convention, where callers need to be aware of these intricate ownership
rules and cast away the constness to free the string in some cases.
Adapt the function and its caller `write_rev_file()` to always return an
allocated string and adapt callers to always free the return value.
Note that this requires us to also adapt `rename_tmp_packfile()`, which
compares the pointers to packfile data with each other. Now that the
path of the reverse index file gets allocated unconditionally the check
will always fail. This is fixed by using strcmp(3P) instead, which also
feels way less fragile.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
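The ownership change described above, and why the pointer comparison had
to become a string comparison, can be sketched as follows. The names
rev_file_name() and same_path() and the fallback file name are
hypothetical stand-ins, not the real `write_rev_file_order()` or
`rename_tmp_packfile()`.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the new calling convention: the result is
 * always heap-allocated, whether or not the caller supplied a name, so
 * the caller can always free() it.
 */
char *rev_file_name(const char *requested)
{
	const char *src = requested ? requested : "tmp_rev_placeholder";
	char *copy = malloc(strlen(src) + 1);

	if (copy)
		strcpy(copy, src);
	return copy;
}

/*
 * Once the path is always a fresh allocation, pointer equality with the
 * caller's string can never hold, so compare the contents instead.
 */
int same_path(const char *a, const char *b)
{
	return a && b && !strcmp(a, b);
}
```

A caller that previously checked `returned == requested` would now see
that check fail every time, which is the situation the message describes.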
We leak the config values when `gpg_sign` or `strategy` options are
being overridden via the command line. To fix this we need to free the
old value, which requires us to figure out whether the value was changed
via an option in the first place. The easy way to do this, which is to
initialize local variables with `NULL`, doesn't work because we cannot
tell the case where the user has passed e.g. `--no-gpg-sign`. Instead,
we use a sentinel value for both values that we can compare against to
check whether the user has passed the option.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
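A minimal sketch of the sentinel technique described above, under the
assumption that the option value arrives as a plain string pointer. The
names opt_unset, dup_string() and resolve_option() are hypothetical and
do not come from the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Sentinel: only its address matters; it marks "option not passed". */
const char opt_unset[] = "";

/* Small strdup() replacement so the sketch stays within standard C. */
char *dup_string(const char *s)
{
	char *copy = malloc(strlen(s) + 1);

	if (copy)
		strcpy(copy, s);
	return copy;
}

/*
 * cfg: heap-allocated value read from configuration (may be NULL);
 *      ownership is handed to this function.
 * opt: opt_unset if the option was not given, NULL for the negated
 *      form (e.g. --no-gpg-sign), otherwise the command-line value.
 * Returns the effective heap-allocated value, or NULL.
 */
char *resolve_option(char *cfg, const char *opt)
{
	if (opt == opt_unset)
		return cfg;	/* not overridden: keep the config value */
	free(cfg);		/* overridden: release the old value */
	return opt ? dup_string(opt) : NULL;
}
```

Initializing the local variable to `NULL` instead would make "not
passed" and "negated" indistinguishable, which is the problem the
sentinel avoids.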
When cloning with bundle URIs we re-initialize `the_repository` after
having fetched the bundle. This causes a bunch of memory leaks though
because we do not release its previous state.
These leaks can be plugged by calling `repo_clear()` before we call
`repo_init()`. But this causes another issue because the remote that we
used is tied to the lifetime of the repository's remote state, which
would also get released. We thus have to make sure that it does not get
free'd under our feet.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are various different memory leaks in git-pack-redundant(1),
mostly caused by not even trying to free allocated memory. Fix them.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `OPT_PATHSPEC_FROM_FILE()` option maps to `OPT_FILENAME()`, which we
know will always allocate memory when passed. We never free the memory
though, causing a memory leak. Plug it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're leaking the args vector in git-annotate(1) because we never clear
it. Fixing it isn't as easy as calling `strvec_clear()` though because
calling `cmd_blame()` will cause the underlying array to be modified.
Instead, we also need to pass a shallow copy of the argv array to the
function.
Do so to plug the memory leaks.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `get_schedule_cmd()` function allows us to override the schedule
command with a specific test command such that we can verify the
underlying logic in a platform-independent way. Its memory management is
somewhat wild though, because it basically gives up and assigns an
allocated string to the string constant output pointer. While this part
is marked with `UNLEAK()` to mask this, we also leak the local string
lists.
Rework the function such that it has a separate out parameter. If set,
we will assign it the final allocated command. Plug the other memory
leaks and create a common exit path.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing the maintenance strategy from config we allocate a config
string, but do not free it after parsing it. Plug this leak by instead
using `git_config_get_string_tmp()`, which does not allocate any memory.
This leak is exposed by t7900, but plugging it alone does not make the
test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are several leaking data structures in git-difftool(1). Plug them.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When repacking, we assemble git-pack-objects(1) arguments both for the
"normal" pack and for the cruft pack. This configuration gets populated
with a bunch of `OPT_PASSTHRU` options that we end up passing to the
child process. These options are allocated, but never free'd.
Create a new `pack_objects_args_release()` function that releases the
memory for us and call it for both sets of options.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `opt_ff` field gets populated either via `OPT_PASSTHRU` via
`config_get_ff()` or when `--rebase` is passed. So we sometimes end up
overriding the value in `opt_ff` with another value, but we do not free
the old value, causing a memory leak.
Adapt the type of the variable to be `char *` and consistently assign
allocated strings to it such that we can easily free it when it is being
overridden.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `update_submodule()` fails we return with `die_message()`, which
only causes us to print the same message as `die()` would without
actually causing the process to die. We don't free memory in that case
and thus leak memory.
Fix the leak by freeing the remote ref.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix leaking error buffer when `compute_alternate_path()` fails.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `runcommand_in_submodule_cb()` we may end up not executing the child
command when `argv` is empty. But we still populate the command with
environment variables and other things, which needs cleanup. This leads
to a memory leak because we do not call `finish_command()`.
Fix this by clearing the child process when we don't execute it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're not freeing the submodule update strategy command. Provide a
helper function that does this for us and call it in
`update_data_release()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `html_path` variable gets populated via `git_help_config()`, which
puts an allocated string into it if its value has been configured. We do
not clear the old value though, which causes a memory leak in case the
config exists multiple times.
Plug this leak. The leak is exposed by t0012, but plugging it alone is
not sufficient to make the test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `get_html_page_path()` we may end up assigning the return value of
`system_path()` to the global `html_path` variable. But as we also
assign the returned value to `to_free`, we will deallocate its memory
upon returning from the function. Consequently, `html_path` will now
point to deallocated memory.
Fix this issue by instead assigning the value to a separate local
variable.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable backend learned to more efficiently handle exclude
patterns while enumerating the refs.
* ps/reftable-exclude:
refs/reftable: wire up support for exclude patterns
reftable/reader: make table iterator reseekable
t/unit-tests: introduce reftable library
Makefile: stop listing test library objects twice
builtin/receive-pack: fix exclude patterns when announcing refs
refs: properly apply exclude patterns to namespaced refs
If the --lock-pack option is passed (which it typically is when
fetch-pack is used under the hood by smart-http), then we may end up
with entries in our pack_lockfiles string_list. We need to clear them
before returning to avoid a leak.
In git-fetch this isn't a problem, since the same cleanup happens via
transport_unlock_pack(). But the leak is detectable in t5551, which does
http fetches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--prefetch" option to git-fetch modifies the default refspec,
including eliminating some entries entirely. When we drop an entry we
free the strings in the refspec_item, but we forgot to free the matching
string in the "raw" array of the refspec struct. There's no behavioral
bug here (since we correctly shrink the raw array, too), but we're
leaking the allocated string.
Let's add in the leak-fix, and while we're at it drop "const" from
the type of the raw string array. These strings are always allocated by
refspec_append(), etc, and this makes the memory ownership more clear.
This is all a bit more intimate with the refspec code than I'd like, and
I suspect it would be better if each refspec_item held on to its own raw
string, we had a single array, and we could use refspec_item_clear() to
clean up everything. But that's a non-trivial refactoring, since
refspec_item structs can be held outside of a "struct refspec", without
having a matching raw string at all. So let's leave that for now and
just fix the leak in the most immediate way.
This lets us mark t5582 as leak-free.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
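The "free the raw string when shrinking" step above can be sketched with
a simplified structure. struct refspec_sketch and its helpers are
hypothetical and far smaller than the real refspec code; the point is
only that the array owns its strings (hence no "const") and that
dropping an entry must free the string before closing the gap.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the "raw" array of a refspec. */
struct refspec_sketch {
	char **raw;	/* owned strings, so the element type is not const */
	size_t nr;
};

/* Append a copy of `s`; returns 0 on success, -1 on allocation failure. */
int refspec_sketch_append(struct refspec_sketch *rs, const char *s)
{
	char **grown = realloc(rs->raw, (rs->nr + 1) * sizeof(*grown));
	char *copy;

	if (!grown)
		return -1;
	rs->raw = grown;
	copy = malloc(strlen(s) + 1);
	if (!copy)
		return -1;
	strcpy(copy, s);
	rs->raw[rs->nr++] = copy;
	return 0;
}

/* Drop entry `i`: free its string first, then shift the tail down. */
void refspec_sketch_drop(struct refspec_sketch *rs, size_t i)
{
	if (i >= rs->nr)
		return;
	free(rs->raw[i]);
	memmove(&rs->raw[i], &rs->raw[i + 1],
		(rs->nr - i - 1) * sizeof(*rs->raw));
	rs->nr--;
}

/* Release everything the structure still owns. */
void refspec_sketch_clear(struct refspec_sketch *rs)
{
	size_t i;

	for (i = 0; i < rs->nr; i++)
		free(rs->raw[i]);
	free(rs->raw);
	rs->raw = NULL;
	rs->nr = 0;
}
```

Omitting the free() in refspec_sketch_drop() would still leave the array
correctly shrunk, which matches the "no behavioral bug, but a leak"
description.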
The send-pack --force-with-lease option populates a push_cas_option
struct with allocated strings. Exiting without cleaning this up will
cause leak-checkers to complain.
We can fix this by calling clear_cas_option(), after making it publicly
available. Previously it was used only for resetting the list when we
saw --no-force-with-lease.
The git-push command has the same "leak", though in this case it won't
trigger a leak-checker since it stores the push_cas_option struct as a
global rather than on the stack (and is thus reachable even after main()
exits). I've added cleanup for it here anyway, though, as
future-proofing.
The leak is triggered by t5541 (it tests --force-with-lease over http,
which requires a separate send-pack process under the hood), but we
can't mark it as leak-free yet.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we call get_remote_heads() for protocol v0, that may populate the
"shallow" oid_array, which must be cleaned up to avoid a leak at the
program exit. The same problem exists for both fetch-pack and send-pack,
but not for the usual transport.c code paths, since we already do this
cleanup in disconnect_git().
Fixing this lets us mark t5542 as leak-free for the send-pack side, but
fetch-pack will need some more fixes before we can do the same for
t5539.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our fetch_pack_args holds a filter_options struct that may be populated
with allocated strings by the "--filter" command-line option. We
must free it before exiting to avoid a leak when the program exits.
The usual fetch code paths that use transport.c don't have the same
leak, because we do the cleanup in disconnect_git().
Fixing this leak lets us mark t5500 as leak-free.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling `fetch_pack()` the caller is expected to pass in a set of
sought-after refs that they want to fetch. This array gets massaged to
not contain duplicate entries, which is done by replacing duplicate refs
with `NULL` pointers. This modifies the caller-provided array, and in
case we do unset any pointers the caller now loses track of that ref and
cannot free it anymore.
Now the obvious fix would be to not only unset these pointers, but to
also free their contents. But this doesn't work because callers continue
to use those refs. Another potential solution would be to copy the array
in `fetch_pack()` so that we don't modify the caller-provided one. But
that doesn't work either because the NULL-ness of those entries is used
by callers to skip over ref entries that we didn't even try to fetch in
`report_unmatched_refs()`.
Instead, we make it the responsibility of our callers to duplicate these
arrays as needed. It ain't pretty, but it works to plug the memory leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
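The caller-side duplication described above can be sketched with plain
string pointers standing in for refs. Both function names are
hypothetical; the real code operates on `struct ref` pointers, which
this sketch does not model.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the dedup step: replace repeated names with
 * NULL in the array that was passed in. Returns how many were unset.
 */
size_t unset_duplicates(const char **sought, size_t nr)
{
	size_t i, j, unset = 0;

	for (i = 0; i < nr; i++) {
		if (!sought[i])
			continue;
		for (j = i + 1; j < nr; j++) {
			if (sought[j] && !strcmp(sought[i], sought[j])) {
				sought[j] = NULL;
				unset++;
			}
		}
	}
	return unset;
}

/*
 * Caller side: make a shallow copy of the pointer array and pass the
 * copy down. The original array keeps every pointer, so each entry can
 * still be freed later even if the callee unsets it in the copy.
 */
const char **copy_pointer_array(const char *const *src, size_t nr)
{
	const char **copy = malloc((nr ? nr : 1) * sizeof(*copy));

	if (copy && nr)
		memcpy(copy, src, nr * sizeof(*copy));
	return copy;
}
```

The copy is shallow on purpose: the entries themselves are shared, and
only the array that gets `NULL`-ed out is private to the call.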
Fix typos in comments.
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git sparse-checkout disable' with the sparse index
enabled, Git is expected to expand the index into a full index. However,
it currently outputs the advice message saying that that is unexpected
and likely due to an issue with the working directory.
Disable this advice message when in this code path. Establish a pattern
for doing a similar removal in the future.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The convention for calling into built-in command implementations has
been updated to pass the repository, if known, together with the
prefix value.
* jc/pass-repo-to-builtins:
add: pass in repo variable instead of global the_repository
builtin: remove USE_THE_REPOSITORY for those without the_repository
builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.h
builtin: add a repository parameter for builtin functions
Code clean-up.
* ps/environ-wo-the-repository: (21 commits)
environment: stop storing "core.notesRef" globally
environment: stop storing "core.warnAmbiguousRefs" globally
environment: stop storing "core.preferSymlinkRefs" globally
environment: stop storing "core.logAllRefUpdates" globally
refs: stop modifying global `log_all_ref_updates` variable
branch: stop modifying `log_all_ref_updates` variable
repo-settings: track defaults close to `struct repo_settings`
repo-settings: split out declarations into a standalone header
environment: guard state depending on a repository
environment: reorder header to split out `the_repository`-free section
environment: move `set_git_dir()` and related into setup layer
environment: make `get_git_namespace()` self-contained
environment: move object database functions into object layer
config: make dependency on repo in `read_early_config()` explicit
config: document `read_early_config()` and `read_very_early_config()`
environment: make `get_git_work_tree()` accept a repository
environment: make `get_graft_file()` accept a repository
environment: make `get_index_file()` accept a repository
environment: make `get_object_directory()` accept a repository
environment: make `get_git_common_dir()` accept a repository
...
The interpret-trailers command failed to recognise the end of the
message when the commit log ends in an incomplete line.
* bl/trailers-and-incomplete-last-line-fix:
interpret-trailers: handle message without trailing newline
At the moment, some background jobs are getting blocked on credentials
during the 'prefetch' task. This leads to other tasks, such as
incremental repacks, getting blocked. Further, if a user manages to fix
their credentials, then they still need to cancel the background process
before their background maintenance can continue working.
Update the background schedules for our four scheduler integrations to
include these config options via '-c' options:
* 'credential.interactive=false' will stop Git and some credential
helpers from prompting in the UI (assuming the '-c' parameters are
carried through and respected by GCM).
* 'core.askPass=true' will replace the text fallback for a username
and password with the 'true' command, which will return a success in
its exit code, but Git will treat the empty string returned as an
invalid password and move on.
We can do some testing that the credentials are passed, at least in the
systemd case due to writing the service files.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It has been reported that running
git submodule status --recurse | grep -q ^+
results in an unexpected error message
fatal: failed to recurse into submodule $submodule
When "git submodule--helper" recurses into a submodule it creates a
child process. If that process fails then the error message above is
displayed by the parent. In the case above the child is killed by
SIGPIPE as "grep -q" exits as soon as it sees the first match. Fix this
by propagating SIGPIPE so that it is visible to the process running
git. We could propagate other signals but I'm not sure there is much
value in doing that. In the common case of the user pressing Ctrl-C or
Ctrl-\ then SIGINT or SIGQUIT will be sent to the foreground process
group and so the parent process will receive the same signal as the
child.
Reported-by: Matt Liberty <mliberty@precisioninno.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
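The propagation described above can be sketched with plain POSIX calls.
This is not Git's run-command API; run_in_child(), died_of_sigpipe() and
propagate_sigpipe() are hypothetical names, and the child simulates the
broken pipe with raise() instead of writing to a closed pipe.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `fn` in a forked child; store the wait status. 0 on success. */
int run_in_child(void (*fn)(void), int *status)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		fn();
		_exit(0);
	}
	while (waitpid(pid, status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return 0;
}

/* True when the child was killed by SIGPIPE rather than exiting. */
int died_of_sigpipe(int status)
{
	return WIFSIGNALED(status) && WTERMSIG(status) == SIGPIPE;
}

/* Die the same way the child did, so our own parent can see it. */
void propagate_sigpipe(void)
{
	signal(SIGPIPE, SIG_DFL);
	raise(SIGPIPE);
}

/* Example child bodies for the sketch. */
void child_hits_sigpipe(void)
{
	signal(SIGPIPE, SIG_DFL);
	raise(SIGPIPE);
}

void child_exits_cleanly(void)
{
}
```

A parent would call propagate_sigpipe() when died_of_sigpipe() is true
and report an ordinary error for any other failure, so that a pipeline
such as the "grep -q" example ends quietly.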
"git rebase --autostash" failed to resurrect the autostashed
changes when the command gets aborted after giving back control
asking for help in conflict resolution.
* pw/rebase-autostash-fix:
rebase: apply and cleanup autostash when rebase fails to start
Bugfixes and leak plugging in "git for-each-ref --format=..." code
paths.
* jk/ref-filter-trailer-fixes:
ref-filter: fix leak with unterminated %(if) atoms
ref-filter: add ref_format_clear() function
ref-filter: fix leak when formatting %(push:remoteref)
ref-filter: fix leak with %(describe) arguments
ref-filter: fix leak of %(trailers) "argbuf"
ref-filter: store ref_trailer_buf data per-atom
ref-filter: drop useless cast in trailers_atom_parser()
ref-filter: strip signature when parsing tag trailers
ref-filter: avoid extra copies of payload/signature
t6300: drop newline from wrapped test title
Code clean-up.
* jc/range-diff-lazy-setup:
remerge-diff: clean up temporary objdir at a central place
remerge-diff: lazily prepare temporary objdir on demand
In `write_head_info()` we announce references to the remote client. We
need to honor "transfer.hideRefs" here so that we do not announce any
references that the client shouldn't be able to learn about. This is
done via two separate mechanisms:
- We hand over exclude patterns to the reference backend. We can only
honor "plain" exclude patterns here that do not have prefixes with
special meaning such as "^" or "!". Filtering down the references is
handled by `hidden_refs_to_excludes()`.
- In `show_ref_cb()` we perform a second check against hidden refs.
For one this is done such that we can handle those special prefixes.
And second, handling exclude patterns in ref backends is optional,
so we also have to handle "normal" patterns.
The special-meaning "^" prefix alters whether a hidden ref applies to
the namespace-stripped reference name or the full name. So while we
would usually call `refs_for_each_namespaced_ref()` to only get those
references in the current namespace, we can't because we'd get the
already-rewritten reference names. Instead, we are forced to use
`refs_for_each_fullref_in()` and then manually strip away the namespace
prefix such that we have access to both names.
But this also means that we do not get namespace handling for exclude
patterns, which `refs_for_each_namespaced_ref()` brings for free. This
results in a bug because we potentially end up hiding away references
based on their namespaced name and not on the stripped name as we really
should be doing.
Fix this by manually rewriting the exclude patterns to their namespaced
variants.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
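Rewriting exclude patterns to their namespaced variants amounts to
prefixing each plain pattern with the namespace. A minimal sketch,
assuming the namespace string already ends in '/'; namespace_patterns()
and free_patterns() are hypothetical names, not functions from the
patch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Free the first `nr` entries of a pattern array and the array itself. */
void free_patterns(char **patterns, size_t nr)
{
	size_t i;

	if (!patterns)
		return;
	for (i = 0; i < nr; i++)
		free(patterns[i]);
	free(patterns);
}

/*
 * Return a newly allocated array in which every pattern is prefixed
 * with `ns` (assumed to end in '/'). Returns NULL on allocation
 * failure. The caller releases the result with free_patterns().
 */
char **namespace_patterns(const char *ns, const char *const *patterns,
			  size_t nr)
{
	char **out = calloc(nr ? nr : 1, sizeof(*out));
	size_t i;

	if (!out)
		return NULL;
	for (i = 0; i < nr; i++) {
		size_t len = strlen(ns) + strlen(patterns[i]) + 1;

		out[i] = malloc(len);
		if (!out[i]) {
			free_patterns(out, i);
			return NULL;
		}
		snprintf(out[i], len, "%s%s", ns, patterns[i]);
	}
	return out;
}
```

Matching the rewritten patterns against full reference names then has
the same effect as matching the original patterns against the stripped
names.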
A file descriptor left open is now properly closed when "git
sparse-checkout" updates the sparse patterns.
* jk/sparse-fdleak-fix:
sparse-checkout: use fdopen_lock_file() instead of xfdopen()
sparse-checkout: check commit_lock_file when writing patterns
sparse-checkout: consolidate cleanup when writing patterns
"git verify-pack" and "git index-pack" started dying outside a
repository, which has been corrected.
* ps/index-pack-outside-repo-fix:
builtin/index-pack: fix segfaults when running outside of a repo
With the repository variable available in the builtin function as an
argument, pass this down into helper functions instead of using the
global the_repository.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For builtins that do not operate on a repository, remove
the #define USE_THE_REPOSITORY.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of including USE_THE_REPOSITORY_VARIABLE by default on every
builtin, remove it from builtin.h and add it to all the builtins that
include builtin.h (by definition, that means all builtin/*.c).
Also, remove the include statement for repository.h since it gets
brought in through builtin.h.
The next step will be to migrate each builtin
away from having to use the_repository.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to reduce the usage of the global the_repository, add a
parameter to builtin functions that will get passed a repository
variable.
This commit uses UNUSED on most of the builtin functions, as subsequent
commits will modify the actual builtins to pass the repository parameter
down.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
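The shape of the change in this series can be sketched as follows. The
struct definition, the SKETCH_UNUSED macro and both command functions
are hypothetical stand-ins written for illustration; Git's own UNUSED
macro and `struct repository` are not reproduced here.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for an "unused parameter" annotation (GCC/Clang assumed). */
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNUSED __attribute__((unused))
#else
#define SKETCH_UNUSED
#endif

/* Hypothetical, minimal repository type for the sketch. */
struct repository {
	const char *gitdir;
};

/* Helper that takes the repository explicitly instead of a global. */
const char *repo_gitdir_or_null(const struct repository *repo)
{
	return repo ? repo->gitdir : NULL;
}

/* A builtin that does not need a repository marks the parameter unused. */
int cmd_noop_sketch(int argc SKETCH_UNUSED, const char **argv SKETCH_UNUSED,
		    const char *prefix SKETCH_UNUSED,
		    struct repository *repo SKETCH_UNUSED)
{
	return 0;
}

/* A builtin that needs one passes the parameter down to its helpers. */
int cmd_gitdir_sketch(int argc SKETCH_UNUSED, const char **argv SKETCH_UNUSED,
		      const char *prefix SKETCH_UNUSED,
		      struct repository *repo)
{
	const char *dir = repo_gitdir_or_null(repo);

	if (!dir) {
		fprintf(stderr, "not in a repository\n");
		return 1;
	}
	printf("%s\n", dir);
	return 0;
}
```

Because the repository arrives as an argument, a helper such as
repo_gitdir_or_null() has no reason to reach for a global variable.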
"git cat-file" works well with the sparse-index, and gets marked as
such.
* kl/cat-file-on-sparse-index:
builtin/cat-file: mark 'git cat-file' sparse-index compatible
t1092: allow run_on_* functions to use standard input
One-line messages to "die" and other helper functions will get LF
added by these helper functions, but many existing messages had an
unnecessary LF at the end, which have been corrected.
* jk/messages-with-excess-lf-fix:
drop trailing newline from warning/error/die messages
A data corruption bug when multi-pack-index is used and the same
objects are stored in multiple packfiles has been corrected.
* tb/multi-pack-reuse-fix:
builtin/pack-objects.c: do not open-code `MAX_PACK_OBJECT_HEADER`
pack-bitmap.c: avoid repeated `pack_pos_to_offset()` during reuse
builtin/pack-objects.c: translate bit positions during pack-reuse
pack-bitmap: tag bitmapped packs with their corresponding MIDX
t/t5332-multi-pack-reuse.sh: verify pack generation with --strict
"git bundle unbundle" outside a repository triggered a BUG()
unnecessarily, which has been corrected.
* ps/bundle-outside-repo-fix:
bundle: default to SHA1 when reading bundle headers
builtin/bundle: have unbundle check for repo before opening its bundle
The patch parser in "git patch-id" has been tightened to avoid
getting confused by lines that look like a patch header in the log
message.
cf. <Zqh2T_2RLt0SeKF7@tanuki>
* jc/patch-id:
patch-id: tighten code to detect the patch header
patch-id: rewrite code that detects the beginning of a patch
patch-id: make get_one_patchid() more extensible
patch-id: call flush_current_id() only when needed
t4204: patch-id supports various input format
Stop storing the "core.notesRef" config value globally. Instead,
retrieve the value in `default_notes_ref()`. The code is never called in
a hot loop anyway, so doing this on every invocation should be perfectly
fine.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Same as the preceding commits, storing the "core.warnAmbiguousRefs"
value globally is misdesigned as this setting may be set per repository.
Move the logic into the repo-settings subsystem. The usual pattern here
is that users are expected to call `prepare_repo_settings()` before they
access the settings themselves. This seems somewhat fragile though, as
it is easy to miss and leads to somewhat ugly code patterns at the call
sites.
Instead, introduce a new function that encapsulates this logic for us.
This also allows us to change how exactly the lazy initialization works
in the future, e.g. by only partially initializing values as requested
by the caller.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
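The accessor pattern described above, where the getter performs the lazy
initialization so that call sites cannot forget it, can be sketched like
this. All type and function names are hypothetical, and reading the
configuration is simulated by a plain struct field.

```c
/* Hypothetical per-repository settings cache. */
struct repo_settings_sketch {
	int initialized;
	int warn_ambiguous_refs;
};

/* Hypothetical repository with a simulated configuration source. */
struct repository_sketch {
	struct repo_settings_sketch settings;
	int config_warn_ambiguous_refs;	/* stands in for the config file */
	int config_reads;		/* counts simulated config parses */
};

/* Populate the cache once; later calls return immediately. */
static void prepare_settings(struct repository_sketch *r)
{
	if (r->settings.initialized)
		return;
	r->settings.warn_ambiguous_refs = r->config_warn_ambiguous_refs;
	r->config_reads++;
	r->settings.initialized = 1;
}

/* Accessor that encapsulates the "prepare, then read" sequence. */
int settings_get_warn_ambiguous_refs(struct repository_sketch *r)
{
	prepare_settings(r);
	return r->settings.warn_ambiguous_refs;
}
```

Two repositories handled in the same process each get their own value,
and the simulated configuration is parsed once per repository.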
The value of "core.logAllRefUpdates" is being stored in the global
variable `log_all_ref_updates`. This design is somewhat aged nowadays,
where it is entirely possible to access multiple repositories in the
same process which all have different values for this setting. So using
a single global variable to track it is plain wrong.
Remove the global variable. Instead, we now provide a new function as
part of the repo-settings subsystem that parses the value for a specific
repository. While that may require us to read the value multiple times,
we work around this by reading it once when the ref backends are set up
and caching the value there.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In refs-related code we modify the global `log_all_ref_updates`
variable, which is done because `should_autocreate_reflog()` does not
accept passing an `enum log_refs_config` but instead accesses the global
variable. Adapt its interface such that the value is provided by the
caller, which allows us to compute the proper value locally without
having to modify global state.
This change requires us to move the enum to "repo-settings.h", or
otherwise we get compilation errors due to include cycles. We're about
to fully move this setting into the repo-settings subsystem anyway, so
this is fine.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `get_git_work_tree()` function retrieves the path of the work tree
of `the_repository`. Make it accept a `struct repository` such that it
can work on arbitrary repositories and make it part of the repository
subsystem. This reduces our reliance on `the_repository` and clarifies
scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `get_graft_file()` function retrieves the path to the graft file of
`the_repository`. Make it accept a `struct repository` such that it can
work on arbitrary repositories and make it part of the repository
subsystem. This reduces our reliance on `the_repository` and clarifies
scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `get_index_file()` function retrieves the path to the index file
of `the_repository`. Make it accept a `struct repository` such that it
can work on arbitrary repositories and make it part of the repository
subsystem. This reduces our reliance on `the_repository` and clarifies
scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `get_object_directory()` function retrieves the path to the object
directory for `the_repository`. Make it accept a `struct repository`
such that it can work on arbitrary repositories and make it part of the
repository subsystem. This reduces our reliance on `the_repository` and
clarifies scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `get_git_common_dir()` function retrieves the path to the common
directory for `the_repository`. Make it accept a `struct repository`
such that it can work on arbitrary repositories and make it part of the
repository subsystem. This reduces our reliance on `the_repository` and
clarifies scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `get_git_dir()` function retrieves the path to the Git directory for
`the_repository`. Make it accept a `struct repository` such that it can
work on arbitrary repositories and make it part of the repository
subsystem. This reduces our reliance on `the_repository` and clarifies
scope.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After using the ref-filter API, callers should use ref_filter_clear() to
free any used memory. However, there's not a matching function to clear
the ref_format struct.
Traditionally this did not need to be cleaned up, as it was just a way
for the caller to store and pass format options as a single unit. Even
though the parsing step of some placeholders may allocate data, that's
usually inside their "used_atom" structs, which are part of the
ref_filter itself.
But a few placeholders keep data outside of there. The %(ahead-behind)
and %(is-base) parsers both keep a master list of bases, because they
perform a single filtering pass outside of the use of any particular
atom. And since the format parser does not have access to the ref_filter
struct, they store their cross-atom data in the ref_format struct
itself.
And thus when they are finished, the ref_format also needs to be cleaned
up. So let's add a function to do so, and call it from all of the users
of the ref-filter API.
The %(is-base) case is found by running LSan on t6300. After this patch,
the script can now be marked leak-free.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make our codebase compilable with the -Werror=unused-parameter
option.
* jk/unused-parameters:
CodingGuidelines: mention -Wunused-parameter and UNUSED
config.mak.dev: enable -Wunused-parameter by default
compat: mark unused parameters in win32/mingw functions
compat: disable -Wunused-parameter in win32/headless.c
compat: disable -Wunused-parameter in 3rd-party code
t-reftable-readwrite: mark unused parameter in callback function
gc: mark unused config parameter in virtual functions
When git-interpret-trailers is used to add a trailer to a message that
does not end in a trailing newline, the new trailer is added on the line
immediately following the message instead of as a trailer block
separated from the message by a blank line.
For example, if a message's text was exactly "The subject" with no
trailing newline present, `git interpret-trailers --trailer
my-trailer=true` will result in the following malformed commit message:
The subject
my-trailer: true
While it is generally expected that a commit message should end with a
newline character, git-interpret-trailers should not be returning an
invalid message in this case.
Use `strbuf_complete_line` to ensure that the message ends with a
newline character when reading the input.
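A minimal standalone sketch of the idea, using plain C strings rather than
Git's strbuf. `demo_complete_line()` and `demo_add_trailer()` are
hypothetical helpers; the latter always starts a new trailer block, which is
a simplification.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of `msg` that ends in '\n' unless it is
 * empty. The caller frees the result.
 */
char *demo_complete_line(const char *msg)
{
	size_t len = strlen(msg);
	int need_nl = len > 0 && msg[len - 1] != '\n';
	char *out = malloc(len + need_nl + 1);

	if (!out)
		return NULL;
	memcpy(out, msg, len);
	if (need_nl)
		out[len++] = '\n';
	out[len] = '\0';
	return out;
}

/* Append a blank line and then the trailer to the completed message. */
char *demo_add_trailer(const char *msg, const char *trailer)
{
	char *body = demo_complete_line(msg);
	size_t blen, tlen = strlen(trailer);
	char *out;

	if (!body)
		return NULL;
	blen = strlen(body);
	out = malloc(blen + tlen + 3);
	if (out) {
		memcpy(out, body, blen);
		out[blen] = '\n';	/* blank separator line */
		memcpy(out + blen + 1, trailer, tlen);
		out[blen + 1 + tlen] = '\n';
		out[blen + 2 + tlen] = '\0';
	}
	free(body);
	return out;
}
```

Because the message is completed first, "The subject" with and without a
trailing newline produce the same result.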
Signed-off-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When updating sparse patterns, we open a lock_file to write out the new
data. The lock_file struct holds the file descriptor, but we call
fdopen() to get a stdio handle to do the actual write.
After we finish writing, we fflush() so that all of the data is on disk,
and then call commit_lock_file() which closes the descriptor. But we
never fclose() the stdio handle, leaking it.
The obvious solution seems like it would be to just call fclose(). But
when? If we do it before commit_lock_file(), then the lock_file code is
left thinking it owns the now-closed file descriptor, and will do an
extra close() on the descriptor. But if we do it after, we have the
opposite problem: the lock_file code will close the descriptor, and
fclose() will do the extra close().
We can handle this correctly by using fdopen_lock_file(). That leaves
ownership of the stdio handle with the lock_file, which knows not to
double-close it.
We do have to adjust the code a bit:
- we have to handle errors ourselves; we can just die(), since that's
what xfdopen() would have done (and we can even provide a more
specific error message).
- we no longer need to call fflush(); committing the lock-file
auto-closes it, which will now do the flush for us. As a bonus, this
will actually check that the flush was successful before renaming
the file into place.
- we can get rid of the local "fd" variable, since we never look at it
ourselves now
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a new "sparse-checkout" file, we do the usual strategy of
writing to a lockfile and committing it into place. But we don't check
the outcome of commit_lock_file(). Failing there would prevent us from
writing a bogus file (good), but we would ignore the error and return a
successful exit code (bad).
Fix this by calling die(). Note that we need to keep the sparse_filename
variable valid for longer, since the filename stored in the lock_file
struct will be dropped when we run commit_lock_file().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In write_patterns_and_update(), we always need to free the pattern list
before exiting the function. Rather than handling it manually when we
return early, we can jump to an "out" label where cleanup happens. This
lets us drop one line, but also establishes a pattern we can use for
other cleanup.
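The cleanup pattern looks roughly like the sketch below. The
`demo_pattern_list` type and the `fail_early` switch are hypothetical and
only stand in for the real early-return conditions.

```c
#include <stdlib.h>

/* Hypothetical pattern list; only the ownership matters here. */
struct demo_pattern_list {
	char **items;
	size_t nr;
};

void demo_clear_patterns(struct demo_pattern_list *pl)
{
	for (size_t i = 0; i < pl->nr; i++)
		free(pl->items[i]);
	free(pl->items);
	pl->items = NULL;
	pl->nr = 0;
}

/*
 * Every exit path funnels through "out", so the list is released once
 * no matter where the function stops.
 */
int demo_write_patterns(struct demo_pattern_list *pl, int fail_early)
{
	int ret = -1;

	if (fail_early)
		goto out;	/* early exit without a hand-written free */
	if (!pl->nr)
		goto out;
	ret = 0;
out:
	demo_clear_patterns(pl);
	return ret;
}
```

Any resource acquired later in the function can be released under the same
label without touching the individual exit points.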
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our error reporting routines append a trailing newline, and the strings
we pass to them should not include them (otherwise we get an extra blank
line after the message).
These cases were all found by looking at the results of:
git grep -P '[^_](error|error_errno|warning|die|die_errno)\(.*\\n"[,)]' '*.c'
Note that we _do_ sometimes include a newline in the middle of such
messages, to create multiline output (hence our grep matching "," or ")"
after we see the newline, so we know we're at the end of the string).
It's possible that one or more of these cases could intentionally be
including a blank line at the end, but having looked at them all
manually, I think these are all just mistakes.
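To see why the extra newline matters, here is a small sketch of a reporter
that appends the newline itself. `demo_format_error()` and
`demo_has_trailing_newline()` are hypothetical helpers, not Git's routines.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical reporter in the style described above: it adds the
 * "error: " prefix and the terminating newline itself.
 */
void demo_format_error(char *out, size_t outlen, const char *msg)
{
	snprintf(out, outlen, "error: %s\n", msg);
}

/* True if a message ends in '\n', which would print a blank line. */
int demo_has_trailing_newline(const char *msg)
{
	size_t len = strlen(msg);

	return len > 0 && msg[len - 1] == '\n';
}
```

A newline in the middle of a message is left alone by the check, which
matches the multiline case mentioned above.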
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The list of packs to keep is populated via a command line option but
never free'd. Plug this memory leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix leaking input and output buffers in git-fmt-merge-msg(1).
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even when `get_oid_with_context()` fails it may have allocated some data
in the object context. But we do not release it in git-grep(1) when the
call fails, leading to a memory leak. Plug it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--keep-pack` option of git-pack-objects(1) populates the arguments
into a string list. And while the list is marked as `NODUP` and thus
won't duplicate the strings, the list entries themselves still need to
be free'd. We don't though, causing a leak.
Plug it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `repack_promisor_objects()` we read output from git-pack-objects(1)
line by line, using `strbuf_getline_lf()`. We never free the line
buffer, causing a memory leak. Plug it.
This leak is being hit in t5616, but plugging it alone is not
sufficient to make the whole test suite leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interfaces to retrieve signing keys and their IDs are misdesigned as
they return string constants even though they indeed allocate memory,
which leads to memory leaks. Refactor the code to instead always return
allocated strings and let the callers free them accordingly.
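The resulting convention can be sketched like this. `demo_get_signing_key()`
is a hypothetical stand-in; the point is only that the return type is a
plain `char *` and every path hands back allocated memory.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical lookup. Returning plain `char *` tells the caller that
 * the result is always allocated and must be passed to free().
 */
char *demo_get_signing_key(const char *configured, const char *fallback)
{
	const char *src = configured ? configured : fallback;
	char *copy = malloc(strlen(src) + 1);

	if (copy)
		strcpy(copy, src);
	return copy;
}
```

Callers no longer have to guess whether a given return value is a constant
or an allocation; they free it unconditionally.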
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the push-check subcommand of the submodule helper we acquire a list
of local refs, but never free that list. Fix this memory leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When appending a refspec via `refspec_append_mapped()` we leak the
result of `query_refspecs()`. The overall logic around refspec queries
is quite weird, as callers are expected to either set the `src` or `dst`
pointers, and then the (allocated) result will be in the respective
other struct member.
As we have the `src` member set, plugging the memory leak is thus as
easy as just freeing the `dst` member. While at it, use designated
initializers to initialize the structure.
This leak was exposed by t5516, but fixing it is not sufficient to make
the whole test suite leak free.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change affects how 'git cat-file' works with the index when
specifying an object with the ":<path>" syntax (which will give file
contents from the index).
'git cat-file' expands a sparse index to a full index any time contents
are requested from the index by specifying an object with the ":<path>"
syntax. This is true even when the requested file is part of the sparse
index, and results in much slower 'git cat-file' operations when working
within the sparse index.
Mark 'git cat-file' as not needing a full index, so that you only pay
the cost of expanding the sparse index to a full index when you request
a file outside of the sparse index.
Add tests to ensure both that:
- 'git cat-file' returns the correct file contents whether or not the
file is in the sparse index
- 'git cat-file' expands to the full index any time you request
something outside of the sparse index
Signed-off-by: Kevin Lyles <klyles+github@epic.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was reported that git-verify-pack(1) has started to crash with Git
v2.46.0 when run outside of a repository. This is another fallout from
c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), where we have stopped setting the default hash algorithm
for `the_repository`. Consequently, code that relies on `the_hash_algo`
will now crash when it hasn't explicitly been initialized, which may be
the case when running outside of a Git repository.
The crash is not in git-verify-pack(1) but instead in git-index-pack(1),
which gets called by the former. Ideally, both of these programs should
be able to identify the hash algorithm used by the packfile and index
without having to rely on external information. But unfortunately, the
format for neither of them is completely self-describing, so it is not
possible to derive that information. This is a design issue that we
should address by introducing a new packfile version that encodes its
object hash.
For now though the more important fix is to not make either of these
programs crash anymore, which we do by falling back to SHA1 when the
object hash is unconfigured. This pessimizes reading packfiles which
use a different hash than SHA1, but restores previous behaviour.
Reported-by: Ilya K <me@0upti.me>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If "git rebase" fails to start after stashing the user's uncommitted
changes then it forgets to restore the stashed changes and remove the
state directory. To make matters worse, running "git rebase --abort" to
apply the stashed changes and clean up the state directory fails because
the state directory only contains the "autostash" file and is missing
the "head-name" and "onto" files required by read_basic_state().
Fix this by applying the autostash and removing the state directory if
the pre-rebase hook or initial checkout fails. This matches what
finish_rebase() does at the end of a successful rebase. If the user
modifies any files after the autostash is created it is possible there
will be conflicts when the autostash is applied. In that case
apply_autostash() saves the stash in a new entry under refs/stash and so
it is safe to remove the state directory containing the autostash file.
New tests are added to check the autostash is applied and the state
directory is removed if the rebase fails to start. Checks are also added
to some existing tests in order to ensure there is no state directory
left behind when a rebase fails to start and no autostash has been
created.
Reported-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
More trace2 events at key points on push and fetch code paths have
been added.
* js/fetch-push-trace2-annotation:
send-pack: add new tracing regions for push
fetch: add top-level trace2 regions
trace2: implement trace2_printf() for event target
The "opts" parameter is always used, so marking it with MAYBE_UNUSED is
just confusing.
This annotation goes back to 41abfe15d9 (maintenance: add pack-refs
task, 2021-02-09), when it really was unused. Back then we did not have
the UNUSED macro that would complain if the code changed to use the
parameter. So when we started using it in bfc2f9eb8e (builtin/gc:
forward git-gc(1)'s `--auto` flag when packing refs, 2024-03-25), nobody
noticed.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The underlying machinery for "git diff-index" has long been made to
expand the sparse index as needed, but the command fully expanded
the sparse index upfront; it has now been taught not to do so.
* ds/sparse-diff-index:
diff-index: integrate with the sparse index
Commit d1ae15d68b (builtin/gc: refactor to read config into structure,
2024-08-16) added a new parameter to the maintenance_task virtual
functions, but most of them don't need to look at it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git check-mailmap command reads the mailmap from the default
.mailmap location and then from the mailmap.blob and mailmap.file
configurations.
A following change to git send-email will want to support new
configuration options based on the configured identity. The
identity-based configuration and options only make sense in the context
of git send-email.
Expose the read_mailmap_file and read_mailmap_blob functions from
mailmap.c. Teach git check-mailmap the --mailmap-file and
--mailmap-blob options which load the additional mailmap sources.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git check-mailmap splits each provided contact using split_ident_line.
This function requires that the contact either be of the form "Name
<user@host>" or of the form "<user@host>". In particular, if the mail
portion of the contact is not surrounded by angle brackets,
split_ident_line will reject it.
This results in git check-mailmap rejecting attempts to translate simple
email addresses:
$ git check-mailmap user@host
fatal: unable to parse contact: user@host
This limits the usability of check-mailmap as it requires placing angle
brackets around plain email addresses.
In particular, attempting to use git check-mailmap to support mapping
addresses in git send-email is not straightforward. The sanitization
and validation functions in git send-email strip angle brackets from
plain email addresses. It is not trivial to add brackets prior to
invoking git check-mailmap.
Instead, modify check_mailmap() to allow such strings as contacts. In
particular, treat any line which cannot be split by split_ident_line as
a simple email address.
No attempt is made to actually parse the address line, or validate that
it is actually an email address. Implementing such validation is not
trivial. Besides, we weren't validating the address between angle
brackets before anyway.
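The fallback can be sketched as follows. `demo_split_ident()` is a
hypothetical and much simplified stand-in for split_ident_line, and
`demo_parse_contact()` only illustrates the "treat it as a bare address"
rule.

```c
#include <string.h>

/* Hypothetical result: pointers into the input, not NUL-terminated. */
struct demo_contact {
	const char *name;
	size_t name_len;
	const char *mail;
	size_t mail_len;
};

/* Accept "Name <mail>" or "<mail>"; return -1 for anything else. */
int demo_split_ident(const char *s, struct demo_contact *c)
{
	const char *lt = strchr(s, '<');
	const char *gt = lt ? strchr(lt, '>') : NULL;

	if (!lt || !gt)
		return -1;
	c->name = s;
	c->name_len = (size_t)(lt - s);
	while (c->name_len && c->name[c->name_len - 1] == ' ')
		c->name_len--;
	c->mail = lt + 1;
	c->mail_len = (size_t)(gt - (lt + 1));
	return 0;
}

/* Fall back to treating an unsplittable contact as a bare address. */
void demo_parse_contact(const char *s, struct demo_contact *c)
{
	if (demo_split_ident(s, c) < 0) {
		c->name = NULL;
		c->name_len = 0;
		c->mail = s;
		c->mail_len = strlen(s);
	}
}
```

As in the change itself, nothing here checks that the fallback string is a
valid address.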
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `write_reused_pack_one()` defines a header to store the
OFS_DELTA header, but uses the constant "10" instead of
"MAX_PACK_OBJECT_HEADER" (as is done elsewhere in the same patch, circa
bb514de356 (pack-objects: improve partial packfile reuse, 2019-12-18)).
Declare the `ofs_header` field to be sized according to
`MAX_PACK_OBJECT_HEADER` (which is 10, as defined in "pack.h") instead
of the constant 10.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reusing chunks verbatim from an existing source pack, the function
write_reused_pack() first attempts to reuse whole words (via the
function `write_reused_pack_verbatim()`), and then individual bits (via
`write_reused_pack_one()`).
In the non-MIDX case, all of this code works fine. Likewise, in the MIDX
case, processing bits individually from the first (preferred) pack works
fine. However, processing subsequent packs in the MIDX case is broken
when there are duplicate objects among the set of MIDX'd packs.
This is because we treat the individual bit positions as valid pack
positions within the source pack(s), which does not account for gaps in
the source pack, like we see when the MIDX must break ties between
duplicate objects which appear in multiple packs.
The broken code looks like:
for (; i < reuse_packfile_bitmap->word_alloc; i++) {
	for (offset = 0; offset < BITS_IN_EWORD; offset++) {
		/* ... */
		write_reused_pack_one(reuse_packfile->p,
				      pos + offset - reuse_packfile->bitmap_pos,
				      f, pack_start, &w_curs);
	}
}
, where the second argument is incorrect and does not account for gaps.
Instead, make sure that we translate bit positions in the MIDX's
pseudo-pack order to pack positions in the respective source packs by:
- Translating the bit position (pseudo-pack order) to a MIDX position
(lexical order).
- Using the MIDX position to obtain the offset at which the given object
occurs in the source pack.
- Translating that offset back into a pack-relative position within
the source pack by calling offset_to_pack_pos().
After doing this, we can safely use the result as a pack position.
Note that when doing single-pack reuse, as well as reusing objects from
the MIDX's preferred pack, such translation is not necessary, since
either ties are broken in favor of the preferred pack, or there are no
ties to break at all (in the case of non-MIDX bitmaps).
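The translation steps above can be modelled with a toy example. Everything
here is an assumption made for illustration: the `demo_` structs, the flat
lookup arrays, and the restriction to a single source pack do not reflect
Git's actual data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model: every name and array here is an assumption. */
struct demo_midx {
	const uint32_t *pseudo_to_midx;	/* bit position -> MIDX position */
	const uint64_t *midx_offset;	/* MIDX position -> offset in pack */
};

struct demo_pack {
	const uint64_t *sorted_offsets;	/* object offsets, ascending */
	size_t nr;
};

/* Binary search: offset of an object -> its position within the pack. */
int demo_offset_to_pack_pos(const struct demo_pack *p, uint64_t ofs,
			    uint32_t *pos)
{
	size_t lo = 0, hi = p->nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (p->sorted_offsets[mid] == ofs) {
			*pos = (uint32_t)mid;
			return 0;
		}
		if (p->sorted_offsets[mid] < ofs)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}

/* bit position -> MIDX position -> pack offset -> pack position */
int demo_bit_to_pack_pos(const struct demo_midx *m, const struct demo_pack *p,
			 uint32_t bit_pos, uint32_t *pack_pos)
{
	uint32_t midx_pos = m->pseudo_to_midx[bit_pos];
	uint64_t ofs = m->midx_offset[midx_pos];

	return demo_offset_to_pack_pos(p, ofs, pack_pos);
}
```

If a pack holds objects at offsets 12, 40 and 90 but the object at 40 was
selected from another pack, the pack contributes only two bits. Subtracting
the pack's first bit position from the second bit would yield position 1,
while the translation yields position 2, the object at offset 90.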
Failing to do this can result in strange failure modes. One example that
can occur when misinterpreting bits in the above fashion is that Git
thinks it's supposed to send a delta that the caller does not want.
Under this (incorrect) assumption, we try to look up the delta's base
(so that we can patch any OFS_DELTAs if necessary). We do this using
find_reused_offset().
But if we try and call that function for an offset belonging to an
object we did not send, we'll get back garbage. This can result in us
computing a negative fixup value, which results in memory corruption
when trying to write the (patched) OFS_DELTA header.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The synopsis says --regexp=<regexp> but the --regexp option is a
Boolean that says "the name given is not literal, but a pattern to
match the name".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark unused parameters as UNUSED to squelch -Wunused warnings.
* jk/mark-unused-parameters:
t-hashmap: stop calling setup() for t_intern() test
scalar: mark unused parameters in dummy function
daemon: mark unused parameters in non-posix fallbacks
setup: mark unused parameter in config callback
test-mergesort: mark unused parameters in trivial callback
t-hashmap: mark unused parameters in callback function
reftable: mark unused parameters in virtual functions
reftable: drop obsolete test function declarations
reftable: ignore unused argc/argv in test functions
unit-tests: ignore unused argc/argv
t/helper: mark more unused argv/argc arguments
oss-fuzz: mark unused argv/argc argument
refs: mark unused parameters in do_for_each_reflog_helper()
refs: mark unused parameters in ref_store fsck callbacks
update-ref: mark more unused parameters in parser callbacks
imap-send: mark unused parameter in ssl_socket_connect() fallback
We created a useless pseudo-merge reachability bitmap that is about
0 commits, and attempted to include commits that are not in packs,
which made no sense. These bugs have been corrected.
* tb/pseudo-merge-bitmap-fixes:
pseudo-merge.c: ensure pseudo-merge groups are closed
pseudo-merge.c: do not generate empty pseudo-merge commits
t/t5333-pseudo-merge-bitmaps.sh: demonstrate empty pseudo-merge groups
pack-bitmap-write.c: select pseudo-merges even for small bitmaps
pack-bitmap: drop redundant args from `bitmap_writer_finish()`
pack-bitmap: drop redundant args from `bitmap_writer_build()`
pack-bitmap: drop redundant args from `bitmap_writer_build_type_index()`
pack-bitmap: initialize `bitmap_writer_init()` with packing_data
Tests for "git maintenance" that were broken on Windows have been
corrected.
* ps/maintenance-detach-fix-more:
builtin/maintenance: fix loose objects task emitting pack hash
t7900: exercise detaching via trace2 regions
t7900: fix flaky test due to leaking background job
Maintenance tasks other than "gc" now properly go background when
"git maintenance" runs them.
* ps/maintenance-detach-fix:
run-command: fix detaching when running auto maintenance
builtin/maintenance: add a `--detach` flag
builtin/gc: add a `--detach` flag
builtin/gc: stop processing log file on signal
builtin/gc: fix leaking config values
builtin/gc: refactor to read config into structure
config: fix constness of out parameter for `git_config_get_expiry()`
"git rev-list ... | git diff-tree -p --remerge-diff --stdin" should
behave more or less like "git log -p --remerge-diff" but instead it
crashed, forgetting to prepare a temporary object store needed.
* xx/diff-tree-remerge-diff-fix:
diff-tree: fix crash when used with --remerge-diff
A flakey test and incorrect calls to strtoX() functions have been
fixed.
* kl/test-fixes:
t6421: fix test to work when repo dir contains d0
set errno=0 before strtoX calls
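For context on the strtoX() fix, the usual pattern with the standard C
library looks like the sketch below. `demo_parse_long()` is a hypothetical
helper, not a function from the topic.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a base-10 long. errno is cleared first because strtol() sets it
 * on failure but does not reset a stale value left by an earlier call.
 */
int demo_parse_long(const char *s, long *out)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(s, &end, 10);
	if (errno || end == s || *end != '\0')
		return -1;
	*out = v;
	return 0;
}
```

Without the reset, a leftover ERANGE from unrelated code could make a valid
input look like an overflow.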
A recent update broke "git ls-remote" used outside a repository,
which has been corrected.
* ps/ls-remote-out-of-repo-fix:
builtin/ls-remote: fall back to SHA1 outside of a repo
Use of API functions that implicitly depend on the_repository
object in the config subsystem has been rewritten to pass a
repository object through the callchain.
* ps/config-wo-the-repository:
config: hide functions using `the_repository` by default
global: prepare for hiding away repo-less config functions
config: don't depend on `the_repository` with branch conditions
config: don't have setters depend on `the_repository`
config: pass repo to functions that rename or copy sections
config: pass repo to `git_die_config()`
config: pass repo to `git_config_get_expiry_in_days()`
config: pass repo to `git_config_get_expiry()`
config: pass repo to `git_config_get_max_percent_split_change()`
config: pass repo to `git_config_get_split_index()`
config: pass repo to `git_config_get_index_threads()`
config: expose `repo_config_clear()`
config: introduce missing setters that take repo as parameter
path: hide functions using `the_repository` by default
path: stop relying on `the_repository` in `worktree_git_path()`
path: stop relying on `the_repository` when reporting garbage
hooks: remove implicit dependency on `the_repository`
editor: do not rely on `the_repository` for interactive edits
path: expose `do_git_common_path()` as `repo_common_pathv()`
path: expose `do_git_path()` as `repo_git_pathv()`
More leak fixes.
* ps/leakfixes-part-4: (22 commits)
builtin/diff: free symmetric diff members
diff: free state populated via options
builtin/log: fix leak when showing converted blob contents
userdiff: fix leaking memory for configured diff drivers
builtin/format-patch: fix various trivial memory leaks
diff: fix leak when parsing invalid ignore regex option
unpack-trees: clear index when not propagating it
sequencer: release todo list on error paths
merge-ort: unconditionally release attributes index
builtin/fast-export: plug leaking tag names
builtin/fast-export: fix leaking diff options
builtin/fast-import: plug trivial memory leaks
builtin/notes: fix leaking `struct notes_tree` when merging notes
builtin/rebase: fix leaking `commit.gpgsign` value
config: fix leaking comment character config
submodule-config: fix leaking name entry when traversing submodules
read-cache: fix leaking hashfile when writing index fails
bulk-checkin: fix leaking state TODO
object-name: fix leaking symlink paths in object context
object-file: fix memory leak when reading corrupted headers
...
At $DAYJOB we experienced some slow fetch operations and needed some
additional data to help diagnose the issue.
Add top-level trace2 regions for the various modes of operation of
`git-fetch`. None of these regions are in recursive code, so any
enclosed trace messages should only see their nesting level increase by
one.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse index allows focusing the index data structure on the files
present in the sparse-checkout, leaving only tree entries for
directories not within the sparse-checkout. Each builtin needs a
repository setting to indicate that it has been tested with the sparse
index before Git will allow the index to be loaded into memory in its
sparse form. This is a safety precaution.
There are still some builtins that haven't been integrated due to the
complexity of the integration and the lack of significant use. However,
'git diff-index' was neglected only because of initial data showing low
usage. The diff machinery was already integrated and there is no more
work to be done there but add some tests to be sure 'git diff-index'
behaves as expected.
For this purpose, we can follow the testing pattern used in 51ba65b5c3
(diff: enable and test the sparse index, 2021-12-06). One difference
here is that we only verify that the sparse index case agrees with the
full index case, but do not generate the expected output. The 'git diff'
tests use the '--name-status' option to ease the creation of the
expected output, but that's not an option for 'diff-index'. Since the
underlying diff machinery is the same, a simple comparison is sufficient
to give some coverage.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `fetch_refs_from_bundle()` we assemble a vector of arguments to pass
to `unbundle()`, but never free it. And in theory we wouldn't have to
because `unbundle()` already knows to free the vector for us. But it
fails to do so when it exits early due to `verify_bundle()` failing.
The calling convention that the arguments are freed by the callee and
not the caller feels somewhat weird. Refactor the code such that it is
instead the responsibility of the caller to free the vector, adapting
the only two callsites where we pass extra arguments. This also fixes
the memory leak.
This memory leak gets hit in t5510, but fixing it isn't sufficient to
make the whole test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the `--atomic` flag, we use a single ref transaction to commit all
ref updates in git-fetch(1). The lifetime of transactions is somewhat
weird: while `ref_transaction_abort()` will free the transaction, a call
to `ref_transaction_commit()` won't. We thus have to manually free the
transaction in the successful case.
Adapt the code to free the transaction in the exit path to plug the
resulting memory leak. As `ref_transaction_abort()` already freed the
transaction for us, we have to unset the transaction when we hit that
code path to not cause a double free.
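The lifetime rule can be sketched with a hypothetical transaction type. The
`demo_` functions only mimic the asymmetry described above (abort frees,
commit does not), and `demo_freed` exists purely to observe the frees.

```c
#include <stdlib.h>

/* Hypothetical transaction; `demo_freed` only exists to count frees. */
struct demo_transaction {
	int nr_updates;
};
int demo_freed;

void demo_transaction_free(struct demo_transaction *t)
{
	if (!t)
		return;
	free(t);
	demo_freed++;
}

/* Mirrors the asymmetry described above: abort frees, commit does not. */
void demo_transaction_abort(struct demo_transaction *t)
{
	demo_transaction_free(t);
}

int demo_transaction_commit(struct demo_transaction *t)
{
	return t->nr_updates >= 0 ? 0 : -1;
}

int demo_fetch(int nr_updates)
{
	struct demo_transaction *t = calloc(1, sizeof(*t));
	int ret;

	if (!t)
		return -1;
	t->nr_updates = nr_updates;
	ret = demo_transaction_commit(t);
	if (ret) {
		demo_transaction_abort(t);
		t = NULL;	/* already freed; avoid a double free below */
	}
	demo_transaction_free(t);	/* single exit path; no-op on NULL */
	return ret;
}
```

Each call frees the transaction exactly once, whether the commit succeeds
or the abort path is taken.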
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We build several ref lists in git-fetch-pack(1), but never free them.
Fix those leaks.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We never free data associated with the assembled refspec in
git-send-pack(1), causing a memory leak. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing an MIDX in git-repack(1) we first collect all the pack
names that we want to add to it in a string list. This list is marked as
`NODUP`, which indicates that it will neither duplicate nor own strings
added to it. In `write_midx_included_packs()` we then `insert()` strings
via `xstrdup()` or `strbuf_detach()`, but the resulting strings will not
be owned by anything and thus leak.
Fix this issue by marking the list as `DUP` and using a local buffer to
compute the pack names.
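A minimal sketch of the fixed ownership model, with a hypothetical
`demo_string_list` in place of Git's string list. The "pack-%s.idx" name
format is only an example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list that copies every string it is given ("DUP"). */
struct demo_string_list {
	char **items;
	size_t nr;
};

int demo_list_insert_dup(struct demo_string_list *l, const char *s)
{
	char **grown = realloc(l->items, (l->nr + 1) * sizeof(*grown));

	if (!grown)
		return -1;
	l->items = grown;
	grown[l->nr] = malloc(strlen(s) + 1);
	if (!grown[l->nr])
		return -1;
	strcpy(grown[l->nr++], s);
	return 0;
}

/* The list owns its copies, so clearing it releases everything. */
void demo_list_clear(struct demo_string_list *l)
{
	for (size_t i = 0; i < l->nr; i++)
		free(l->items[i]);
	free(l->items);
	l->items = NULL;
	l->nr = 0;
}

/* One reusable local buffer computes each name; the list keeps a copy. */
int demo_collect_pack_names(struct demo_string_list *l,
			    const char *const *hashes, size_t nr)
{
	char buf[128];

	for (size_t i = 0; i < nr; i++) {
		snprintf(buf, sizeof(buf), "pack-%s.idx", hashes[i]);
		if (demo_list_insert_dup(l, buf) < 0)
			return -1;
	}
	return 0;
}
```

Since the list duplicates on insert, the temporary buffer needs no separate
owner and nothing is left dangling after the list is cleared.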
This leak is hit in t5319, but plugging it is not sufficient to make the
whole test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--output" switch is an `OPT_FILENAME()` option, which allocates
memory when specified by the user. But while we free the string when
executed without the "--remote" switch, we don't otherwise because we
return via a separate exit path that doesn't know to free it.
Fix this by creating a common exit path.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git-upload-archive(1), we pass an array of arguments to
`write_archive()` to tell it what exactly to do. We don't ever clear the
vector though, causing a memory leak. Furthermore, the call to
`write_archive()` may cause contents of the array to be modified, which
would cause us to leak the allocated strings held by it.
Fix the issue by having `write_archive()` create a shallow copy of
`argv` before parsing the arguments. Like this, we won't modify the
caller's array and can easily `strvec_clear()` it to plug these memory
leaks.
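The shallow-copy approach can be sketched as below. `demo_parse_args()` is a
hypothetical parser that rearranges its input, standing in for whatever
modification the real argument parsing performs.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser that, like many option parsers, rearranges the
 * array it is given: here it drops every "--flag" entry in place.
 */
int demo_parse_args(int argc, const char **argv)
{
	int kept = 0;

	for (int i = 0; i < argc; i++)
		if (strncmp(argv[i], "--", 2))
			argv[kept++] = argv[i];
	return kept;
}

/* Parse a shallow copy so that the caller's array is left untouched. */
int demo_write_archive(int argc, const char **argv)
{
	const char **copy = malloc(((size_t)argc + 1) * sizeof(*copy));
	int kept;

	if (!copy)
		return -1;
	memcpy(copy, argv, (size_t)argc * sizeof(*copy));
	copy[argc] = NULL;
	kept = demo_parse_args(argc, copy);
	free(copy);	/* frees only the pointer array, not the strings */
	return kept;
}
```

The caller still holds every original pointer afterwards, so it can release
all of its strings in one pass.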
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `-X` switch for git-merge-tree(1) will push each option into a local
`xopts` vector that we then end up parsing. The vector never gets freed
though, causing a memory leak. Plug it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git bundle unbundle" outside a repository triggered a BUG()
unnecessarily, which has been corrected.
* ps/bundle-outside-repo-fix:
bundle: default to SHA1 when reading bundle headers
builtin/bundle: have unbundle check for repo before opening its bundle
The "loose-objects" maintenance task executes git-pack-objects(1) to
pack all loose objects into a new packfile. This command ends up
printing the hash of the packfile to stdout though, which clutters the
output of `git maintenance run`.
Fix this issue by disabling stdout of the child process.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t7900, we exercise the `--detach` logic by checking whether the
command ended up writing anything to its output or not. This supposedly
works because we close stdin, stdout and stderr when daemonizing. But
for one, it breaks on platforms where daemonize is a no-op, like Windows.
And second, the fact that git-maintenance(1) outputs anything at all in
these tests is a bug in the first place that we'll fix in a subsequent
commit.
Introduce a new trace2 region around the detach which allows us to more
explicitly check whether the detaching logic was executed. This is a
much more direct way to exercise the logic, provides a potentially
useful signal to tracing logs and also works alright on platforms which
do not have the ability to daemonize.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
[jc: dropped a stale in-code comment from a test]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a continuation of 44ad082968 (update-ref: mark unused parameter
in parser callbacks, 2023-08-29), as we've grown a few more virtual
functions since then.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fsck" infrastructure has been taught to also check the sanity
of the ref database, in addition to the object database.
* sj/ref-fsck:
fsck: add ref name check for files backend
files-backend: add unified interface for refs scanning
builtin/refs: add verify subcommand
refs: set up ref consistency check infrastructure
fsck: add refs report function
fsck: add a unified interface for reporting fsck messages
fsck: make "fsck_error" callback generic
fsck: rename objects-related fsck error functions
fsck: rename "skiplist" to "skip_oids"
"git notes add -m '' --allow-empty" and friends that take prepared
data to create notes should not invoke an editor, but it started
doing so since Git 2.42, which has been corrected.
* dd/notes-empty-no-edit-by-default:
notes: do not trigger editor when adding an empty note
"git checkout --ours" (no other arguments) complained that the
option is incompatible with branch switching, which is technically
correct, but was found confusing by some users. It now says that the
user needs to give pathspec to specify what paths to checkout.
* jc/checkout-no-op-switch-errors:
checkout: special case error messages during noop switching
It was reported that creating a stash with `--keep-index
--include-untracked` causes an error when HEAD points to a commit whose
tree is empty:
$ git stash push --keep-index --include-untracked
error: pathspec ':/' did not match any file(s) known to git
This error comes from `git checkout --no-overlay $i_tree -- :/`, which
we execute to reset the working tree to the state in our index. As the
tree generated from the index is empty in our case, ':/' does not match
any files and thus causes git-checkout(1) to error out.
Fix the issue by skipping the checkout when the index tree is empty. As
explained in the in-code comment, this should be the correct thing to do
as there is nothing that we'd have to reset in the first place.
Reported-by: Piotr Siupa <piotrsiupa@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the past, we used to execute `git gc --auto` as part of our automatic
housekeeping routines. As git-gc(1) may require quite some time to
perform the housekeeping, it knows to detach itself and run in the
background so that the user can continue their work.
Eventually, we refactored our automatic housekeeping to instead use the
more flexible git-maintenance(1) command. The upside of this new infra
is that the user can configure which maintenance tasks are performed, at
least to a certain degree. So while it continues to run git-gc(1) by
default, it can also be adapted to e.g. use git-multi-pack-index(1) for
maintenance of the object database.
The auto-detach of the new infra is somewhat broken though once the user
configures non-standard tasks. The problem is essentially that we detach
at the wrong level in the process hierarchy: git-maintenance(1) never
detaches itself; instead, it is still git-gc(1) that does.
When configured to only run the git-gc(1) maintenance task, then the
result is basically the same as before. But when configured to run other
tasks, then git-maintenance(1) will wait for these to run to completion.
Even worse, it may be that git-gc(1) runs concurrently with other
housekeeping tasks, stomping on each other's feet.
Fix this bug by asking git-gc(1) to not detach when it is being invoked
via git-maintenance(1). Instead, git-maintenance(1) now respects a new
config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
detaches itself into the background when running as part of our auto
maintenance. This should continue to behave the same for all users who
use only the git-gc(1) task. For others though, it means that we now
properly perform all tasks in the background. The default behaviour of
git-maintenance(1) when executed by the user does not change, it will
remain in the foreground unless they pass the `--detach` option.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Same as the preceding commit, add a `--[no-]detach` flag to the
git-maintenance(1) command. This will be used in a subsequent commit to
fix backgrounding of that command when configured with a non-standard
set of tasks.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running `git gc --auto`, the command will by default detach and
continue running in the background. This behaviour can be tweaked via
the `gc.autoDetach` config, but not via a command line switch. We need
that in a subsequent commit though, where git-maintenance(1) will want
to ask its git-gc(1) child process to not detach anymore.
Add a `--[no-]detach` flag that does this for us.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When detaching, git-gc(1) will redirect its stderr to a "gc.log" log
file, which is then used to surface errors of a backgrounded process to
the user. To ensure that the file is properly managed on abnormal exit
paths, we install both signal and exit handlers that try to either
commit the underlying lock file or roll it back in case there wasn't any
error.
This logic is severely broken when handling signals though, as we end up
calling all kinds of functions that are not signal safe. This includes
malloc(3P) via `git_path()`, fprintf(3P), fflush(3P) and many more
functions. The consequence can be anything, from deadlocks to crashes.
Unfortunately, we cannot really do much about this without a larger
refactoring.
The least-worst thing we can do is to not set up the signal handler in
the first place. This will still cause us to remove the lockfile, as the
underlying tempfile subsystem already knows to unlink locks when
receiving a signal. But it may cause us to remove the lock even in the
case where it would have contained actual errors, which is a change in
behaviour.
The consequence is that "gc.log" will not be committed, and thus
subsequent calls to `git gc --auto` won't bail out because of this.
Arguably though, it is better to retry garbage collection rather than
having the process run into a potentially-corrupted state.
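For contrast with the handler that gets removed here, the following
sketch shows about as much as a handler can do while staying within
async-signal-safe operations: record the signal in a
`volatile sig_atomic_t` flag and return. It is not what git-gc(1) does
after this change (which installs no handler at all), and the names
`note_signal()`, `install_flag_handler()` and `signal_was_seen()` are
made up for illustration.

```c
#include <assert.h>
#include <signal.h>

/* The only state the handler touches. */
static volatile sig_atomic_t got_signal;

/* No malloc, no stdio, no locks: just remember which signal came in. */
static void note_signal(int signo)
{
	got_signal = signo;
}

static int install_flag_handler(int signo)
{
	assert(signo > 0);
	got_signal = 0;
	return signal(signo, note_signal) == SIG_ERR ? -1 : 0;
}

/* Meant to be polled from normal (non-handler) code. */
static int signal_was_seen(void)
{
	return got_signal != 0;
}
```

Any real cleanup, such as writing to a log file, would then happen in
normal code after `signal_was_seen()` returns true.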
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're leaking config values in git-gc(1) when those values are tracked
as strings. Introduce a new `gc_config_release()` function that releases
this memory to plug those leaks and release old values before populating
the config fields via `git_config_string()` et al.
Note that there is one small gotcha here with the "--prune" option. Next
to passing a string, this option also accepts the "--no-prune" option
that overrides the default or configured value. We thus need to discern
between the option not having been passed by the user and the negative
variant of it. This is done by using a simple sentinel value that lets
us discern these cases.
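The sentinel trick can be sketched like this. The struct, the macro and
the function name are hypothetical; only the idea of comparing against
the address of a dedicated sentinel object is taken from the description
above.

```c
#include <assert.h>
#include <stddef.h>

/* Its address marks "option not given"; its contents are never read. */
static const char prune_unset_sentinel[] = "(unset)";

struct prune_opt_sketch {
	/* sentinel, NULL (meaning --no-prune), or the user's value */
	const char *value;
};

#define PRUNE_OPT_SKETCH_INIT { prune_unset_sentinel }

/*
 * Returns the configured value when the option was not given, NULL
 * when the negated form was given, and the user's value otherwise.
 */
static const char *effective_prune(const struct prune_opt_sketch *opt,
				   const char *configured)
{
	assert(opt);
	if (opt->value == prune_unset_sentinel)
		return configured;
	return opt->value;
}
```

Comparing pointers rather than string contents means that no string a
user could type is mistaken for the "not given" state.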
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-gc(1) command knows to read a bunch of config keys to tweak its
own behaviour. The values are parsed into global variables, which makes
it hard to correctly manage the lifecycle of values that may require a
memory allocation.
Refactor the code to use a `struct gc_config` that gets populated and
passed around. For one, this makes previously-implicit dependencies on
these config values clear. Second, it will allow us to properly manage
the lifecycle in the next commit.
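As a rough picture of what such a struct buys us, here is a small
self-contained sketch. The type `gc_config_sketch`, its two fields and
the helper names are illustrative assumptions and do not mirror the real
`struct gc_config`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct gc_config_sketch {
	int detach;          /* plain value, nothing to free */
	char *prune_expire;  /* owned by the struct; may be NULL */
};

#define GC_CONFIG_SKETCH_INIT { 1, NULL }

static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);
	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Replace the stored value, releasing the old one first. */
static int gc_config_sketch_set_prune(struct gc_config_sketch *cfg,
				      const char *value)
{
	char *copy;

	assert(cfg && value);
	copy = dup_string(value);
	if (!copy)
		return -1;
	free(cfg->prune_expire);
	cfg->prune_expire = copy;
	return 0;
}

/* Single place that knows which members own memory. */
static void gc_config_sketch_release(struct gc_config_sketch *cfg)
{
	free(cfg->prune_expire);
	cfg->prune_expire = NULL;
}
```

With globals there is no obvious owner for `prune_expire`; with the
struct, whoever initializes it is also the one who calls the release
function.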
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The type of the out parameter of `git_config_get_expiry()` is a pointer
to a constant string, which creates the impression that ownership of the
returned data wasn't transferred to the caller. This isn't true though
and thus quite misleading.
Adapt the parameter to be of type `char **` and adjust callers
accordingly. While at it, refactor `get_shared_index_expire_date()` to
drop the static `shared_index_expire` variable. It is only used in that
function, and furthermore we would only hit the code where we parse the
expiry date a single time because we already use a static `prepared`
variable to track whether we did parse it.
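To illustrate why the `char **` signature is the more honest one,
consider this sketch. `get_expiry_sketch()` is a made-up stand-in for
the real function; the return convention (0 for found, 1 for not found,
-1 on allocation failure) is likewise an assumption of the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * On success, *out receives a freshly allocated string that the
 * caller must free(). A non-const "char **" makes that transfer of
 * ownership visible in the signature.
 */
static int get_expiry_sketch(const char *configured, char **out)
{
	size_t len;

	assert(out);
	if (!configured)
		return 1; /* not configured: *out is left untouched */
	len = strlen(configured) + 1;
	*out = malloc(len);
	if (!*out)
		return -1;
	memcpy(*out, configured, len);
	return 0;
}
```

A caller would write `char *expiry = NULL;`, pass `&expiry`, and call
`free(expiry)` when done, without having to cast away a `const`.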
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rev-list ... | git diff-tree -p --remerge-diff --stdin" should
behave more or less like "git log -p --remerge-diff" but instead it
crashed, forgetting to prepare a temporary object store needed.
* xx/diff-tree-remerge-diff-fix:
diff-tree: fix crash when used with --remerge-diff
The refs API has been taught to give symref target information to
the users of ref iterators, allowing for-each-ref and friends to
avoid an extra ref_resolve_* API call per symbolic ref.
* jc/refs-symref-referent:
ref-filter: populate symref from iterator
refs: add referent to each_ref_fn
refs: keep track of unresolved reference value in iterators
Support to specify ref backend for submodules has been enhanced.
* ps/submodule-ref-format:
object: fix leaking packfiles when closing object store
submodule: fix leaking seen submodule names
submodule: fix leaking fetch tasks
builtin/submodule: allow "add" to use different ref storage format
refs: fix ref storage format for submodule ref stores
builtin/clone: propagate ref storage format to submodules
builtin/submodule: allow cloning with different ref storage format
git-submodule.sh: break overly long command lines
In a similar fashion as the previous commit, drop a redundant argument
from the `bitmap_writer_finish()` function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar fashion as the previous commit, drop a redundant argument
from the `bitmap_writer_build()` function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit ensures that the bitmap_writer's "to_pack" field is
initialized early on, so the "to_pack" and "index_nr" arguments to
`bitmap_writer_build_type_index()` are redundant.
Drop them and adjust the callers accordingly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to determine its object order, the pack-bitmap machinery keeps
a 'struct packing_data' corresponding to the pack or pseudo-pack (when
writing a MIDX bitmap) being written.
The to_pack field is provided to the bitmap machinery by callers of
bitmap_writer_build() and assigned to the bitmap_writer struct at that
point.
But a subsequent commit will want to have access to that data earlier on
during commit selection. Prepare for that by adding a 'to_pack' argument
to 'bitmap_writer_init()', and initializing the field during that
function.
Subsequent commits will clean up other functions which take
now-redundant arguments (like nr_objects, which is equivalent to
pdata->objects_nr, or pdata itself).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A flakey test and incorrect calls to strtoX() functions have been
fixed.
* kl/test-fixes:
t6421: fix test to work when repo dir contains d0
set errno=0 before strtoX calls
The patch parser in "git patch-id" has been tightened to avoid
getting confused by lines that look like a patch header in the log
message.
* jc/patch-id:
patch-id: tighten code to detect the patch header
patch-id: rewrite code that detects the beginning of a patch
patch-id: make get_one_patchid() more extensible
patch-id: call flush_current_id() only when needed
t4204: patch-id supports various input format
A recent update broke "git ls-remote" used outside a repository,
which has been corrected.
* ps/ls-remote-out-of-repo-fix:
builtin/ls-remote: fall back to SHA1 outside of a repo
We populate a `struct symdiff` in case the user has requested a
symmetric diff. Part of this is to populate a `skip` bitmap that
indicates which commits shall be ignored in the diff. But while this
bitmap is dynamically allocated, we never free it.
Fix this by introducing and calling a new `symdiff_release()` function
that does this for us.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `show_blob_object()`, we proactively call `textconv_object()`. In
case we have a textconv driver for this blob we will end up showing the
converted contents, otherwise we'll show the un-converted contents of it
instead.
When the object has been converted we never free the buffer containing
the converted contents. Fix this to plug this memory leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are various memory leaks hit by git-format-patch(1). Basically all
of them are trivial, except that un-setting `diffopt.no_free` requires
us to unset the `diffopt.file` because we manually close it already.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When resolving revisions in `get_tags_and_duplicates()`, we only
partially manage the lifetime of `full_name`. In fact, managing its
lifetime properly is almost impossible because we put direct pointers to
that variable into multiple lists without duplicating the string. The
consequence is that these strings will ultimately leak.
Refactor the code to make the lists we put those names into duplicate
the memory. This allows us to properly free the string as required and
thus plugs the memory leak.
While this requires us to allocate more data overall, it shouldn't be
all that bad given that the number of allocations corresponds with the
number of command line parameters, which typically aren't all that many.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before calling `handle_commit()` in a loop, we set `diffopt.no_free`
such that its contents aren't getting freed inside of `handle_commit()`.
We never unset that flag though, which means that the structure's
allocated resources will ultimately leak.
Fix this by unsetting the flag after the loop such that we release its
resources via `release_revisions()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We allocate a `struct notes_tree` in `merge_commit()` which we then
initialize via `init_notes()`. It's not really necessary to allocate the
structure though given that we never pass ownership to the caller.
Furthermore, the allocation leads to a memory leak because despite its
name, `free_notes()` doesn't free the `notes_tree` but only clears it.
Fix this issue by converting the code to use an on-stack variable.
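The difference between "clearing" and "freeing" a structure can be
sketched as follows. The type `notes_tree_sketch` and its helpers are
hypothetical; they only mimic the shape of the situation, where the
cleanup function releases the members but not the structure itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct notes_tree_sketch {
	char *ref; /* owned by the structure */
};

static int init_tree(struct notes_tree_sketch *t, const char *ref)
{
	size_t len = strlen(ref) + 1;

	t->ref = malloc(len);
	if (!t->ref)
		return -1;
	memcpy(t->ref, ref, len);
	return 0;
}

/* Releases the members only, never the structure itself. */
static void clear_tree(struct notes_tree_sketch *t)
{
	free(t->ref);
	t->ref = NULL;
}

/*
 * With the structure on the stack, clearing it is all the cleanup
 * that is needed; there is no separate allocation left to leak.
 */
static size_t ref_length_on_stack(const char *ref)
{
	struct notes_tree_sketch tree = { NULL };
	size_t len;

	assert(ref);
	if (init_tree(&tree, ref))
		return 0;
	len = strlen(tree.ref);
	clear_tree(&tree);
	return len;
}
```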
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `get_replay_opts()`, we override the `gpg_sign` field that already
got populated by `sequencer_init_config()` in case the user has
"commit.gpgsign" set in their config. This creates a memory leak because
we overwrite the previously assigned value, which may have already
pointed to an allocated string.
Let's plug the memory leak by freeing the value before we overwrite it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the comment line character has been specified multiple times in the
configuration, then `git_default_core_config()` will cause a memory leak
because it unconditionally copies the string into `comment_line_str`
without free'ing the previous value. In fact, it can't easily free the
value in the first place because it may contain a string constant.
Refactor the code such that we track allocated comment character strings
via a separate non-constant variable `comment_line_str_to_free`. Adapt
sites that set `comment_line_str` to set both and free the old value
that was stored in `comment_line_str_to_free`.
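The two-variable pattern described above can be sketched as follows. The
variable and function names carry a `_sketch` suffix to make clear that
this is an illustration and not the code in Git: one pointer is what
readers use and may point at a literal, the other is set only when the
value lives on the heap.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* What readers use; initially points at a string literal. */
static const char *comment_str_sketch = "#";
/* Non-NULL only while comment_str_sketch points at heap memory. */
static char *comment_str_to_free_sketch;

static int set_comment_str_sketch(const char *value)
{
	size_t len;
	char *copy;

	assert(value);
	len = strlen(value) + 1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, value, len);

	/* Safe: either NULL or the previous heap copy, never a literal. */
	free(comment_str_to_free_sketch);
	comment_str_to_free_sketch = copy;
	comment_str_sketch = copy;
	return 0;
}
```

Setting the value twice in a row now releases the first copy instead of
leaking it, and the initial literal is never passed to `free()`.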
This memory leak is being hit in t3404. As there are still other memory
leaks in that file we cannot yet mark it as passing with leak checking
enabled.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `git bundle unbundle` subcommand requires a repository to unbundle
the contents into. As such, the subcommand checks whether we have a
startup repository in the first place, and if not it dies.
This check happens after we have already opened the bundle though. This
causes a segfault when running outside of a repository starting with
c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07) because we have no hash function set up, but we do try to
parse refs advertised by the bundle's header.
The next commit will fix that underlying issue by defaulting to the SHA1
object format for bundles, which will also fix the described segfault here.
But as we know that we will die anyway, we can do better than that and
avoid some vain work by moving the check for a repository before we try
to open the bundle.
Reported-by: ArcticLampyrid <ArcticLampyrid@outlook.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor functions that rename or copy config sections to accept a
`struct repository` such that we can get rid of the implicit dependency
on `the_repository`. Rename the functions accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor `git_die_config()` to accept a `struct repository` such that we
can get rid of the implicit dependency on `the_repository`. Rename the
function accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor `git_config_get_expiry()` to accept a `struct repository` such
that we can get rid of the implicit dependency on `the_repository`.
Rename the function accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor `git_config_get_split_index()` to accept a `struct repository`
such that we can get rid of the implicit dependency on `the_repository`.
Rename the function accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When not provided a worktree, then `worktree_git_path()` will fall back
to returning a path relative to the main repository. In this case, we
implicitly rely on `the_repository` to derive the path. Remove this
dependency by passing a `struct repository` as parameter.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We access `the_repository` in `report_linked_checkout_garbage()` both
directly and indirectly via `get_git_dir()`. Remove this dependency by
instead passing a `struct repository` as parameter.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We implicitly depend on `the_repository` in our hook subsystem because
we use `strbuf_git_path()` to compute hook paths. Remove this dependency
by accepting a `struct repository` as parameter instead.
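The general shape of this kind of refactoring, shown on a toy example:
instead of reading a global, the helper takes the repository as an
explicit parameter. `struct repo_sketch` and `hook_path_sketch()` are
invented for this illustration, and the "<gitdir>/hooks/<name>" layout
is the default location only (it ignores `core.hooksPath`).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for "struct repository"; only the git directory matters here. */
struct repo_sketch {
	const char *gitdir;
};

/*
 * Builds "<gitdir>/hooks/<name>" into buf. Everything the function
 * needs arrives through its parameters, so it works for any
 * repository, not just a process-wide default.
 * Returns 0 on success, -1 if buf is too small.
 */
static int hook_path_sketch(const struct repo_sketch *repo, const char *name,
			    char *buf, size_t len)
{
	int n;

	assert(repo && name && buf);
	n = snprintf(buf, len, "%s/hooks/%s", repo->gitdir, name);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```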
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After running a diff between two things, or a series of diffs while
walking the history, the diff computation is concluded by a call to
diff_result_code() to extract the exit status of the diff machinery.
The function can work on "struct diffopt", but all the callers
historically and currently pass "struct diffopt" that is embedded in
the "struct rev_info" that is used to hold the remerge_diff bit and
the remerge_objdir variable that points at the temporary object
directory in use.
Redefine diff_result_code() to take the whole "struct rev_info" to
give it an access to these members related to remerge-diff, so that
it can get rid of the temporary object directory for any and all
callers that used the feature. We can lose the equivalent code to
do so from the code paths for individual commands, diff-tree, diff,
and log.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is error prone for each caller that sets revs.remerge_diff bit
to be responsible for preparing a temporary object directory and
rotate it into the list of alternate object stores, making it the
primary object store.
Instead, remove the code to set up and arrange the temporary object
directory from the current callers and implement it in the code that
runs remerge-diff logic. The code to undo the futzing of the list
of alternate object store is still spread across the callers, but we
will deal with it in future steps.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a parameter to each_ref_fn so that callers to the ref APIs
that use this function as a callback can have access to the
unresolved value of a symbolic ref.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using "git-diff-tree" to get the tree diff for merge commits with
the diff format set to `remerge`, a bug is triggered as shown below:
$ git diff-tree -r --remerge-diff 363337e6eb
BUG: log-tree.c:1006: did a remerge diff without remerge_objdir?!?
This bug is reported by `log-tree.c:do_remerge_diff`, where a bug check
added in commit 7b90ab467a (log: clean unneeded objects during log
--remerge-diff, 2022-02-02) detects the absence of `remerge_objdir` when
attempting to clean up temporary objects generated during the remerge
process.
After some further digging, I find that the remerge-related diff options
were introduced in db757e8b8d (show, log: provide a --remerge-diff
capability, 2022-02-02), which also affect the setup of `rev_info` for
"git-diff-tree", but were not accounted for in the original
implementation (inferred from the commit message).
Elijah Newren, the author of the remerge diff feature, notes that other
callers of `log-tree.c:log_tree_commit` (the only caller of
`log-tree.c:do_remerge_diff`) also exist, but:
`builtin/am.c`: manually sets all flags; remerge_diff is not among them
`sequencer.c`: manually sets all flags; remerge_diff is not among them
so `builtin/diff-tree.c` really is the only caller that was overlooked
when remerge-diff functionality was added.
This commit resolves the crash by adding `remerge_objdir` setup logic to
`builtin/diff-tree.c`, mirroring `builtin/log.c:cmd_log_walk_no_free`.
It also includes the necessary cleanup for `remerge_objdir`.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new subcommand "verify" in git-refs(1) to allow the user to
check the consistency of the reference database. This subcommand will
also be used as the entry point for checking refs from "git-fsck(1)".
Add a "verbose" field to "fsck_options" to indicate whether we should
print verbose messages when checking the consistency of refs and objects.
Remove the bit-field from the "strict" field, because we cannot take the
address of a bit-field, which makes it awkward to set member variables
when parsing the command line options.
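The language restriction behind this change is easy to show in a few
lines. The structs and helpers below are hypothetical; the point is
that a generic setter which takes a pointer to its destination, as
option parsers tend to do, cannot be aimed at a bit-field.

```c
#include <assert.h>

/* With a bit-field, "&opts.strict" is rejected by the compiler. */
struct opts_bitfield_sketch {
	unsigned strict:1;
};

/* With a plain member, taking the address is fine. */
struct opts_plain_sketch {
	unsigned strict;
};

/* Generic setter that needs the destination's address. */
static void set_flag(unsigned *dest, unsigned value)
{
	assert(dest);
	*dest = value;
}

static unsigned enable_strict(struct opts_plain_sketch *opts)
{
	set_flag(&opts->strict, 1);
	return opts->strict;
}
```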
"git-fsck(1)" declares its "fsck_options" variable as "static", which
keeps the leak checker from complaining. In "git-refs verify", however,
we need to clean up the memory manually. Thus add a
"fsck_options_clear" function to "fsck.c" to provide that cleanup.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "fsck_error" callback is designed to report objects-related error
messages. It accepts two parameters, "oid" and "object_type", which are
not generic. In order to provide a unified callback which can report
errors for either objects or refs, remove the objects-related parameters
and add the generic parameter "void *fsck_report".
Create a new "fsck_object_report" structure which incorporates the
removed parameters "oid" and "object_type". Then adapt the
corresponding callers to the new "fsck_error" callback.
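The shape of such a generic callback can be sketched as below. All type
and function names here are invented for the example and do not match
the real fsck API; only the idea of passing a `void *` report that each
callback interprets on its own is taken from the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct object_report_sketch {
	const char *oid_hex;
	int object_type;
};

struct ref_report_sketch {
	const char *path;
};

/* One callback type for both kinds of report. */
typedef int (*error_fn_sketch)(void *fsck_report, const char *message);

static char last_message[128];

static int object_error(void *fsck_report, const char *message)
{
	const struct object_report_sketch *r = fsck_report;

	snprintf(last_message, sizeof(last_message),
		 "object %s: %s", r->oid_hex, message);
	return 1;
}

static int ref_error(void *fsck_report, const char *message)
{
	const struct ref_report_sketch *r = fsck_report;

	snprintf(last_message, sizeof(last_message),
		 "ref %s: %s", r->path, message);
	return 1;
}

/* The reporting side no longer needs to know what is being checked. */
static int report(error_fn_sketch fn, void *fsck_report, const char *message)
{
	assert(fn && fsck_report && message);
	return fn(fsck_report, message);
}
```

The price of the `void *` is that the compiler can no longer check that
a callback is paired with the matching report type; that pairing becomes
the caller's responsibility.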
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The names of objects-related fsck error functions are generic. That was
fine while the object database check was the only one. However, we are
going to introduce a report function for the refs database check. To
avoid ambiguity, rename object-related fsck error functions to
explicitly indicate that these functions are used to report
objects-related messages.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Same as with "clone", users may want to add a submodule to a repository
with a non-default ref storage format. Wire up a new `--ref-format=`
option that works the same as for `git submodule clone`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When recursively cloning a repository with a non-default ref storage
format, e.g. by passing the `--ref-format=` option, then only the
top-level repository will end up using that ref storage format, and
all recursively cloned submodules will instead use the default format.
While mixed-format constellations are expected to work alright, the
outcome still is somewhat surprising as we have essentially ignored
the user's request.
Fix this by propagating the requested ref format to cloned submodules.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As submodules are proper self-contained repositories, it is perfectly
valid for them to have a different ref storage format than their parent
repository. There is no obvious way for users to ask for the ref storage
format when initializing submodules though. Whether the setup of such
mixed-ref-storage-format constellations is all that useful remains to be
seen. But there is no good reason to not expose such an option, and we
will require it in a subsequent patch.
Introduce a new `--ref-format=` option for git-submodule(1) that allows
the user to pick the ref storage format. This option will also be used
in a subsequent commit, where we start to propagate the same flag from
git-clone(1) to cloning submodules with the `--recursive` switch.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the rest of the MIDX subsystem and relevant callers have been
updated to learn about how to read and process incremental MIDX chains,
let's finally update the implementation in `write_midx_internal()` to be
able to write incremental MIDX chains.
This new feature is available behind the `--incremental` option for the
`multi-pack-index` builtin, like so:
$ git multi-pack-index write --incremental
The implementation for doing so is relatively straightforward, and boils
down to a handful of different kinds of changes implemented in this
patch:
- The `compute_sorted_entries()` function is taught to reject objects
which appear in any existing MIDX layer.
- Functions like `write_midx_revindex()` are adjusted to write
pack_order values which are offset by the number of objects in the
base MIDX layer.
- The end of `write_midx_internal()` is adjusted to move
non-incremental MIDX files when necessary (i.e. when creating an
incremental chain with an existing non-incremental MIDX in the
repository).
There are a handful of other changes that are introduced, like new
functions to clear incremental MIDX files that are unrelated to the
current chain (using the same "keep_hash" mechanism as in the
non-incremental case).
The tests explicitly exercising the new incremental MIDX feature are
relatively limited for two reasons:
1. Most of the "interesting" behavior is already thoroughly covered in
t5319-multi-pack-index.sh, which handles the core logic of reading
objects through a MIDX.
The new tests in t5334-incremental-multi-pack-index.sh are mostly
focused on creating and destroying incremental MIDXs, as well as
stitching their results together across layers.
2. A new GIT_TEST environment variable is added called
"GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL", which modifies the
entire test suite to write incremental MIDXs after repacking when
combined with the "GIT_TEST_MULTI_PACK_INDEX" variable.
This exercises the long tail of other interesting behavior that is
defined implicitly throughout the rest of the CI suite. It is
likewise added to the linux-TEST-vars job.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Two years ago, commit ff1e653c8e (midx: respect
'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP', 2021-08-31) introduced a new
environment variable which caused the test suite to write MIDX bitmaps
after any 'git repack' invocation.
At the time, this was done to help flush out any bugs with MIDX bitmaps
that weren't explicitly covered in the t5326-multi-pack-bitmap.sh
script.
Two years later, that flag has served us well and is no longer providing
meaningful coverage, as the script in t5326 has matured substantially
and covers many more interesting cases than it did back when ff1e653c8e
was originally written.
Remove the 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment variable
as it is no longer serving a useful purpose. More importantly, removing
this variable clears the way for us to introduce a new one to help
similarly flush out bugs related to incremental MIDX chains.
Because these incremental MIDX chains are (for now) incompatible with
MIDX bitmaps, we cannot have both.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To detect conversion failure after calls to functions like `strtod`, one
can check `errno == ERANGE`. These functions are not guaranteed to set
`errno` to `0` on successful conversion, however. Manual manipulation of
`errno` can likely be avoided by checking that the output pointer
differs from the input pointer, but that's not how other locations, such
as parse.c:139, handle this issue; they set errno to 0 prior to
executing the function.
For every place I could find a strtoX function with an ERANGE check
following it, set `errno = 0;` prior to executing the conversion
function.
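A minimal sketch of the pattern, using `strtol` and a hypothetical
wrapper `parse_long_sketch()`. Without the reset, an `ERANGE` left
behind by some earlier call would make a perfectly valid input look
like an overflow.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parses a base-10 long. Returns 0 on success and -1 on overflow,
 * underflow, empty input or trailing garbage.
 */
static int parse_long_sketch(const char *s, long *out)
{
	char *end;
	long value;

	assert(s && out);
	errno = 0; /* strtol() does not clear errno on success */
	value = strtol(s, &end, 10);
	if (errno == ERANGE)
		return -1;
	if (end == s || *end != '\0')
		return -1;
	*out = value;
	return 0;
}
```

The `end == s` check corresponds to the alternative mentioned above: it
catches input with no digits at all, which `errno` alone would not
report.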
Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), we have stopped setting the default hash algorithm for
`the_repository`. Consequently, code that relies on `the_hash_algo` will
now crash when it hasn't explicitly been initialized, which may be the
case when running outside of a Git repository.
It was reported that git-ls-remote(1) may crash in such a way when using
a remote helper that advertises refspecs. This is because the refspec
announced by the helper will get parsed during capability negotiation.
At that point we haven't yet figured out what object format the remote
uses, though, so we will fail when run outside of a repository.
The course of action is somewhat dubious in the first place. Ideally, we
should only parse object IDs once we have asked the remote helper for
the object format. And if the helper didn't announce the "object-format"
capability, then we should always assume SHA1. But instead, we used to
take either SHA1 if there was no repository, or we used the hash of the
local repository, which is wrong.
Arguably though, crashing hard may not be in the best interest of our
users, either. So while the old behaviour was buggy, let's restore it
for now as a short-term fix. We should eventually revisit, potentially
by deferring the point in time when we parse the refspec until after we
have figured out the remote's object hash.
Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two trivial leaks in git-credential-cache(1):
- We leak the child process in `spawn_daemon()`. As we do not call
`finish_command()` and instead let the created process daemonize, we
have to clear the process manually.
- We do not free the computed socket path in case it wasn't given via
`--socket=`.
Plug both of these memory leaks.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are several heuristics that git-worktree(1) uses to derive the
name of the newly created branch when not given explicitly. These
heuristics all allocate a new string, but we only end up freeing that
string in a subset of cases.
Fix the remaining cases where we didn't yet free the derived branch
names. While at it, also free `opt_track`, which is being populated via
an `OPT_PASSTHRU()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a trivial memory leak in git-shortlog(1). Fix it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are multiple trivial memory leaks in git-rerere(1). Fix those.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We never free credentials read by the credential store, leading to a
memory leak. Plug it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are several memory leaks in git-show-branch(1). Fix them.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--parseopt` mode allows shell scripts to have the same option
parsing mode as we have in C builtins. It soaks up a set of option
descriptions via stdin and massages them into proper `struct option`s
that we can then use to parse a set of arguments.
We only partially free those options when done though, creating a memory
leak. Interestingly, we only end up free'ing the first option's help,
which is of course wrong.
Fix this by freeing all options' `help` fields as well as their `argh`
fields to plug this memory leak.
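
The shape of the fix can be sketched as follows. This is an
illustration under stated assumptions, not the patch itself:
`struct demo_option` is a simplified stand-in for the owned fields of
`struct option`, and `free_demo_options` is a hypothetical name.

```c
#include <stdlib.h>

/* Simplified stand-in for `struct option`; only the owned fields. */
struct demo_option {
	char *help;
	char *argh;
};

/* Free the heap-allocated fields of every option, not just the first. */
static void free_demo_options(struct demo_option *opts, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		free(opts[i].help); /* free(NULL) is a no-op */
		free(opts[i].argh);
		opts[i].help = NULL;
		opts[i].argh = NULL;
	}
}
```

Resetting the pointers to NULL after freeing is a defensive choice in
this sketch so that a second call stays harmless.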
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are multiple trivial memory leaks in git-stash(1). Fix those.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>