This is another step towards letting us remove the include of cache.h in
strbuf.c. It does mean that we also need to add includes of abspath.h
in a number of C files.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Dozens of files made use of gettext functions, without explicitly
including gettext.h. This made it more difficult to find which files
could remove a dependence on cache.h. Make C files explicitly include
gettext.h if they are using it.
However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust several files to be more explicit about their dependency on
replace-objects to accommodate this change.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--super-prefix" option to "git" was initially added in [1] for
use with "ls-files"[2], and shortly thereafter "submodule--helper"[3]
and "grep"[4]. It wasn't until [5] that "read-tree" made use of it.
At the time [5] made sense, but since then we've made "ls-files"
recurse in-process in [6], "grep" in [7], and finally
"submodule--helper" in the preceding commits.
Let's also remove it from "read-tree", which allows us to remove the
option to "git" itself.
We can do this because the only remaining user of it is the submodule
API, which will now invoke "read-tree" with its new "--super-prefix"
option. It will only do so when the "submodule_move_head()" function
is called.
That "submodule_move_head()" function was then only invoked by
"read-tree" itself, but now rather than setting an environment
variable to pass "--super-prefix" between cmd_read_tree() we:
- Set a new "super_prefix" in "struct unpack_trees_options". The
"super_prefixed()" function in "unpack-trees.c" added in [5] will now
use this, rather than get_super_prefix() looking up the environment
variable we set earlier in the same process.
- Add the same field to the "struct checkout", which is only needed to
ferry the "super_prefix" in the "struct unpack_trees_options" all the
way down to the "entry.c" callers of "submodule_move_head()".
Those calls which used the super prefix all originated in
"cmd_read_tree()". The only other caller is the "unlink_entry()"
caller in "builtin/checkout.c", which now passes a "NULL".
1. 74866d7579 (git: make super-prefix option, 2016-10-07)
2. e77aa336f1 (ls-files: optionally recurse into submodules, 2016-10-07)
3. 89c8626557 (submodule helper: support super prefix, 2016-12-08)
4. 0281e487fd (grep: optionally recurse into submodules, 2016-12-16)
5. 3d415425c7 (unpack-trees: support super-prefix option, 2017-01-17)
6. 188dce131f (ls-files: use repository object, 2017-06-22)
7. f9ee2fcdfa (grep: recurse in-process using 'struct repository', 2017-08-02)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Undoes 'jk/unused-annotation' topic and redoes it to work around
Coccinelle rules misfiring false positives in unrelated codepaths.
* ab/unused-annotation:
git-compat-util.h: use "deprecated" for UNUSED variables
git-compat-util.h: use "UNUSED", not "UNUSED(var)"
Annotate function parameters that are not used (but cannot be
removed for structural reasons), to prepare us to later compile
with -Wunused warning turned on.
* jk/unused-annotation:
is_path_owned_by_current_uid(): mark "report" parameter as unused
run-command: mark unused async callback parameters
mark unused read_tree_recursive() callback parameters
hashmap: mark unused callback parameters
config: mark unused callback parameters
streaming: mark unused virtual method parameters
transport: mark bundle transport_options as unused
refs: mark unused virtual method parameters
refs: mark unused reflog callback parameters
refs: mark unused each_ref_fn parameters
git-compat-util: add UNUSED macro
As reported in [1] the "UNUSED(var)" macro introduced in
2174b8c75de (Merge branch 'jk/unused-annotation' into next,
2022-08-24) breaks coccinelle's parsing of our sources in files where
it occurs.
Let's instead partially go with the approach suggested in [2] of
making this not take an argument. As noted in [1] "coccinelle" will
ignore such tokens in argument lists that it doesn't know about, and
it's less of a surprise to syntax highlighters.
This undoes the "help us notice when a parameter marked as unused is
actually use" part of 9b24034754 (git-compat-util: add UNUSED macro,
2022-08-19), a subsequent commit will further tweak the macro to
implement a replacement for that functionality.
1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hashmap comparison functions must conform to a particular callback
interface, but many don't use all of their parameters. Especially the
void cmp_data pointer, but some do not use keydata either (because they
can easily form a full struct to pass when doing lookups). Let's mark
these to make -Wunused-parameter happy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git_replace_ref_base global is used to store the value of the
GIT_REPLACE_REF_BASE environment variable or the default of
"refs/replace/". This is initialized within setup_git_env().
The ref_namespaces array is a new centralized location for information
such as the ref namespace used for replace refs. Instead of having this
namespace stored in two places, use the ref_namespaces array instead.
For simplicity, create a local git_replace_ref_base variable wherever
the global was previously used.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git interprets different meanings to different refs based on their
names. Some meanings are cosmetic, like how refs in 'refs/remotes/*'
are colored differently from refs in 'refs/heads/*'. Others are more
critical, such as how replace refs are interpreted.
Before making behavior changes based on ref namespaces, collect all
known ref namespaces into a array of ref_namespace_info structs. This
array is indexed by the new ref_namespace enum for quick access.
As of this change, this array is purely documentation. Future changes
will add dependencies on this array.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The path taken by "git multi-pack-index" command from the end user
was compared with path internally prepared by the tool withut first
normalizing, which lead to duplicated paths not being noticed,
which has been corrected.
* ds/midx-normalize-pathname-before-comparison:
cache: use const char * for get_object_directory()
multi-pack-index: use --object-dir real path
midx: use real paths in lookup_multi_pack_index()
The get_object_directory() method returns the exact string stored at
the_repository->objects->odb->path. The return type of "char *" implies
that the caller must keep track of the buffer and free() it when
complete. This causes significant problems later when the ODB is
accessed.
Use "const char *" as the return type to avoid this confusion. There are
no current callers that care about the non-const definition.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Built-in fsmonitor (part 2).
* jh/builtin-fsmonitor-part2: (30 commits)
t7527: test status with untracked-cache and fsmonitor--daemon
fsmonitor: force update index after large responses
fsmonitor--daemon: use a cookie file to sync with file system
fsmonitor--daemon: periodically truncate list of modified files
t/perf/p7519: add fsmonitor--daemon test cases
t/perf/p7519: speed up test on Windows
t/perf/p7519: fix coding style
t/helper/test-chmtime: skip directories on Windows
t/perf: avoid copying builtin fsmonitor files into test repo
t7527: create test for fsmonitor--daemon
t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
help: include fsmonitor--daemon feature flag in version info
fsmonitor--daemon: implement handle_client callback
compat/fsmonitor/fsm-listen-darwin: implement FSEvent listener on MacOS
compat/fsmonitor/fsm-listen-darwin: add MacOS header files for FSEvent
compat/fsmonitor/fsm-listen-win32: implement FSMonitor backend on Windows
fsmonitor--daemon: create token-based changed path cache
fsmonitor--daemon: define token-ids
fsmonitor--daemon: add pathname classification
fsmonitor--daemon: implement 'start' command
...
Replace core.fsyncObjectFiles with two new configuration variables,
core.fsync and core.fsyncMethod.
* ns/core-fsyncmethod:
core.fsync: documentation and user-friendly aggregate options
core.fsync: new option to harden the index
core.fsync: add configuration parsing
core.fsync: introduce granular fsync control infrastructure
core.fsyncmethod: add writeout-only mode
wrapper: make inclusion of Windows csprng header tightly scoped
Move fsmonitor config settings to a new and opaque
`struct fsmonitor_settings` structure. Add a lazily-loaded pointer
to this into `struct repo_settings`
Create an `enum fsmonitor_mode` type in `struct fsmonitor_settings` to
represent the state of fsmonitor. This lets us represent which, if
any, fsmonitor provider (hook or IPC) is enabled.
Create `fsm_settings__get_*()` getters to lazily look up fsmonitor-
related config settings.
Get rid of the `core_fsmonitor` global variable. Move the code to
lookup the existing `core.fsmonitor` config value into the fsmonitor
settings.
Create a hook pathname variable in `struct fsmonitor-settings` and
only set it when in hook mode.
Extend the definition of `core.fsmonitor` to be either a boolean
or a hook pathname. When true, the builtin FSMonitor is used.
When false or unset, no FSMonitor (neither builtin nor hook) is
used.
The existing `core_fsmonitor` global variable was used to store the
pathname to the fsmonitor hook *and* it was used as a boolean to see
if fsmonitor was enabled. This dual usage and global visibility leads
to confusion when we add the IPC-based provider. So lets hide the
details in fsmonitor-settings.c and let it decide which provider to
use in the case of multiple settings. This avoids cluttering up
repo-settings.c with these private details.
A future commit in builtin-fsmonitor series will add the ability to
disqualify worktrees for various reasons, such as being mounted from a
remote volume, where fsmonitor should not be started. Having the
config settings hidden in fsmonitor-settings.c allows such worktree
restrictions to override the config values used.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change introduces code to parse the core.fsync setting and
configure the fsync_components variable.
core.fsync is configured as a comma-separated list of component names to
sync. Each time a core.fsync variable is encountered in the
configuration heirarchy, we start off with a clean state with the
platform default value. Passing 'none' resets the value to indicate
nothing will be synced. We gather all negative and positive entries from
the comma separated list and then compute the new value by removing all
the negative entries and adding all of the positive entries.
We issue a warning for components that are not recognized so that the
configuration code is compatible with configs from future versions of
Git with more repo components.
Complete documentation for the new setting is included in a later patch
in the series so that it can be reviewed once in final form.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit introduces the infrastructure for the core.fsync
configuration knob. The repository components we want to sync
are identified by flags so that we can turn on or off syncing
for specific components.
If core.fsyncObjectFiles is set and the core.fsync configuration
also includes FSYNC_COMPONENT_LOOSE_OBJECT, we will fsync any
loose objects. This picks the strictest data integrity behavior
if core.fsync and core.fsyncObjectFiles are set to conflicting values.
This change introduces the currently unused fsync_component
helper, which will be used by a later patch that adds fsyncing to
the refs backend.
Actual configuration and documentation of the fsync components
list are in other patches in the series to separate review of
the underlying mechanism from the policy of how it's configured.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit introduces the `core.fsyncMethod` configuration
knob, which can currently be set to `fsync` or `writeout-only`.
The new writeout-only mode attempts to tell the operating system to
flush its in-memory page cache to the storage hardware without issuing a
CACHE_FLUSH command to the storage controller.
Writeout-only fsync is significantly faster than a vanilla fsync on
common hardware, since data is written to a disk-side cache rather than
all the way to a durable medium. Later changes in this patch series will
take advantage of this primitive to implement batching of hardware
flushes.
When git_fsync is called with FSYNC_WRITEOUT_ONLY, it may fail and the
caller is expected to do an ordinary fsync as needed.
On Apple platforms, the fsync system call does not issue a CACHE_FLUSH
directive to the storage controller. This change updates fsync to do
fcntl(F_FULLFSYNC) to make fsync actually durable. We maintain parity
with existing behavior on Apple platforms by setting the default value
of the new core.fsyncMethod option.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Typically with sparse checkouts, we expect files outside the sparsity
patterns to be marked as SKIP_WORKTREE and be missing from the working
tree. Sometimes this expectation would be violated however; including
in cases such as:
* users grabbing files from elsewhere and writing them to the worktree
(perhaps by editing a cached copy in an editor, copying/renaming, or
even untarring)
* various git commands having incomplete or no support for the
SKIP_WORKTREE bit[1,2]
* users attempting to "abort" a sparse-checkout operation with a
not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the
working tree is not atomic)[3].
When the SKIP_WORKTREE bit in the index did not reflect the presence of
the file in the working tree, it traditionally caused confusion and was
difficult to detect and recover from. So, in a sparse checkout, since
af6a51875a (repo_read_index: clear SKIP_WORKTREE bit from files present
in worktree, 2022-01-14), Git automatically clears the SKIP_WORKTREE
bit at index read time for entries corresponding to files that are
present in the working tree.
There is another workflow, however, where it is expected that paths
outside the sparsity patterns appear to exist in the working tree and
that they do not lose the SKIP_WORKTREE bit, at least until they get
modified. A Git-aware virtual file system[4] takes advantage of its
position as a file system driver to expose all files in the working
tree, fetch them on demand using partial clone on access, and tell Git
to pay attention to them on demand by updating the sparse checkout
pattern on writes. This means that commands like "git status" only have
to examine files that have potentially been modified, whereas commands
like "ls" are able to show the entire codebase without requiring manual
updates to the sparse checkout pattern.
Thus since af6a51875a, Git with such Git-aware virtual file systems
unsets the SKIP_WORKTREE bit for all files and commands like "git
status" have to fetch and examine them all.
Introduce a configuration setting sparse.expectFilesOutsideOfPatterns to
allow limiting the tracked set of files to a small set once again. A
Git-aware virtual file system or other application that wants to
maintain files outside of the sparse checkout can set this in a
repository to instruct Git not to check for the presence of
SKIP_WORKTREE files. The setting defaults to false, so most users of
sparse checkout will still get the benefit of an automatically updating
index to recover from the variety of difficult issues detailed in
af6a51875a for paths with SKIP_WORKTREE set despite the path being
present.
[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
[2] The three long paragraphs in the middle of
https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/
[4] such as the vfsd described in
https://lore.kernel.org/git/20220207190320.2960362-1-jonathantanmy@google.com/
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
New interface into the tmp-objdir API to help in-core use of the
quarantine feature.
* ns/tmp-objdir:
tmp-objdir: disable ref updates when replacing the primary odb
tmp-objdir: new API for creating temporary writable databases
When creating a subprocess with a temporary ODB, we set the
GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not
to update refs, since the tmp-objdir may go away.
Introduce a similar mechanism for in-process temporary ODBs when
we call tmp_objdir_replace_primary_odb. Now both mechanisms set
the disable_ref_updates flag on the odb, which is queried by
the ref_transaction_prepare function.
Peff's test case [1] was invoking ref updates via the cachetextconv
setting. That particular code silently does nothing when a ref
update is forbidden. See the call to notes_cache_put in
fill_textconv where errors are ignored.
[1] https://lore.kernel.org/git/YVOn3hDsb5pnxR53@coredump.intra.peff.net/
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tmp_objdir API provides the ability to create temporary object
directories, but was designed with the goal of having subprocesses
access these object stores, followed by the main process migrating
objects from it to the main object store or just deleting it. The
subprocesses would view it as their primary datastore and write to it.
Here we add the tmp_objdir_replace_primary_odb function that replaces
the current process's writable "main" object directory with the
specified one. The previous main object directory is restored in either
tmp_objdir_migrate or tmp_objdir_destroy.
For the --remerge-diff usecase, add a new `will_destroy` flag in `struct
object_database` to mark ephemeral object databases that do not require
fsync durability.
Add 'git prune' support for removing temporary object databases, and
make sure that they have a name starting with tmp_ and containing an
operation-specific name.
Based-on-patch-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "GIT_TEST_FSYNC" environment variable now exists for
disabling fsync() even on packfiles and other "critical" data.
Running "make test -j8 NO_SVN_TESTS=1" on a noisy 8-core system
on an HDD, test runtime drops from ~4 minutes down to ~3 minutes.
Using "GIT_TEST_FSYNC=1" re-enables fsync() for comparison
purposes.
SVN interopability tests are minimally affected since SVN will
still use fsync in various places.
This will also be useful for 3rd-party tools which create
throwaway git repositories of temporary data, but remains
undocumented for end users.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ref iteration code used to optionally allow dangling refs to be
shown, which has been tightened up.
* jk/ref-paranoia:
refs: drop "broken" flag from for_each_fullref_in()
ref-filter: drop broken-ref code entirely
ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN
repack, prune: drop GIT_REF_PARANOIA settings
refs: turn on GIT_REF_PARANOIA by default
refs: omit dangling symrefs when using GIT_REF_PARANOIA
refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag
refs-internal.h: reorganize DO_FOR_EACH_* flag documentation
refs-internal.h: move DO_FOR_EACH_* flags next to each other
t5312: be more assertive about command failure
t5312: test non-destructive repack
t5312: create bogus ref as necessary
t5312: drop "verbose" helper
t5600: provide detached HEAD for corruption failures
t5516: don't use HEAD ref for invalid ref-deletion tests
t7900: clean up some more broken refs
Code cleanup.
* ab/repo-settings-cleanup:
repository.h: don't use a mix of int and bitfields
repo-settings.c: simplify the setup
read-cache & fetch-negotiator: check "enum" values in switch()
environment.c: remove test-specific "ignore_untracked..." variable
wrapper.c: add x{un,}setenv(), and use xsetenv() in environment.c
Now that GIT_REF_PARANOIA is the default, we don't need to selectively
enable it for destructive operations. In fact, it's harmful to do so,
because it overrides any GIT_REF_PARANOIA=0 setting that the user may
have provided (because they're trying to work around some corruption).
With these uses gone, we can further clean up the ref_paranoia global,
and make it a static variable inside the refs code.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of the global ignore_untracked_cache_config variable added in
dae6c322fa (test-dump-untracked-cache: don't modify the untracked
cache, 2016-01-27) we can make use of the new facility to set config
via environment variables added in d8d77153ea (config: allow
specifying config entries via envvar pairs, 2021-01-12).
It's arguably a bit hacky to use setenv() and getenv() to pass
messages between the same program, but since the test helpers are not
the main intended audience of repo-settings.c I think it's better than
hardcoding the test-only special-case in prepare_repo_settings().
This uses the xsetenv() wrapper added in the preceding commit, if we
don't set these in the environment we'll fail in
t7063-status-untracked-cache.sh, but let's fail earlier anyway if that
were to happen.
This breaks any parent process that's potentially using the
GIT_CONFIG_* and GIT_CONFIG_PARAMETERS mechanism to pass one-shot
config setting down to a git subprocess, but in this case we don't
care about the general case of such potential parents. This process
neither spawns other "git" processes, nor is it interested in other
configuration. We might want to pick up other test modes here, but
those will be passed via GIT_TEST_* environment variables.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add fatal wrappers for setenv() and unsetenv(). In d7ac12b25d (Add
set_git_dir() function, 2007-08-01) we started checking its return
value, and since 48988c4d0c (set_git_dir: die when setenv() fails,
2018-03-30) we've had set_git_dir_1() die if we couldn't set it.
Let's provide a wrapper for both, this will be useful in many other
places, a subsequent patch will make another use of xsetenv().
The checking of the return value here is over-eager according to
setenv(3) and POSIX. It's documented as returning just -1 or 0, so
perhaps we should be checking -1 explicitly.
Let's just instead die on any non-zero, if our C library is so broken
as to return something else than -1 on error (and perhaps not set
errno?) the worst we'll do is die with a nonsensical errno value, but
we'll want to die in either case.
Let's make these return "void" instead of "int". As far as I can tell
there's no other x*() wrappers that needed to make the decision of
deviating from the signature in the C library, but since their return
value is only used to indicate errors (so we'd die here), we can catch
unreachable code such as
if (xsetenv(...) < 0)
[...];
I think it would be OK skip the NULL check of the "name" here for the
calls to die_errno(). Almost all of our setenv() callers are taking a
constant string hardcoded in the source as the first argument, and for
the rest we can probably assume they've done the NULL check
themselves. Even if they didn't, modern C libraries are forgiving
about it (e.g. glibc formatting it as "(null)"), on those that aren't,
well, we were about to die anyway. But let's include the check anyway
for good measure.
1. https://pubs.opengroup.org/onlinepubs/009604499/functions/setenv.html
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 8de7eeb54b (compression: unify pack.compression configuration
parsing, 2016-11-15) the variables core_compression_level and
core_compression_seen are only set, but never read. Remove them.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
realpath is only populated if we execute the git_work_tree_initialized
block. However that block also causes us to return early, meaning we
never actually release the strbuf in the case where we populated it.
Therefore we move all strbuf related code into the block to guarantee
that we can't leak it.
LSAN output from t0095:
Direct leak of 129 byte(s) in 1 object(s) allocated from:
#0 0x49a9b9 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x78f585 in xrealloc wrapper.c:126:8
#2 0x713ff4 in strbuf_grow strbuf.c:98:2
#3 0x713ff4 in strbuf_getcwd strbuf.c:597:3
#4 0x4f0c18 in strbuf_realpath_1 abspath.c:99:7
#5 0x5ae4a4 in set_git_work_tree environment.c:259:3
#6 0x6fdd8a in setup_discovered_git_dir setup.c:931:2
#7 0x6fdd8a in setup_git_directory_gently setup.c:1235:12
#8 0x4cb50d in get_bloom_filter_for_commit t/helper/test-bloom.c:41:2
#9 0x4cb50d in cmd__bloom t/helper/test-bloom.c:95:3
#10 0x4caa1f in cmd_main t/helper/test-tool.c:124:11
#11 0x4caded in main common-main.c:52:11
#12 0x7f0869f02349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 129 byte(s) leaked in 1 allocation(s).
It looks like this leak has existed since realpath was first added to
set_git_work_tree() in:
3d7747e318 (real_path: remove unsafe API, 2020-03-10)
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we currently have the `GIT_CONFIG_PARAMETERS` environment variable
which can be used to pass runtime configuration data to git processes,
it's an internal implementation detail and not supposed to be used by
end users.
Next to being for internal use only, this way of passing config entries
has a major downside: the config keys need to be parsed as they contain
both key and value in a single variable. As such, it is left to the user
to escape any potentially harmful characters in the value, which is
quite hard to do if values are controlled by a third party.
This commit thus adds a new way of adding config entries via the
environment which gets rid of this shortcoming. If the user passes the
`GIT_CONFIG_COUNT=$n` environment variable, Git will parse environment
variable pairs `GIT_CONFIG_KEY_$i` and `GIT_CONFIG_VALUE_$i` for each
`i` in `[0,n)`.
While the same can be achieved with `git -c <name>=<value>`, one may
wish to not do so for potentially sensitive information. E.g. if one
wants to set `http.extraHeader` to contain an authentication token,
doing so via `-c` would trivially leak those credentials via e.g. ps(1),
which typically also shows command arguments.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `getenv_safe()` helper function helps to safely retrieve multiple
environment values without the need to depend on platform-specific
behaviour for the return value's lifetime. We'll make use of this
function in a following patch, so let's make it available by making it
non-static and adding a declaration.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* jk/leakfix:
submodule--helper: fix leak of core.worktree value
config: fix leak in git_config_get_expiry_in_days()
config: drop git_config_get_string_const()
config: fix leaks from git_config_get_string_const()
checkout: fix leak of non-existent branch names
submodule--helper: use strbuf_release() to free strbufs
clear_pattern_list(): clear embedded hashmaps
As evidenced by the leak fixes in the previous commit, the "const" in
git_config_get_string_const() clearly misleads people into thinking that
it does not allocate a copy of the string. We can fix this by renaming
it, but it's easier still to just drop it. Of the four remaining
callers:
- The one in git_config_parse_expiry() still needs to allocate, since
that's what its callers expect. We can just use the non-const
version and cast our pointer. Slightly ugly, but the damage is
contained in one spot.
- The two in apply are writing to global "const char *" variables, and
need to continue allocating. We often mark these as const because we
assign default string literals to them. But in this case we don't do
that, so we can just declare them as real "char *" pointers and use
the non-const version.
- The call in checkout doesn't actually need a copy; it can just use
the non-allocating "tmp" version of the function.
The function is also mentioned in the MyFirstContribution document. We
can swap that call out for the non-allocating "tmp" variant, which fits
well in the example given.
We'll drop the "configset" and "repo" variants, as well (which are
unused).
Note that this frees up the "const" name, so we could rename the "tmp"
variant back to that. But let's give some time for topics in flight to
adapt to the new code before doing so (if we do it too soon, the
function semantics will change but the compiler won't alert us).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "argc" and "argv" names made sense when the struct was argv_array,
but now they're just confusing. Let's rename them to "nr" (which we use
for counts elsewhere) and "v" (which is rather terse, but reads well
when combined with typical variable names like "args.v").
Note that we have to update all of the callers immediately. Playing
tricks with the preprocessor is hard here, because we wouldn't want to
rewrite unrelated tokens.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We eventually want to drop the argv_array name and just use strvec
consistently. There's no particular reason we have to do it all at once,
or care about interactions between converted and unconverted bits.
Because of our preprocessor compat layer, the names are interchangeable
to the compiler (so even a definition and declaration using different
names is OK).
This patch converts remaining files from the first half of the alphabet,
to keep the diff to a manageable size.
The conversion was done purely mechanically with:
git ls-files '*.c' '*.h' |
xargs perl -i -pe '
s/ARGV_ARRAY/STRVEC/g;
s/argv_array/strvec/g;
'
and then selectively staging files with "git add '[abcdefghjkl]*'".
We'll deal with any indentation/style fallouts separately.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This requires updating #include lines across the code-base, but that's
all fairly mechanical, and was done with:
git ls-files '*.c' '*.h' |
xargs perl -i -pe 's/argv-array.h/strvec.h/'
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many functions in commit.h that are more related to shallow
repositories than they are to any sort of generic commit machinery.
Likely this began when there were only a few shallow-related functions,
and commit.h seemed a reasonable enough place to put them.
But, now there are a good number of shallow-related functions, and
placing them all in 'commit.h' doesn't make sense.
This patch extracts a 'shallow.h', which takes all of the declarations
from 'commit.h' for functions which already exist in 'shallow.c'. We
will bring the remaining shallow-related functions defined in 'commit.c'
in a subsequent patch.
For now, move only the ones that already are implemented in 'shallow.c',
and update the necessary includes.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Returning a shared buffer invites very subtle bugs due to reentrancy or
multi-threading, as demonstrated by the previous patch.
There was an unfinished effort to abolish this [1].
Let's finally rid of `real_path()`, using `strbuf_realpath()` instead.
This patch uses a local `strbuf` for most places where `real_path()` was
previously called.
However, two places return the value of `real_path()` to the caller. For
them, a `static` local `strbuf` was added, effectively pushing the
problem one level higher:
read_gitfile_gently()
get_superproject_working_tree()
[1] https://lore.kernel.org/git/1480964316-99305-1-git-send-email-bmwill@google.com/
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`real_path()` returns result from a shared buffer, inviting subtle
reentrance bugs. One of these bugs occur when invoked this way:
set_git_dir(real_path(git_dir))
In this case, `real_path()` has reentrance:
real_path
read_gitfile_gently
repo_set_gitdir
setup_git_env
set_git_dir_1
set_git_dir
Later, `set_git_dir()` uses its now-dead parameter:
!is_absolute_path(path)
Fix this by using a dedicated `strbuf` to hold `strbuf_realpath()`.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Management of sparsely checked-out working tree has gained a
dedicated "sparse-checkout" command.
* ds/sparse-cone: (21 commits)
sparse-checkout: improve OS ls compatibility
sparse-checkout: respect core.ignoreCase in cone mode
sparse-checkout: check for dirty status
sparse-checkout: update working directory in-process for 'init'
sparse-checkout: cone mode should not interact with .gitignore
sparse-checkout: write using lockfile
sparse-checkout: use in-process update for disable subcommand
sparse-checkout: update working directory in-process
sparse-checkout: sanitize for nested folders
unpack-trees: add progress to clear_ce_flags()
unpack-trees: hash less in cone mode
sparse-checkout: init and set in cone mode
sparse-checkout: use hashmaps for cone patterns
sparse-checkout: add 'cone' mode
trace2: add region in clear_ce_flags
sparse-checkout: create 'disable' subcommand
sparse-checkout: add '--stdin' option to set subcommand
sparse-checkout: 'set' subcommand
clone: add --sparse mode
sparse-checkout: create 'init' subcommand
...
* maint-2.23: (44 commits)
Git 2.23.1
Git 2.22.2
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
...
* maint-2.20: (36 commits)
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
...
* maint-2.19: (34 commits)
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
...
* maint-2.18: (33 commits)
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
...
* maint-2.17: (32 commits)
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
...
* maint-2.16: (31 commits)
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
...
* maint-2.15: (29 commits)
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
clone --recurse-submodules: prevent name squatting on Windows
is_ntfs_dotgit(): only verify the leading segment
...
* maint-2.14: (28 commits)
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
clone --recurse-submodules: prevent name squatting on Windows
is_ntfs_dotgit(): only verify the leading segment
test-path-utils: offer to run a protectNTFS/protectHFS benchmark
...
Back in the DOS days, in the FAT file system, file names always
consisted of a base name of length 8 plus a file extension of length 3.
Shorter file names were simply padded with spaces to the full 8.3
format.
Later, the FAT file system was taught to support _also_ longer names,
with an 8.3 "short name" as primary file name. While at it, the same
facility allowed formerly illegal file names, such as `.git` (empty base
names were not allowed), which would have the "short name" `git~1`
associated with it.
For backwards-compatibility, NTFS supports alternative 8.3 short
filenames, too, even if starting with Windows Vista, they are only
generated on the system drive by default.
We addressed the problem that the `.git/` directory can _also_ be
accessed via `git~1/` (when short names are enabled) in 2b4c6efc82
(read-cache: optionally disallow NTFS .git variants, 2014-12-16), i.e.
since Git v1.9.5, by introducing the config setting `core.protectNTFS`
and enabling it by default on Windows.
In the meantime, Windows 10 introduced the "Windows Subsystem for Linux"
(short: WSL), i.e. a way to run Linux applications/distributions in a
thinly-isolated subsystem on Windows (giving rise to many a "2016 is the
Year of Linux on the Desktop" jokes). WSL is getting increasingly
popular, also due to the painless way Linux application can operate
directly ("natively") on files on Windows' file system: the Windows
drives are mounted automatically (e.g. `C:` as `/mnt/c/`).
Taken together, this means that we now have to enable the safe-guards of
Git v1.9.5 also in WSL: it is possible to access a `.git` directory
inside `/mnt/c/` via the 8.3 name `git~1` (unless short name generation
was disabled manually). Since regular Linux distributions run in WSL,
this means we have to enable `core.protectNTFS` at least on Linux, too.
To enable Services for Macintosh in Windows NT to store so-called
resource forks, NTFS introduced "Alternate Data Streams". Essentially,
these constitute additional metadata that are connected to (and copied
with) their associated files, and they are accessed via pseudo file
names of the form `filename:<stream-name>:<stream-type>`.
In a recent patch, we extended `core.protectNTFS` to also protect
against accesses via NTFS Alternate Data Streams, e.g. to prevent
contents of the `.git/` directory to be "tracked" via yet another
alternative file name.
While it is not possible (at least by default) to access files via NTFS
Alternate Data Streams from within WSL, the defaults on macOS when
mounting network shares via SMB _do_ allow accessing files and
directories in that way. Therefore, we need to enable `core.protectNTFS`
on macOS by default, too, and really, on any Operating System that can
mount network shares via SMB/CIFS.
A couple of approaches were considered for fixing this:
1. We could perform a dynamic NTFS check similar to the `core.symlinks`
check in `init`/`clone`: instead of trying to create a symbolic link
in the `.git/` directory, we could create a test file and try to
access `.git/config` via 8.3 name and/or Alternate Data Stream.
2. We could simply "flip the switch" on `core.protectNTFS`, to make it
"on by default".
The obvious downside of 1. is that it won't protect worktrees that were
clone with a vulnerable Git version already. We considered patching code
paths that check out files to check whether we're running on an NTFS
system dynamically and persist the result in the repository-local config
setting `core.protectNTFS`, but in the end decided that this solution
would be too fragile, and too involved.
The obvious downside of 2. is that everybody will have to "suffer" the
performance penalty incurred from calling `is_ntfs_dotgit()` on every
path, even in setups where.
After the recent work to accelerate `is_ntfs_dotgit()` in most cases,
it looks as if the time spent on validating ten million random
file names increases only negligibly (less than 20ms, well within the
standard deviation of ~50ms). Therefore the benefits outweigh the cost.
Another downside of this is that paths that might have been acceptable
previously now will be forbidden. Realistically, though, this is an
improvement because public Git hosters already would reject any `git
push` that contains such file names.
Note: There might be a similar problem mounting HFS+ on Linux. However,
this scenario has been considered unlikely and in light of the cost (in
the aforementioned benchmark, `core.protectHFS = true` increased the
time from ~440ms to ~610ms), it was decided _not_ to touch the default
of `core.protectHFS`.
This change addresses CVE-2019-1353.
Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Helped-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The sparse-checkout feature can have quadratic performance as
the number of patterns and number of entries in the index grow.
If there are 1,000 patterns and 1,000,000 entries, this time can
be very significant.
Create a new Boolean config option, core.sparseCheckoutCone, to
indicate that we expect the sparse-checkout file to contain a
more limited set of patterns. This is a separate config setting
from core.sparseCheckout to avoid breaking older clients by
introducing a tri-state option.
The config option does nothing right now, but will be expanded
upon in a later commit.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we can have a different default partial clone filter for
each promisor remote, let's hide core_partial_clone_filter_default
as a static in promisor-remote.c to avoid it being use for
anything other than managing backward compatibility.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have has_promisor_remote() and can use many
promisor remotes, let's hide repository_format_partial_clone
as a static in promisor-remote.c to avoid it being use
for anything other than managing backward compatibility.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were many places the code relied on the string returned from
getenv() to be non-volatile, which is not true, that have been
corrected.
* jk/save-getenv-result:
builtin_diff(): read $GIT_DIFF_OPTS closer to use
merge-recursive: copy $GITHEAD strings
init: make a copy of $GIT_DIR string
config: make a copy of $GIT_CONFIG string
commit: copy saved getenv() result
get_super_prefix(): copy getenv() result
The return value of getenv() is not guaranteed to remain valid across
multiple calls (nor across calls to setenv()). Since this function
caches the result for the length of the program, we must make a copy to
ensure that it is still valid when we need it.
Reported-by: Yngve N. Pettersen <yngve@vivaldi.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up with optimization for the codepath that checks
(non-)existence of loose objects.
* jk/loose-object-cache:
odb_load_loose_cache: fix strbuf leak
fetch-pack: drop custom loose object cache
sha1-file: use loose object cache for quick existence check
object-store: provide helpers for loose_objects_cache
sha1-file: use an object_directory for the main object dir
handle alternates paths the same as the main object dir
sha1_file_name(): overwrite buffer instead of appending
rename "alternate_object_database" to "object_directory"
submodule--helper: prefer strip_suffix() to ends_with()
fsck: do not reuse child_process structs
Our handling of alternate object directories is needlessly different
from the main object directory. As a result, many places in the code
basically look like this:
do_something(r->objects->objdir);
for (odb = r->objects->alt_odb_list; odb; odb = odb->next)
do_something(odb->path);
That gets annoying when do_something() is non-trivial, and we've
resorted to gross hacks like creating fake alternates (see
find_short_object_filename()).
Instead, let's give each raw_object_store a unified list of
object_directory structs. The first will be the main store, and
everything after is an alternate. Very few callers even care about the
distinction, and can just loop over the whole list (and those who care
can just treat the first element differently).
A few observations:
- we don't need r->objects->objectdir anymore, and can just
mechanically convert that to r->objects->odb->path
- object_directory's path field needs to become a real pointer rather
than a FLEX_ARRAY, in order to fill it with expand_base_dir()
- we'll call prepare_alt_odb() earlier in many functions (i.e.,
outside of the loop). This may result in us calling it even when our
function would be satisfied looking only at the main odb.
But this doesn't matter in practice. It's not a very expensive
operation in the first place, and in the majority of cases it will
be a noop. We call it already (and cache its results) in
prepare_packed_git(), and we'll generally check packs before loose
objects. So essentially every program is going to call it
immediately once per program.
Arguably we should just prepare_alt_odb() immediately upon setting
up the repository's object directory, which would save us sprinkling
calls throughout the code base (and forgetting to do so has been a
source of subtle bugs in the past). But I've stopped short of that
here, since there are already a lot of other moving parts in this
patch.
- Most call sites just get shorter. The check_and_freshen() functions
are an exception, because they have entry points to handle local and
nonlocal directories separately.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new repo extension is added, worktreeConfig. When it is present:
- Repository config reading by default includes $GIT_DIR/config _and_
$GIT_DIR/config.worktree. "config" file remains shared in multiple
worktree setup.
- The special treatment for core.bare and core.worktree, to stay
effective only in main worktree, is gone. These config settings are
supposed to be in config.worktree.
This extension is most useful in multiple worktree setup because you
now have an option to store per-worktree config (which is either
.git/config.worktree for main worktree, or
.git/worktrees/xx/config.worktree for linked ones).
This extension can be used in single worktree mode, even though it's
pretty much useless (but this can happen after you remove all linked
worktrees and move back to single worktree).
"git config" reads from both "config" and "config.worktree" by default
(i.e. without either --user, --file...) when this extension is
present. Default writes still go to "config", not "config.worktree". A
new option --worktree is added for that (*).
Since a new repo extension is introduced, existing git binaries should
refuse to access to the repo (both from main and linked worktrees). So
they will not misread the config file (i.e. skip the config.worktree
part). They may still accidentally write to the config file anyway if
they use with "git config --file <path>".
This design places a bet on the assumption that the majority of config
variables are shared so it is the default mode. A safer move would be
default writes go to per-worktree file, so that accidental changes are
isolated.
(*) "git config --worktree" points back to "config" file when this
extension is not present and there is only one worktree so that it
works in any both single and multiple worktree setups.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code hygiene improvement for the header files.
* en/incl-forward-decl:
Remove forward declaration of an enum
compat/precompose_utf8.h: use more common include guard style
urlmatch.h: fix include guard
Move definition of enum branch_track from cache.h to branch.h
alloc: make allocate_alloc_state and clear_alloc_state more consistent
Add missing includes and forward declarations
Many more strings are prepared for l10n.
* nd/i18n: (23 commits)
transport-helper.c: mark more strings for translation
transport.c: mark more strings for translation
sha1-file.c: mark more strings for translation
sequencer.c: mark more strings for translation
replace-object.c: mark more strings for translation
refspec.c: mark more strings for translation
refs.c: mark more strings for translation
pkt-line.c: mark more strings for translation
object.c: mark more strings for translation
exec-cmd.c: mark more strings for translation
environment.c: mark more strings for translation
dir.c: mark more strings for translation
convert.c: mark more strings for translation
connect.c: mark more strings for translation
config.c: mark more strings for translation
commit-graph.c: mark more strings for translation
builtin/replace.c: mark more strings for translation
builtin/pack-objects.c: mark more strings for translation
builtin/grep.c: mark strings for translation
builtin/config.c: mark more strings for translation
...
A new configuration variable core.usereplacerefs has been added,
primarily to help server installations that want to ignore the
replace mechanism altogether.
* jk/core-use-replace-refs:
add core.usereplacerefs config option
check_replace_refs: rename to read_replace_refs
check_replace_refs: fix outdated comment
'branch_track' feels more closely related to branching, and it is
needed later in branch.h; rather than #include'ing cache.h in branch.h
for this small enum, just move the enum and the external declaration
for git_branch_track to branch.h.
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was added as a NEEDSWORK in c3c36d7de2 (replace-object:
check_replace_refs is safe in multi repo environment, 2018-04-11),
waiting for a calmer period. Since doing so now doesn't conflict
with anything in 'pu', it seems as good a time as any.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.
* sb/object-store-grafts:
commit: allow lookup_commit_graft to handle arbitrary repositories
commit: allow prepare_commit_graft to handle arbitrary repositories
shallow: migrate shallow information into the object parser
path.c: migrate global git_path_* to take a repository argument
cache: convert get_graft_file to handle arbitrary repositories
commit: convert read_graft_file to handle arbitrary repositories
commit: convert register_commit_graft to handle arbitrary repositories
commit: convert commit_graft_pos() to handle arbitrary repositories
shallow: add repository argument to is_repository_shallow
shallow: add repository argument to check_shallow_file_for_update
shallow: add repository argument to register_shallow
shallow: add repository argument to set_alternate_shallow_file
commit: add repository argument to lookup_commit_graft
commit: add repository argument to prepare_commit_graft
commit: add repository argument to read_graft_file
commit: add repository argument to register_commit_graft
commit: add repository argument to commit_graft_pos
object: move grafts to object parser
object-store: move object access functions to object-store.h
Add a struct repository argument to the functions in commit-graph.h that
read the commit graph. (This commit does not affect functions that write
commit graphs.)
Because the commit graph functions can now read the commit graph of any
repository, the global variable core_commit_graph has been removed.
Instead, the config option core.commitGraph is now read on the first
time in a repository that a commit is attempted to be parsed using its
commit graph.
This commit includes a test that exercises the functionality on an
arbitrary repository that is not the_repository.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This conversion was done without the #define trick used in the earlier
series refactoring to have better repository access, because this function
is easy to review, as all lines are converted and it has only one caller.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of set_alternate_shallow_file
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new "checkout-encoding" attribute can ask Git to convert the
contents to the specified encoding when checking out to the working
tree (and the other way around when checking in).
* ls/checkout-encoding:
convert: add round trip check based on 'core.checkRoundtripEncoding'
convert: add tracing for 'working-tree-encoding' attribute
convert: check for detectable errors in UTF encodings
convert: add 'working-tree-encoding' attribute
utf8: add function to detect a missing UTF-16/32 BOM
utf8: add function to detect prohibited UTF-16/32 BOM
utf8: teach same_encoding() alternative UTF encoding names
strbuf: add a case insensitive starts_with()
strbuf: add xstrdup_toupper()
strbuf: remove unnecessary NUL assignment in xstrdup_tolower()
The effort to pass the repository in-core structure throughout the
API continues. This round deals with the code that implements the
refs/replace/ mechanism.
* sb/object-store-replace:
replace-object: allow lookup_replace_object to handle arbitrary repositories
replace-object: allow do_lookup_replace_object to handle arbitrary repositories
replace-object: allow prepare_replace_object to handle arbitrary repositories
refs: allow for_each_replace_ref to handle arbitrary repositories
refs: store the main ref store inside the repository struct
replace-object: add repository argument to lookup_replace_object
replace-object: add repository argument to do_lookup_replace_object
replace-object: add repository argument to prepare_replace_object
refs: add repository argument to for_each_replace_ref
refs: add repository argument to get_main_ref_store
replace-object: check_replace_refs is safe in multi repo environment
replace-object: eliminate replace objects prepared flag
object-store: move lookup_replace_object to replace-object.h
replace-object: move replace_map to object store
replace_object: use oidmap
Precompute and store information necessary for ancestry traversal
in a separate file to optimize graph walking.
* ds/commit-graph:
commit-graph: implement "--append" option
commit-graph: build graph from starting commits
commit-graph: read only from specific pack-indexes
commit: integrate commit graph with commit parsing
commit-graph: close under reachability
commit-graph: add core.commitGraph setting
commit-graph: implement git commit-graph read
commit-graph: implement git-commit-graph write
commit-graph: implement write_commit_graph()
commit-graph: create git-commit-graph builtin
graph: add commit graph design document
commit-graph: add format document
csum-file: refactor finalize_hashfile() method
csum-file: rename hashclose() to finalize_hashfile()
Some codepaths, including the refs API, get and keep relative
paths, that go out of sync when the process does chdir(2). The
chdir-notify API is introduced to let these codepaths adjust these
cached paths to the new current directory.
* jk/relative-directory-fix:
refs: use chdir_notify to update cached relative paths
set_work_tree: use chdir_notify
add chdir-notify API
trace.c: export trace_setup_key
set_git_dir: die when setenv() fails
UTF supports lossless conversion round tripping and conversions between
UTF and other encodings are mostly round trip safe as Unicode aims to be
a superset of all other character encodings. However, certain encodings
(e.g. SHIFT-JIS) are known to have round trip issues [1].
Add 'core.checkRoundtripEncoding', which contains a comma separated
list of encodings, to define for what encodings Git should check the
conversion round trip if they are used in the 'working-tree-encoding'
attribute.
Set SHIFT-JIS as default value for 'core.checkRoundtripEncoding'.
[1] https://support.microsoft.com/en-us/help/170559/prb-conversion-problem-between-shift-jis-and-unicode
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In e1111cef23 (inline lookup_replace_object() calls, 2011-05-15) a shortcut
for checking the object replacement was added by setting check_replace_refs
to 0 once the replacements were evaluated to not exist. This works fine in
with the assumption of only one repository in existence.
The assumption won't hold true any more when we work on multiple instances
of a repository structs (e.g. one struct per submodule), as the first
repository to be inspected may have no replacements and would set the
global variable. Other repositories would then completely omit their
evaluation of replacements.
This reverts back the meaning of the flag `check_replace_refs` of
"Do we need to check with the lookup table?" to "Do we need to read
the replacement definition?", adding the bypassing logic to
lookup_replace_object after the replacement definition was read.
As with the original patch, delay the renaming of the global variable
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactoring the internal global data structure to make it possible
to open multiple repositories, work with and then close them.
Rerolled by Duy on top of a separate preliminary clean-up topic.
The resulting structure of the topics looked very sensible.
* sb/object-store: (27 commits)
sha1_file: allow sha1_loose_object_info to handle arbitrary repositories
sha1_file: allow map_sha1_file to handle arbitrary repositories
sha1_file: allow map_sha1_file_1 to handle arbitrary repositories
sha1_file: allow open_sha1_file to handle arbitrary repositories
sha1_file: allow stat_sha1_file to handle arbitrary repositories
sha1_file: allow sha1_file_name to handle arbitrary repositories
sha1_file: add repository argument to sha1_loose_object_info
sha1_file: add repository argument to map_sha1_file
sha1_file: add repository argument to map_sha1_file_1
sha1_file: add repository argument to open_sha1_file
sha1_file: add repository argument to stat_sha1_file
sha1_file: add repository argument to sha1_file_name
sha1_file: allow prepare_alt_odb to handle arbitrary repositories
sha1_file: allow link_alt_odb_entries to handle arbitrary repositories
sha1_file: add repository argument to prepare_alt_odb
sha1_file: add repository argument to link_alt_odb_entries
sha1_file: add repository argument to read_info_alternates
sha1_file: add repository argument to link_alt_odb_entry
sha1_file: add raw_object_store argument to alt_odb_usable
pack: move approximate object count to object store
...
The commit graph feature is controlled by the new core.commitGraph config
setting. This defaults to 0, so the feature is opt-in.
The intention of core.commitGraph is that a user can always stop checking
for or parsing commit graph files if core.commitGraph=0.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up for the "repository" abstraction.
* nd/remove-ignore-env-field:
repository.h: add comment and clarify repo_set_gitdir
repository: delete ignore_env member
sha1_file.c: move delayed getenv(altdb) back to setup_git_env()
repository.c: delete dead functions
repository.c: move env-related setup code back to environment.c
repository: initialize the_repository in main()
When we change to the top of the working tree, we manually
re-adjust $GIT_DIR and call set_git_dir() again, in order to
update any relative git-dir we'd compute earlier.
Instead of the work-tree code having to know to call the
git-dir code, let's use the new chdir_notify interface.
There are two spots that need updating, with a few
subtleties in each:
1. the set_git_dir() code needs to chdir_notify_register()
so it can be told when to update its path.
Technically we could push this down into repo_set_gitdir(),
so that even repository structs besides the_repository
could benefit from this. But that opens up a lot of
complications:
- we'd still need to touch set_git_dir(), because it
does some other setup (like setting $GIT_DIR in the
environment)
- submodules using other repository structs get
cleaned up, which means we'd need to remove them
from the chdir_notify list
- it's unlikely to fix any bugs, since we shouldn't
generally chdir() in the middle of working on a
submodule
2. setup_work_tree now needs to call chdir_notify(), and
can lose its manual set_git_dir() call.
Note that at first glance it looks like this undoes the
absolute-to-relative optimization added by 044bbbcb63
(Make git_dir a path relative to work_tree in
setup_work_tree(), 2008-06-19). But for the most part
that optimization was just _undoing_ the
relative-to-absolute conversion which the function was
doing earlier (and which is now gone).
It is true that if you already have an absolute git_dir
that the setup_work_tree() function will no longer make
it relative as a side effect. But:
- we generally do have relative git-dir's due to the
way the discovery code works
- if we really care about making git-dir's relative
when possible, then we should be relativizing them
earlier (e.g., when we see an absolute $GIT_DIR we
could turn it relative, whether we are going to
chdir into a worktree or not). That would cover all
cases, including ones that 044bbbcb63 did not.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The set_git_dir() function returns an error if setenv()
fails, but there are zero callers who pay attention to this
return value. If this ever were to happen, it could cause
confusing results, as sub-processes would see a potentially
stale GIT_DIR (e.g., if it is relative and we chdir()-ed to
the root of the working tree).
We _could_ try to fix each caller, but there's really
nothing useful to do after this failure except die. Let's
just lump setenv() failure into the same category as malloc
failure: things that should never happen and cause us to
abort catastrophically.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The raw object store field will contain any objects needed for access
to objects in a given repository.
This patch introduces the raw object store and populates it with the
`objectdir`, which used to be part of the repository struct.
As the struct gains members, we'll also populate the function to clear
the memory for these members.
In a later step, we'll introduce a struct object_parser, that will
complement the object parsing in a repository struct: The raw object
parser is the layer that will provide access to raw object content,
while the higher level object parser code will parse raw objects and
keeps track of parenthood and other object relationships using 'struct
object'. For now only add the lower level to the repository struct.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
getenv() is supposed to work on the main repository only. This delayed
getenv() code in sha1_file.c makes it more difficult to convert
sha1_file.c to a generic object store that could be used by both
submodule and main repositories.
Move the getenv() back in setup_git_env() where other env vars are
also fetched.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It does not make sense that generic repository code contains handling
of environment variables, which are specific for the main repository
only. Refactor repo_set_gitdir() function to take $GIT_DIR and
optionally _all_ other customizable paths. These optional paths can be
NULL and will be calculated according to the default directory layout.
Note that some dead functions are left behind to reduce diff
noise. They will be deleted in the next patch.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename C++ keyword in order to bring the codebase closer to being able
to be compiled with a C++ compiler.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename C++ keyword in order to bring the codebase closer to being able
to be compiled with a C++ compiler.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The machinery to clone & fetch, which in turn involves packing and
unpacking objects, have been told how to omit certain objects using
the filtering mechanism introduced by the jh/object-filtering
topic, and also mark the resulting pack as a promisor pack to
tolerate missing objects, taking advantage of the mechanism
introduced by the jh/fsck-promisors topic.
* jh/partial-clone:
t5616: test bulk prefetch after partial fetch
fetch: inherit filter-spec from partial clone
t5616: end-to-end tests for partial clone
fetch-pack: restore save_commit_buffer after use
unpack-trees: batch fetching of missing blobs
clone: partial clone
partial-clone: define partial clone settings in config
fetch: support filters
fetch: refactor calculation of remote list
fetch-pack: test support excluding large blobs
fetch-pack: add --no-filter
fetch-pack, index-pack, transport: partial clone
upload-pack: add object filtering for partial clone
In preparation for implementing narrow/partial clone, the machinery
for checking object connectivity used by gc and fsck has been
taught that a missing object is OK when it is referenced by a
packfile specially marked as coming from trusted repository that
promises to make them available on-demand and lazily.
* jh/fsck-promisors:
gc: do not repack promisor packfiles
rev-list: support termination at promisor objects
sha1_file: support lazily fetching missing objects
introduce fetch-object: fetch one promisor object
index-pack: refactor writing of .keep files
fsck: support promisor objects as CLI argument
fsck: support referenced promisor objects
fsck: support refs pointing to promisor objects
fsck: introduce partialclone extension
extension.partialclone: introduce partial clone extension
When calling convert_to_git(), the checksafe parameter defined what
should happen if the EOL conversion (CRLF --> LF --> CRLF) does not
roundtrip cleanly. In addition, it also defined if line endings should
be renormalized (CRLF --> LF) or kept as they are.
checksafe was an safe_crlf enum with these values:
SAFE_CRLF_FALSE: do nothing in case of EOL roundtrip errors
SAFE_CRLF_FAIL: die in case of EOL roundtrip errors
SAFE_CRLF_WARN: print a warning in case of EOL roundtrip errors
SAFE_CRLF_RENORMALIZE: change CRLF to LF
SAFE_CRLF_KEEP_CRLF: keep all line endings as they are
In some cases the integer value 0 was passed as checksafe parameter
instead of the correct enum value SAFE_CRLF_FALSE. That was no problem
because SAFE_CRLF_FALSE is defined as 0.
FALSE/FAIL/WARN are different from RENORMALIZE and KEEP_CRLF. Therefore,
an enum is not ideal. Let's use a integer bit pattern instead and rename
the parameter to conv_flags to make it more generically usable. This
allows us to extend the bit pattern in a subsequent commit.
Reported-By: Randall S. Becker <rsbecker@nexbridge.com>
Helped-By: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ancient part of codebase still shows dots after an abbreviated
object name just to show that it is not a full object name, but
these ellipses are confusing to people who newly discovered Git
who are used to seeing abbreviated object names and find them
confusing with the range syntax.
* ar/unconfuse-three-dots:
t2020: test variations that matter
t4013: test new output from diff --abbrev --raw
diff: diff_aligned_abbrev: remove ellipsis after abbreviated SHA-1 value
t4013: prepare for upcoming "diff --raw --abbrev" output format change
checkout: describe_detached_head: remove ellipsis after committish
print_sha1_ellipsis: introduce helper
Documentation: user-manual: limit usage of ellipsis
Documentation: revisions: fix typo: "three dot" ---> "three-dot" (in line with "two-dot").
Create get and set routines for "partial clone" config settings.
These will be used in a future commit by clone and fetch to
remember the promisor remote and the default filter-spec.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce new repository extension option:
`extensions.partialclone`
See the update to Documentation/technical/repository-version.txt
in this patch for more information.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>