We meticulously pass the `exclude` flag to the `treat_directory()`
function so that we can indicate that files in it are excluded rather
than untracked when recursing.
But we did not yet treat submodules the same way.
Because of that, `git status --ignored --untracked` with a submodule
`submodule` in a gitignored `tracked/` would show the submodule in the
"Untracked files" section, e.g.
On branch master
Untracked files:
(use "git add <file>..." to include in what will be committed)
tracked/submodule/
Ignored files:
(use "git add -f <file>..." to include in what will be committed)
tracked/submodule/initial.t
Instead, we would want it to show the submodule in the "Ignored files"
section:
On branch master
Ignored files:
(use "git add -f <file>..." to include in what will be committed)
tracked/submodule/
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the declaration and definition of resolve_gitlink_ref to use
struct object_id and apply the following semantic patch:
@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3.hash)
+ resolve_gitlink_ref(E1, E2, &E3)
@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3->hash)
+ resolve_gitlink_ref(E1, E2, E3)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the performance of the directory listing logic when it wants to list
non-empty ignored directories. In order to show non-empty ignored directories,
the existing logic will recursively iterate through all contents of an ignored
directory. This change introduces the optimization to stop iterating through
the contents once it finds the first file. This can have a significant
improvement in 'git status --ignored' performance in repositories with a large
number of files in ignored directories.
For an example of the performance difference on an example repository with
196,000 files in 400 ignored directories:
| Command | Time (s) |
| -------------------------- | --------- |
| git status | 1.2 |
| git status --ignored (old) | 3.9 |
| git status --ignored (new) | 1.4 |
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The untracked cache is stored on disk by concatenating its memory
structures without any padding. Consequently some of the structs are
not aligned at a particular boundary when the whole extension is read
back in one go. That's only OK on platforms without strict alignment
requirements, or for byte-aligned data like strings or hash values.
Decode struct ondisk_untracked_cache carefully from the extension
blob by using explicit pointer arithmetic with offsets, avoiding
alignment issues. Use char pointers for passing stat_data objects to
stat_data_from_disk(), and use memcpy(3) in that function to get the
contents into a properly aligned struct, then perform the byte-order
adjustment in place there.
Found with Clang's UBSan.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the unused wildopts placeholder struct from being passed to all
wildmatch() invocations, or rather remove all the boilerplate NULL
parameters.
This parameter was added back in commit 9b3497cab9 ("wildmatch: rename
constants and update prototype", 2013-01-01) as a placeholder for
future use. Over 4 years later nothing has made use of it, let's just
remove it. It can be added in the future if we find some reason to
start using such a parameter.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar functions exist in apply.c and builtin/show-branch.c for
counting the number of slashes in a string. Also in the later
patches, we introduce a third caller for the same. Hence, we unify
it now by cleaning the existing functions and declaring a common
function count_slashes in dir.h and implementing it in dir.c to
remove this code duplication.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the is_missing_file_error() helper introduced in the previous
step, update all hits from
$ git grep -e ENOENT --and -e ENOTDIR
There are codepaths that only check ENOENT, and it is possible that
some of them should be checking both. Updating them is kept out of
this step deliberately, as we do not want to change behaviour in this
step.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In many places, Git warns about an inaccessible file after a fopen()
failed. To discern these cases from other cases where we want to warn
about inaccessible files, introduce a new helper specifically to test
whether fopen() failed because the current user lacks the permission to
open file in question.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to use cmp_name() and check_contains() (which both compare
`struct dir_entry`s, the former in terms of the sort order, the latter
in terms of whether one lexically contains another) outside of dir.c,
so we have to (1) change their linkage and (2) rename them as
appropriate for the global namespace. The second is achieved by
renaming cmp_name() to cmp_dir_entry() and check_contains() to
check_dir_entry_contains().
Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we taught read_directory_recursive() to recurse into untracked
directories in search of ignored files given DIR_SHOW_IGNORED_TOO, that
had the side effect of teaching it to collect the untracked contents of
untracked directories. It doesn't always make sense to return these,
though (we do need them for `clean -d`), so we introduce a flag
(DIR_KEEP_UNTRACKED_CONTENTS) to control whether or not read_directory()
strips dir->entries of the untracked contents of untracked dirs.
We also introduce check_contains() to check if one dir_entry corresponds
to a path which contains the path corresponding to another dir_entry.
This also fixes known breakages in t7061, since status --ignored now
searches untracked directories for ignored files.
Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We consider directories containing only untracked and ignored files to
be themselves untracked, which in the usual case means we don't have to
search these directories. This is problematic when we want to collect
ignored files with DIR_SHOW_IGNORED_TOO, though, so we teach
read_directory_recursive() to recurse into untracked directories to find
the ignored files they contain when DIR_SHOW_IGNORED_TOO is set. This
has the side effect of also collecting all untracked files in untracked
directories as well.
Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to make it clearer where the_index is being referenced, stop
using the index compatibility macros in dir.c. This is to make it
easier to identify the functions which need to be convert to taking in a
'struct index_state' as a parameter.
The end goal would be to eliminate the need to reference global index
state in dir.c.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch we'll use connect_work_tree_and_git_dir when the
directory for the gitlink file doesn't exist yet. This patch makes
connect_work_tree_and_git_dir safe to use for both cases of
either the git dir or the working dir missing.
To do so, we need to call safe_create_leading_directories[_const]
on both directories. However this has to happen before we construct
the absolute paths as real_pathdup assumes the directories to
be there already.
So for both the config file in the git dir as well as the .git link
file we need to
a) construct the name
b) call SCLD
c) get the absolute path
d) once a-c is done for both we can consume the absolute path
to compute the relative path to each other and store those
relative paths.
The implementation provided here puts a) and b) for both cases first,
and then performs c and d after.
One of the two users of 'connect_work_tree_and_git_dir' already checked
for the directory being there, so we can loose that check as
connect_work_tree_and_git_dir handles this functionality now.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pathspec mechanism is extended via the new
":(attr:eol=input)pattern/to/match" syntax to filter paths so that it
requires paths to not just match the given pattern but also have the
specified attrs attached for them to be chosen.
Based on a patch by Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 4ac9006f83 (real_path: have callers use real_pathdup and
strbuf_realpath, 2016-12-12), we changed the xstrdup(real_path())
pattern to use real_pathdup() directly.
The problem with this change is that real_path() calls
strbuf_realpath() with die_on_error = 1 while real_pathdup() calls
it with die_on_error = 0. Meaning that in cases where real_path()
causes Git to die() with an error message, real_pathdup() is silent
and returns NULL instead.
The callers, however, are ill-prepared for that change, as they expect
the return value to be non-NULL (and otherwise the function died
with an appropriate error message).
Fix this by extending real_pathdup()'s signature to accept the
die_on_error flag and simply pass it through to strbuf_realpath(),
and then adjust all callers after a careful audit whether they would
handle NULLs well.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass the match member of the first pathspec item directly to
read_directory() instead of using common_prefix() to duplicate it first,
thus avoiding memory duplication, strlen(3) and free(3).
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert 'fill_directory()' to use the pathspec struct interface from
using the '_raw' entry in the pathspec struct.
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach simplify_away() and exclude_matches_pathspec() to handle struct
pathspec directly, eliminating the need for the struct path_simplify.
Also renamed the len parameter to pathlen in exclude_matches_pathspec()
to match the parameter names used in simplify_away().
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a submodule has its git dir inside the working dir, the submodule
support for checkout that we plan to add in a later patch will fail.
Add functionality to migrate the git directory to be absorbed
into the superprojects git directory.
The newly added code in this patch is structured such that other areas of
Git can also make use of it. The code in the submodule--helper is a mere
wrapper and option parser for the function
`absorb_git_dir_into_superproject`, that takes care of embedding the
submodules git directory into the superprojects git dir. That function
makes use of the more abstract function for this use case
`relocate_gitdir`, which can be used by e.g. the worktree code eventually
to move around a git directory.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
That function was primarily used by submodule code, but the function
itself is not inherently about submodules. In the next patch we'll
introduce relocate_git_dir, which can be used by worktrees as well,
so find a neutral middle ground in dir.h.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The low-level attribute and gitignore code will try to look
in $GIT_DIR/info for any repo-level configuration files,
even if we have not actually determined that we are in a
repository (e.g., running "git grep --no-index"). In such a
case they end up looking for ".git/info/attributes", etc.
This is generally harmless, as such a file is unlikely to
exist outside of a repository, but it's still conceptually
the wrong thing to do.
Let's detect this situation explicitly and skip reading the
file (i.e., the same behavior we'd get if we were in a
repository and the file did not exist).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pathspecs can be a bit tricky when trying to apply them to submodules.
The main challenge is that the pathspecs will be with respect to the
superproject and not with respect to paths in the submodule. The
approach this patch takes is to pass in the identical pathspec from the
superproject to the submodule in addition to the submodule-prefix, which
is the path from the root of the superproject to the submodule, and then
we can compare an entry in the submodule prepended with the
submodule-prefix to the pathspec in order to determine if there is a
match.
This patch also permits the pathspec logic to perform a prefix match against
submodules since a pathspec could refer to a file inside of a submodule.
Due to limitations in the wildmatch logic, a prefix match is only done
literally. If any wildcard character is encountered we'll simply punt
and produce a false positive match. More accurate matching will be done
once inside the submodule. This is due to the superproject not knowing
what files could exist in the submodule.
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the semantic patch contrib/coccinelle/qsort.cocci to the code
base, replacing calls of qsort(3) with QSORT. The resulting code is
shorter and supports empty arrays with NULL pointers.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:
@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use strbuf_addbuf() where possible; it's shorter and more efficient.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions compare two paths that are taken from file system.
Depending on the running file system, paths may need to be compared
case-sensitively or not, and maybe even something else in future. The
current names do not convey that well.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was largely replaced by fnmatch_icase_mem() and its last use was in
84b8b5d (remove match_pathspec() in favor of match_pathspec_depth() -
2013-07-14).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The values defined by the macro EXC_FLAG_* (1, 4, 8, 16) are stored
in fields of the structs "pattern" and "exclude", some functions
arguments and a local variable. None of these uses its most
significant bit in any special way and there is no good reason to
use a signed integer for them.
And while we're at it, document "flags" of "exclude" to explicitly
state the values it's supposed to take on.
Signed-off-by: Saurav Sachidanand <sauravsachidanand@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We perform unchecked additions when computing the size of a
"struct ondisk_untracked_cache". This is unlikely to have an
integer overflow in practice, but we'd like to avoid this
dangerous pattern to make further audits easier.
Note that there's one subtlety here, though. We protect
ourselves against a NULL exclude_per_dir entry in our
source, and avoid calling strlen() on it, keeping "len" at
0. But later, we unconditionally memcpy "len + 1" bytes to
get the trailing NUL byte. If we did have a NULL
exclude_per_dir, we would read from bogus memory.
As it turns out, though, we always create this field
pointing to a string literal, so there's no bug. We can just
get rid of the pointless extra conditional.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using FLEX_ARRAY macros reduces the amount of manual
computation size we have to do. It also ensures we don't
overflow size_t, and it makes sure we write the same number
of bytes that we allocated.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We frequently allocate strings as xmalloc(len + 1), where
the extra 1 is for the NUL terminator. This can be done more
simply with xmallocz, which also checks for integer
overflow.
There's no case where switching xmalloc(n+1) to xmallocz(n)
is wrong; the result is the same length, and malloc made no
guarantees about what was in the buffer anyway. But in some
cases, we can stop manually placing NUL at the end of the
allocated buffer. But that's only safe if it's clear that
the contents will always fill the buffer.
In each case where this patch does so, I manually examined
the control flow, and I tried to err on the side of caution.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>