Both of these existed to allow us to reuse all the merge-related tests
in the testsuite while easily flipping between the 'recursive' and the
'ort' backends. Now that we have removed merge-recursive and remapped
'recursive' to mean 'ort', we don't need this scaffolding anymore.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the default value for TEST_PASSES_SANITIZE_LEAK is `true` there
is no longer a need to have that variable declared in all of our tests.
Drop it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix typos in documentation, comments, etc.
Via codespell.
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When "git sparse-checkout disable" turns a sparse checkout into a
regular checkout, the index is fully expanded. This totally
expected behaviour however had an "oops, we are expanding the
index" advice message, which has been corrected.
* ds/sparse-checkout-expansion-advice:
sparse-checkout: disable advice in 'disable'
When hitting a sparse directory in `wt_status_collect_changes_initial()`
we use a `struct strbuf` to assemble the directory's name. We never free
that buffer though, causing a memory leak.
Fix the leak by releasing the buffer. While at it, move the buffer
outside of the loop and reset it to save on some wasteful allocations.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git sparse-checkout disable' with the sparse index
enabled, Git is expected to expand the index into a full index. However,
it currently outputs the advice message saying that that is unexpected
and likely due to an issue with the working directory.
Disable this advice message when in this code path. Establish a pattern
for doing a similar removal in the future.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change affects how 'git cat-file' works with the index when
specifying an object with the ":<path>" syntax (which will give file
contents from the index).
'git cat-file' expands a sparse index to a full index any time contents
are requested from the index by specifying an object with the ":<path>"
syntax. This is true even when the requested file is part of the sparse
index, and results in much slower 'git cat-file' operations when working
within the sparse index.
Mark 'git cat-file' as not needing a full index, so that you only pay
the cost of expanding the sparse index to a full index when you request
a file outside of the sparse index.
Add tests to ensure both that:
- 'git cat-file' returns the correct file contents whether or not the
file is in the sparse index
- 'git cat-file' expands to the full index any time you request
something outside of the sparse index
Signed-off-by: Kevin Lyles <klyles+github@epic.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'run_on_sparse' and 'run_on_all' functions do not work correctly for
commands accepting standard input, because they run the same command
multiple times and the first instance consumes it. This also indirectly
affects 'test_all_match' and 'test_sparse_match'.
To allow these functions to work with commands accepting standard input,
first slurp standard input to a temporary file, and then run the command
with its standard input redirected from the temporary file. This ensures
that each command sees the same contents from its standard input.
Note that this does not impact commands that do not read from standard
input; they continue to ignore it. Additionally, existing uses of the
run_on_* functions do not need to do anything differently, as the
standard input of the test environment is already connected to
/dev/null.
We do not explicitly clean up the input files because they are cleaned
up with the rest of the test repositories and their contents may be
useful for figuring out which command failed when a test case fails.
Signed-off-by: Kevin Lyles <klyles@epic.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse index allows focusing the index data structure on the files
present in the sparse-checkout, leaving only tree entries for
directories not within the sparse-checkout. Each builtin needs a
repository setting to indicate that it has been tested with the sparse
index before Git will allow the index to be loaded into memory in its
sparse form. This is a safety precaution.
There are still some builtins that haven't been integrated due to the
complexity of the integration and the lack of significant use. However,
'git diff-index' was neglected only because of initial data showing low
usage. The diff machinery was already integrated and there is no more
work to be done there but add some tests to be sure 'git diff-index'
behaves as expected.
For this purpose, we can follow the testing pattern used in 51ba65b5c3
(diff: enable and test the sparse index, 2021-12-06). One difference
here is that we only verify that the sparse index case agrees with the
full index case, but do not generate the expected output. The 'git diff'
tests use the '--name-status' option to ease the creation of the
expected output, but that's not an option for 'diff-index'. Since the
underlying diff machinery is the same, a simple comparison is sufficient
to give some coverage.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Typically, forcing a sparse index to expand to a full index means that
Git could not determine the status of a file outside of the
sparse-checkout and needed to expand sparse trees into the full list of
sparse blobs. This operation can be very slow when the sparse-checkout
is much smaller than the full tree at HEAD.
When users are in this state, there is usually a modified or untracked
file outside of the sparse-checkout mentioned by the output of 'git
status'. There are a number of reasons why this is insufficient:
1. Users may not have a full understanding of which files are inside or
outside of their sparse-checkout. This is more common in monorepos
that manage the sparse-checkout using custom tools that map build
dependencies into sparse-checkout definitions.
2. In some cases, an empty directory could exist outside the
sparse-checkout and these empty directories are not reported by 'git
status' and friends.
3. If the user has '.gitignore' or 'exclude' files, then 'git status'
will squelch the warnings and not demonstrate any problems.
In order to help users who are in this state, add a new advice message
to indicate that a sparse index is expanded to a full index. This
message should be written at most once per process, so add a static
global 'give_advice_on_expansion' to sparse-index.c. Further, there is a
case in 'git sparse-checkout set' that uses the sparse index as an
in-memory data structure (even when writing a full index) so we need to
disable the message in that kind of case.
The t1092-sparse-checkout-compatibility.sh test script compares the
behavior of several Git commands across full and sparse repositories,
including sparse repositories with and without a sparse index. We need
to disable the advice in the sparse-index repo to avoid differences in
stderr. By leaving the advice on in the sparse-checkout repo (without
the sparse index), we can test the behavior of disabling the advice in
convert_to_sparse(). (Indeed, these tests are how that necessity was
discovered.) Add a test that reenables the advice and demonstrates that
the message is output.
The advice message is defined outside of expand_index() to avoid super-
wide lines. It is also defined as a macro to avoid compile issues with
-Werror=format-security.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Another step to deprecate test_i18ngrep.
* jc/test-i18ngrep:
tests: teach callers of test_i18ngrep to use test_grep
test framework: further deprecate test_i18ngrep
They are equivalents and the former still exists, so as long as the
only change this commit makes are to rewrite test_i18ngrep to
test_grep, there won't be any new bug, even if there still are
callers of test_i18ngrep remaining in the tree, or when merged to
other topics that add new uses of test_i18ngrep.
This patch was produced more or less with
git grep -l -e 'test_i18ngrep ' 't/t[0-9][0-9][0-9][0-9]-*.sh' |
xargs perl -p -i -e 's/test_i18ngrep /test_grep /'
and a good way to sanity check the result yourself is to run the
above in a checkout of c4603c1c (test framework: further deprecate
test_i18ngrep, 2023-10-31) and compare the resulting working tree
contents with the result of applying this patch to the same commit.
You'll see that test_i18ngrep in a few t/lib-*.sh files corrected,
in addition to the manual reproduction.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Set the requires-full-index to false for "check-attr".
Add a test to ensure that the index is not expanded whether the files
are outside or inside the sparse-checkout cone when the sparse index is
enabled.
The `p2000` tests demonstrate a ~63% execution time reduction for
'git check-attr' using a sparse index.
Test before after
-----------------------------------------------------------------------
2000.106: git check-attr -a f2/f4/a (full-v3) 0.05 0.05 +0.0%
2000.107: git check-attr -a f2/f4/a (full-v4) 0.05 0.05 +0.0%
2000.108: git check-attr -a f2/f4/a (sparse-v3) 0.04 0.02 -50.0%
2000.109: git check-attr -a f2/f4/a (sparse-v4) 0.04 0.01 -75.0%
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before this patch, git check-attr was unable to read the attributes from
a .gitattributes file within a sparse directory. The original comment
was operating under the assumption that users are only interested in
files or directories inside the cones. Therefore, in the original code,
in the case of a cone-mode sparse-checkout, we didn't load the
.gitattributes file.
However, this behavior can lead to missing attributes for files inside
sparse directories, causing inconsistencies in file handling.
To resolve this, revise 'git check-attr' to allow attribute reading for
files in sparse directories from the corresponding .gitattributes files:
1.Utilize path_in_cone_mode_sparse_checkout() and index_name_pos_sparse
to check if a path falls within a sparse directory.
2.If path is inside a sparse directory, employ the value of
index_name_pos_sparse() to find the sparse directory containing path and
path relative to sparse directory. Proceed to read attributes from the
tree OID of the sparse directory using read_attr_from_blob().
3.If path is not inside a sparse directory,ensure that attributes are
fetched from the index blob with read_blob_data_from_index().
Change the test 'check-attr with pathspec outside sparse definition' to
'test_expect_success' to reflect that the attributes inside a sparse
directory can now be read. Ensure that the sparse index case works
correctly for git check-attr to illustrate the successful handling of
attributes within sparse directories.
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests for `git check-attr`, make sure attribute file does get read
from index when path is either inside or outside of sparse-checkout
definition.
Add a test named 'diff --check with pathspec outside sparse definition'.
It starts by disabling the trailing whitespace and space-before-tab
checks using the core. whitespace configuration option. Then, it
specifically re-enables the trailing whitespace check for a file located
in a sparse directory by adding a whitespace=trailing-space rule to the
.gitattributes file within that directory. Next, create and populate the
folder1 directory, and then add the .gitattributes file to the index.
Edit the contents of folder1/a, add it to the index, and proceed to
"re-sparsify" 'folder1/' with 'git sparse-checkout reapply'. Finally,
use 'git diff --check --cached' to compare the 'index vs. HEAD',
ensuring the correct application of the attribute rules even when the
file's path is outside the sparse-checkout definition.
Mark the two tests 'check-attr with pathspec outside sparse definition'
and 'diff --check with pathspec outside sparse definition' as
'test_expect_failure' to reflect an existing issue where the attributes
inside a sparse directory are ignored. Ensure that the 'check-attr'
command fails to read the required attributes to demonstrate this
expected failure.
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index is read in 'worktree.c' at two points:
1.The 'validate_no_submodules' function, which checks if there are any
submodules present in the worktree.
2.The 'check_clean_worktree' function, which verifies if a worktree is
'clean', i.e., there are no untracked or modified but uncommitted files.
This is done by running the 'git status' command, and an error message
is thrown if the worktree is not clean. Given that 'git status' is
already sparse-aware, the function is also sparse-aware.
Hence we can just set the requires-full-index to false for
"git worktree".
Add tests that verify that 'git worktree' behaves correctly when the
sparse index is enabled and test to ensure the index is not expanded.
The `p2000` tests demonstrate a ~20% execution time reduction for
'git worktree' using a sparse index:
(Note:the p2000 test results didn't reflect the huge speedup because of
the index reading time is minuscule comparing to the filesystem
operations.)
Test before after
-----------------------------------------------------------------------
2000.102: git worktree add....(full-v3) 3.15 2.82 -10.5%
2000.103: git worktree add....(full-v4) 3.14 2.84 -9.6%
2000.104: git worktree add....(sparse-v3) 2.59 2.14 -16.4%
2000.105: git worktree add....(sparse-v4) 2.10 1.57 -25.2%
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Acked-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index is read in 'cmd_diff_tree' at two points:
1. The first index read was added in fd66bcc31f (diff-tree: read the
index so attribute checks work in bare repositories, 2017-12-06) to deal
with reading '.gitattributes' content. 77efbb366a (attr: be careful
about sparse directories, 2021-09-08) established that, in a sparse
index, we do _not_ try to load a '.gitattributes' file from within a
sparse directory.
2. The second index access point is involved in rename detection,
specifically when reading from stdin.This was initially added in
f0c6b2a2fd ([PATCH] Optimize diff-tree -[CM]--stdin, 2005-05-27), where
'setup' was set to 'DIFF_SETUP_USE_SIZE_CACHE |DIFF_SETUP_USE_CACHE'.
That assignment was later modified to drop the'DIFF_SETUP_USE_CACHE' in
ff7fe37b05 (diff.c: move read_index() code back to the caller,
2018-08-13).However, 'DIFF_SETUP_USE_SIZE_CACHE' seems to be unused as
of 6e0b8ed6d3 (diff.c: do not use a separate "size cache"., 2007-05-07)
and nothing about 'detect_rename' otherwise indicates index usage.
Hence we can just set the requires-full-index to false for "diff-tree".
Add tests that verify that 'git diff-tree' behaves correctly when the
sparse index is enabled and test to ensure the index is not expanded.
The `p2000` tests demonstrate a ~98% execution time reduction for
'git diff-tree' using a sparse index:
Test before after
-----------------------------------------------------------------------
2000.94: git diff-tree HEAD (full-v3) 0.05 0.04 -20.0%
2000.95: git diff-tree HEAD (full-v4) 0.06 0.05 -16.7%
2000.96: git diff-tree HEAD (sparse-v3) 0.59 0.01 -98.3%
2000.97: git diff-tree HEAD (sparse-v4) 0.61 0.01 -98.4%
2000.98: git diff-tree HEAD -- f2/f4/a (full-v3) 0.05 0.05 +0.0%
2000.99: git diff-tree HEAD -- f2/f4/a (full-v4) 0.05 0.04 -20.0%
2000.100: git diff-tree HEAD -- f2/f4/a (sparse-v3) 0.58 0.01 -98.3%
2000.101: git diff-tree HEAD -- f2/f4/a (sparse-v4) 0.55 0.01 -98.2%
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove full index requirement for `git diff-files`. Refactor the
ensure_expanded and ensure_not_expanded functions by introducing a
common helper function, ensure_index_state. Add test to ensure the index
is no expanded in `git diff-files`.
The `p2000` tests demonstrate a ~96% execution time reduction for 'git
diff-files' and a ~97% execution time reduction for 'git diff-files'
for a file using a sparse index:
Test before after
-----------------------------------------------------------------------
2000.94: git diff-files (full-v3) 0.09 0.08 -11.1%
2000.95: git diff-files (full-v4) 0.09 0.09 +0.0%
2000.96: git diff-files (sparse-v3) 0.52 0.02 -96.2%
2000.97: git diff-files (sparse-v4) 0.51 0.02 -96.1%
2000.98: git diff-files -- f2/f4/a (full-v3) 0.06 0.07 +16.7%
2000.99: git diff-files -- f2/f4/a (full-v4) 0.08 0.08 +0.0%
2000.100: git diff-files -- f2/f4/a (sparse-v3) 0.46 0.01 -97.8%
2000.101: git diff-files -- f2/f4/a (sparse-v4) 0.51 0.02 -96.1%
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before integrating the 'git diff-files' builtin with the sparse index
feature, add tests to t1092-sparse-checkout-compatibility.sh to ensure
it currently works with sparse-checkout and will still work with sparse
index after that integration.
When adding tests against a sparse-checkout definition, we test two
modes: all changes are within the sparse-checkout cone and some changes
are outside the sparse-checkout cone.
In order to have staged changes outside of the sparse-checkout cone,
make a directory called 'folder1' and copy `a` into 'folder1/a'.
'folder1/a' is identical to `a` in the base commit. These make
'folder1/a' in the index, while leaving it outside of the
sparse-checkout definition. Change content inside 'folder1/a' in order
to test 'folder1/a' being present on-disk with modifications.
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'on all' in the title of the test 'write-tree on all' was unclear;
remove it.
* Add a baseline 'test_all_match git write-tree' before making any
changes to the index, providing a reference point for the 'write-tree'
prior to any modifications.
* Add a comparison of the output of 'git status --porcelain=v2' to test
the working tree after 'write-tree' exits.
* Ensure SKIP_WORKTREE files weren't materialized on disk by using
"test_path_is_missing".
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'git write-tree' to allow using the sparse-index in memory
without expanding to a full one.
The recursive algorithm for update_one() was already updated in 2de37c5
(cache-tree: integrate with sparse directory entries, 2021-03-03) to
handle sparse directory entries in the index. Hence we can just set the
requires-full-index to false for "write-tree".
The `p2000` tests demonstrate a ~96% execution time reduction for 'git
write-tree' using a sparse index:
Test before after
-----------------------------------------------------------------
2000.78: git write-tree (full-v3) 0.34 0.33 -2.9%
2000.79: git write-tree (full-v4) 0.32 0.30 -6.3%
2000.80: git write-tree (sparse-v3) 0.47 0.02 -95.8%
2000.81: git write-tree (sparse-v4) 0.45 0.02 -95.6%
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git describe compares the index with the working tree when (and only
when) it is run with the "--dirty" flag. This is done by the
run_diff_index() function. The function has been made aware of the
sparse-index in the series that led to 8d2c3732 (Merge branch
'ld/sparse-diff-blame', 2021-12-21). Hence we can just set the
requires-full-index to false for "describe".
Performance metrics
Test HEAD~1 HEAD
-------------------------------------------------------------------------------------------------
2000.2: git describe --dirty (full-v3) 0.08(0.09+0.01) 0.08(0.06+0.03) +0.0%
2000.3: git describe --dirty (full-v4) 0.09(0.07+0.03) 0.08(0.05+0.04) -11.1%
2000.4: git describe --dirty (sparse-v3) 0.88(0.82+0.06) 0.02(0.01+0.05) -97.7%
2000.5: git describe --dirty (sparse-v4) 0.68(0.60+0.08) 0.02(0.02+0.04) -97.1%
2000.6: echo >>new && git describe --dirty (full-v3) 0.08(0.04+0.05) 0.08(0.05+0.04) +0.0%
2000.7: echo >>new && git describe --dirty (full-v4) 0.08(0.07+0.03) 0.08(0.05+0.04) +0.0%
2000.8: echo >>new && git describe --dirty (sparse-v3) 0.75(0.69+0.07) 0.02(0.03+0.03) -97.3%
2000.9: echo >>new && git describe --dirty (sparse-v4) 0.81(0.73+0.09) 0.02(0.01+0.05) -97.5%
Signed-off-by: Raghul Nanth A <nanth.raghul@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git grep" learned to expand the sparse-index more lazily and on
demand in a sparse checkout.
* sy/sparse-grep:
builtin/grep.c: integrate with sparse index
Explicitly cloning over the "file://" protocol in t1092 in preparation
for merging a security release which will change the default value of
this configuration to be "user".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Turn on sparse index and remove ensure_full_index().
Before this patch, `git-grep` utilizes the ensure_full_index() method to
expand the index and search all the entries. Because this method
requires walking all the trees and constructing the index, it is the
slow part within the whole command.
To achieve better performance, this patch uses grep_tree() to search the
sparse directory entries and get rid of the ensure_full_index() method.
Why grep_tree() is a better choice over ensure_full_index()?
1) grep_tree() is as correct as ensure_full_index(). grep_tree() looks
into every sparse-directory entry (represented by a tree) recursively
when looping over the index, and the result of doing so matches the
result of expanding the index.
2) grep_tree() utilizes pathspecs to limit the scope of searching.
ensure_full_index() always expands the index, which means it will
always walk all the trees and blobs in the repo without caring if
the user only wants a subset of the content, i.e. using a pathspec.
On the other hand, grep_tree() will only search the contents that
match the pathspec, and thus possibly walking fewer trees.
3) grep_tree() does not construct and copy back a new index, while
ensure_full_index() does. This also saves some time.
----------------
Performance test
- Summary:
p2000 tests demonstrate a ~71% execution time reduction for
`git grep --cached bogus -- "f2/f1/f1/*"` using tree-walking logic.
However, notice that this result varies depending on the pathspec
given. See below "Command used for testing" for more details.
Test HEAD~ HEAD
-------------------------------------------------------
2000.78: git grep ... (full-v3) 0.35 0.39 (≈)
2000.79: git grep ... (full-v4) 0.36 0.30 (≈)
2000.80: git grep ... (sparse-v3) 0.88 0.23 (-73.8%)
2000.81: git grep ... (sparse-v4) 0.83 0.26 (-68.6%)
- Command used for testing:
git grep --cached bogus -- "f2/f1/f1/*"
The reason for specifying a pathspec is that, if we don't specify a
pathspec, then grep_tree() will walk all the trees and blobs to find the
pattern, and the time consumed doing so is not too different from using
the original ensure_full_index() method, which also spends most of the
time walking trees. However, when a pathspec is specified, this latest
logic will only walk the area of trees enclosed by the pathspec, and the
time consumed is reasonably a lot less.
Generally speaking, because the performance gain is acheived by walking
less trees, which are specified by the pathspec, the HEAD time v.s.
HEAD~ time in sparse-v[3|4], should be proportional to
"pathspec enclosed area" v.s. "all area", respectively. Namely, the
wider the <pathspec> is encompassing, the less the performance
difference between HEAD~ and HEAD, and vice versa.
That is, if we don't specify a pathspec, the performance difference [1]
is indistinguishable: both methods walk all the trees and take generally
same amount of time (even with the index construction time included for
ensure_full_index()).
[1] Performance test result without pathspec (hence walking all trees):
Command used:
git grep --cached bogus
Test HEAD~ HEAD
---------------------------------------------------
2000.78: git grep ... (full-v3) 6.17 5.19 (≈)
2000.79: git grep ... (full-v4) 6.19 5.46 (≈)
2000.80: git grep ... (sparse-v3) 6.57 6.44 (≈)
2000.81: git grep ... (sparse-v4) 6.65 6.28 (≈)
--------------------------
NEEDSWORK about submodules
There are a few NEEDSWORKs that belong to improvements beyond this
topic. See the NEEDSWORK in builtin/grep.c::grep_submodule() for
more context. The other two NEEDSWORKs in t1092 are also relative.
Suggested-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Segfault fix-up to an earlier fix to the topic to teach "git reset"
and "git checkout" work better in a sparse checkout.
* vd/sparse-reset-checkout-fixes:
unpack-trees: fix sparse directory recursion check
Ensure 'is_sparse_directory_entry()' receives a valid 'name_entry *' if one
exists in the list of tree(s) being unpacked in 'unpack_callback()'.
Currently, 'is_sparse_directory_entry()' is called with the first
'name_entry' in the 'names' list of entries on 'unpack_callback()'. However,
this entry may be empty even when other elements of 'names' are not (such as
when switching from an orphan branch back to a "normal" branch). As a
result, 'is_sparse_directory_entry()' could incorrectly indicate that a
sparse directory is *not* actually sparse because the name of the index
entry does not match the (empty) 'name_entry' path.
Fix the issue by using the existing 'name_entry *p' value in
'unpack_callback()', which points to the first non-empty entry in 'names'.
Because 'p' is 'const', also update 'is_sparse_directory_entry()'s
'name_entry *' argument to be 'const'.
Finally, add a regression test case.
Reported-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Failures within `for` and `while` loops can go unnoticed if not detected
and signaled manually since the loop itself does not abort when a
contained command fails, nor will a failure necessarily be detected when
the loop finishes since the loop returns the exit code of the last
command it ran on the final iteration, which may not be the command
which failed. Therefore, detect and signal failures manually within
loops using the idiom `|| return 1` (or `|| exit 1` within subshells).
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test wants to verify that `git blame` errors out when asked to
blame a file _not_ in the sparse checkout. However, the very first file
it asks to blame _is_ present in the checkout, thus `test_must_fail git
blame $file` gives an unexpected result (the "blame" succeeds). This
problem went unnoticed because the test invokes `test_must_fail git
blame $file` in loop but forgets to break out of the loop early upon
failure, thus the failure gets swallowed.
Fix the test by having it not ask to blame a file present in the sparse
checkout, and instead only blame files not present, as intended. While
at it, also add the missing `|| return 1` which allowed this bug to go
unnoticed.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable the sparse index within the `git-rm` command.
The `p2000` tests demonstrate a ~92% execution time reduction for
'git rm' using a sparse index.
Test HEAD~1 HEAD
--------------------------------------------------------------------------
2000.74: git rm ... (full-v3) 0.41(0.37+0.05) 0.43(0.36+0.07) +4.9%
2000.75: git rm ... (full-v4) 0.38(0.34+0.05) 0.39(0.35+0.05) +2.6%
2000.76: git rm ... (sparse-v3) 0.57(0.56+0.01) 0.05(0.05+0.00) -91.2%
2000.77: git rm ... (sparse-v4) 0.57(0.55+0.02) 0.03(0.03+0.00) -94.7%
----
Also, normalize a behavioral difference of `git-rm` under sparse-index.
See related discussion [1].
`git-rm` a sparse-directory entry within a sparse-index enabled repo
behaves differently from a sparse directory within a sparse-checkout
enabled repo.
For example, in a sparse-index repo, where 'folder1' is a
sparse-directory entry, `git rm -r --sparse folder1` provides this:
rm 'folder1/'
Whereas in a sparse-checkout repo *without* sparse-index, doing so
provides this:
rm 'folder1/0/0/0'
rm 'folder1/0/1'
rm 'folder1/a'
Because `git rm` a sparse-directory entry does not need to expand the
index, therefore we should accept the current behavior, which is faster
than "expand the sparse-directory entry to match the sparse-checkout
situation".
Modify a previous test so such difference is not considered as an error.
[1] https://github.com/ffyuanda/git/pull/6#discussion_r934861398
Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the `ensure_full_index()` method so `git-rm` does not always
expand the index when the expansion is unnecessary, i.e. when
<pathspec> does not have any possibilities to match anything outside
of sparse-checkout definition.
Expand the index when the <pathspec> needs an expanded index, i.e. the
<pathspec> contains wildcard that may need a full-index or the
<pathspec> is simply outside of sparse-checkout definition.
Notice that the test 'rm pathspec expands index when necessary' in
t1092 *is* testing this code change behavior, though it will be marked
as 'test_expect_success' only in the next patch, where we officially
mark `command_requires_full_index = 0`, so the index does not expand
unless we tell it to do so.
Notice that because we also want `ensure_full_index` to record the
stdout and stderr from Git command, a corresponding modification
is also included in this patch. The reason we want the "sparse-index-out"
and "sparse-index-err", is that we need to make sure there is no error
from Git command itself, so we can rely on the `test_region` result
and determine if the index is expanded or not.
Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests for `git-rm`, make sure it behaves as expected when
<pathspec> is both inside or outside of sparse-checkout definition.
Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If 'unpack_single_entry()' is unpacking a new directory tree (that is, one
not already present in the index) into a sparse index, unpack the tree as a
sparse directory rather than traversing its contents and unpacking each file
individually. This helps keep the sparse index as collapsed as possible in
cases such as 'git reset --hard' restoring a outside-of-cone directory
removed with 'git rm -r --sparse'.
Without this patch, 'unpack_single_entry()' will only unpack a directory
into the index as a sparse directory (rather than traversing into it and
unpacking its files one-by-one) if an entry with the same name already
exists in the index. This patch allows sparse directory unpacking without a
matching index entry when the following conditions are met:
1. the directory's path is outside the sparse cone, and
2. there are no children of the directory in the index
If a directory meets these requirements (as determined by
'is_new_sparse_dir()'), 'unpack_single_entry()' unpacks the sparse directory
index entry and propagates the decision back up to 'unpack_callback()' to
prevent unnecessary tree traversal into the unpacked directory.
Reported-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the 'recursive' diff flag to the local changes reporting done by 'git
checkout' in 'show_local_changes()'. Without the flag enabled, unexpanded
sparse directories will not be recursed into to report the diff of each
file's contents, resulting in the reported local changes including
"modified" sparse directories.
The same issue was found and fixed for 'git status' in 2c521b0e49 (status:
fix nested sparse directory diff in sparse index, 2022-03-01)
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, moving a <source> directory which is not on-disk due
to its existence outside of sparse-checkout cone, "giv mv" command
errors out with "bad source".
Add a helper check_dir_in_index() function to see if a directory
name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
such directories.
Change the checking logic, so that such <source> directory makes
"giv mv" command warns with "advise_on_updating_sparse_paths()"
instead of "bad source"; also user now can supply a "--sparse" flag so
this operation can be carried out successfully.
Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add test for "mv: add check_dir_in_index() and solve general dir check
issue" in this series.
This change tests the following:
1. mv <source> as a directory on the sparse index boundary (where it
would be a sparse directory in a sparse index).
2. mv <source> as a directory which is deeper than the boundary (so
the sparse index would expand in the cache_name_pos() method).
These tests can be written now for correctness, but later the first case
can be updated to use the 'ensure_not_expanded' helper in t1092.
Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"sparse-checkout" learns to work well with the sparse-index
feature.
* ds/sparse-sparse-checkout:
sparse-checkout: integrate with sparse index
p2000: add test for 'git sparse-checkout [add|set]'
sparse-index: complete partial expansion
sparse-index: partially expand directories
sparse-checkout: --no-sparse-index needs a full index
cache-tree: implement cache_tree_find_path()
sparse-index: introduce partially-sparse indexes
sparse-index: create expand_index()
t1092: stress test 'git sparse-checkout set'
t1092: refactor 'sparse-index contents' test
When modifying the sparse-checkout definition, the sparse-checkout
builtin calls update_sparsity() to modify the SKIP_WORKTREE bits of all
cache entries in the index. Before, we needed the index to be fully
expanded in order to ensure we had the full list of files necessary that
match the new patterns.
Insert a call to reset_sparse_directories() that expands sparse
directories that are within the new pattern list, but only far enough
that every necessary file path now exists as a cache entry. The
remaining logic within update_sparsity() will modify the SKIP_WORKTREE
bits appropriately.
This allows us to disable command_requires_full_index within the
sparse-checkout builtin. Add tests that demonstrate that we are not
expanding to a full index unnecessarily.
We can see the improved performance in the p2000 test script:
Test HEAD~1 HEAD
------------------------------------------------------------------------
2000.24: git ... (sparse-v3) 2.14(1.55+0.58) 1.57(1.03+0.53) -26.6%
2000.25: git ... (sparse-v4) 2.20(1.62+0.57) 1.58(0.98+0.59) -28.2%
These reductions of 26-28% are small compared to most examples, but the
time is dominated by writing a new copy of the base repository to the
worktree and then deleting it again. The fact that the previous index
expansion was such a large portion of the time is telling how important
it is to complete this sparse index integration.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'sparse-index contents' test checks that the sparse index has the
correct set of sparse directories in the index after modifying the cone
mode patterns using 'git sparse-checkout set'. Add to the coverage here
by adding more complicated scenarios that were not previously tested.
In order to check paths that do not exist at HEAD, we need to modify the
test_sparse_checkout_set helper slightly:
1. Add the --skip-checks argument to the 'set' command to avoid failures
when passing paths that do not exist at HEAD.
2. When looking for the non-existence of sparse directories for the
paths in $CONE_DIRS, allow the rev-list command to fail because the
path does not exist at HEAD.
This allows us to add some interesting test cases.
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before expanding this test with more involved cases, first extract the
repeated logic into a new test_sparse_checkout_set helper. This helper
checks that 'git sparse-checkout set ...' succeeds and then verifies
that certain directories have sparse directory entries in the sparse
index. It also verifies that the in-cone directories are _not_ sparse
directory entries in the sparse index.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>