do_match_pathspec() started life as match_pathspec_depth_1() and for
correctness was only supposed to be called from match_pathspec_depth().
match_pathspec_depth() was later renamed to match_pathspec(), so the
invariant we expect today is that do_match_pathspec() has no direct
callers outside of match_pathspec().
Unfortunately, this intention was lost with the renames of the two
functions, and additional calls to do_match_pathspec() were added in
commits 75a6315f74 ("ls-files: add pathspec matching for submodules",
2016-10-07) and 89a1f4aaf7 ("dir: if our pathspec might match files
under a dir, recurse into it", 2019-09-17). Of course,
do_match_pathspec() had an important advantge over match_pathspec() --
match_pathspec() would hardcode flags to one of two values, and these
new callers needed to pass some other value for flags. Also, although
calling do_match_pathspec() directly was incorrect, there likely wasn't
any difference in the observable end output, because the bug just meant
that fill_diretory() would recurse into unneeded directories. Since
subsequent does-this-path-match checks on individual paths under the
directory would cause those extra paths to be filtered out, the only
difference from using the wrong function was unnecessary computation.
The second of those bad calls to do_match_pathspec() was involved -- via
either direct movement or via copying+editing -- into a number of later
refactors. See commits 777b420347 ("dir: synchronize
treat_leading_path() and read_directory_recursive()", 2019-12-19),
8d92fb2927 ("dir: replace exponential algorithm with a linear one",
2020-04-01), and 95c11ecc73 ("Fix error-prone fill_directory() API; make
it only return matches", 2020-04-01). The last of those introduced the
usage of do_match_pathspec() on an individual file, and thus resulted in
individual paths being returned that shouldn't be.
The problem with calling do_match_pathspec() instead of match_pathspec()
is that any negated patterns such as ':!unwanted_path` will be ignored.
Add a new match_pathspec_with_flags() function to fulfill the needs of
specifying special flags while still correctly checking negated
patterns, add a big comment above do_match_pathspec() to prevent others
from misusing it, and correct current callers of do_match_pathspec() to
instead use either match_pathspec() or match_pathspec_with_flags().
One final note is that DO_MATCH_LEADING_PATHSPEC needs special
consideration when working with DO_MATCH_EXCLUDE. The point of
DO_MATCH_LEADING_PATHSPEC is that if we have a pathspec like
*/Makefile
and we are checking a directory path like
src/module/component
that we want to consider it a match so that we recurse into the
directory because it _might_ have a file named Makefile somewhere below.
However, when we are using an exclusion pattern, i.e. we have a pathspec
like
:(exclude)*/Makefile
we do NOT want to say that a directory path like
src/module/component
is a (negative) match. While there *might* be a file named 'Makefile'
somewhere below that directory, there could also be other files and we
cannot pre-emptively rule all the files under that directory out; we
need to recurse and then check individual files. Adjust the
DO_MATCH_LEADING_PATHSPEC logic to only get activated for positive
pathspecs.
Reported-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git - fast, scalable, distributed revision control system
Git is a fast, scalable, distributed revision control system with an
unusually rich command set that provides both high-level operations
and full access to internals.
Git is an Open Source project covered by the GNU General Public
License version 2 (some parts of it are under different licenses,
compatible with the GPLv2). It was originally written by Linus
Torvalds with help of a group of hackers around the net.
Please read the file INSTALL for installation instructions.
Many Git online resources are accessible from https://git-scm.com/
including full documentation and Git related tools.
See Documentation/gittutorial.txt to get started, then see
Documentation/giteveryday.txt for a useful minimum set of commands, and
Documentation/git-<commandname>.txt for documentation of each command.
If git has been correctly installed, then the tutorial can also be
read with man gittutorial or git help tutorial, and the
documentation of each command with man git-<commandname> or git help <commandname>.
CVS users may also want to read Documentation/gitcvs-migration.txt
(man gitcvs-migration or git help cvs-migration if git is
installed).
Issues which are security relevant should be disclosed privately to
the Git Security mailing list git-security@googlegroups.com.
The maintainer frequently sends the "What's cooking" reports that
list the current status of various development topics to the mailing
list. The discussion following them give a good reference for
project status, development direction and remaining tasks.
The name "git" was given by Linus Torvalds when he wrote the very
first version. He described the tool as "the stupid content tracker"
and the name as (depending on your mood):
random three-letter combination that is pronounceable, and not
actually used by any common UNIX command. The fact that it is a
mispronunciation of "get" may or may not be relevant.
stupid. contemptible and despicable. simple. Take your pick from the
dictionary of slang.
"global information tracker": you're in a good mood, and it actually
works for you. Angels sing, and a light suddenly fills the room.
"goddamn idiotic truckload of sh*t": when it breaks