kernel/git - git - PowerEL Git System

Commit Graph

Author	SHA1	Message	Date
Jeff King	5aa02f9868	tree-walk: harden make_traverse_path() length computations The make_traverse_path() function isn't very careful about checking its output buffer boundaries. In fact, it doesn't even _know_ the size of the buffer it's writing to, and just assumes that the caller used traverse_path_len() correctly. And even then we assume that our traverse_info.pathlen components are all correct, and just blindly write into the buffer. Let's improve this situation a bit: - have the caller pass in their allocated buffer length, which we'll check against our own computations - check for integer underflow as we do our backwards-insertion of pathnames into the buffer - check that we do not run out items in our list to traverse before we've filled the expected number of bytes None of these should be triggerable in practice (especially since our switch to size_t everywhere in a previous commit), but it doesn't hurt to check our assumptions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
Jeff King	c43ab06259	tree-walk: add a strbuf wrapper for make_traverse_path() All but one of the callers of make_traverse_path() allocate a new heap buffer to store the path. Let's give them an easy way to write to a strbuf, which saves them from computing the length themselves (which is especially tricky when they want to add to the path). It will also make it easier for us to change the make_traverse_path() interface in a future patch to improve its bounds-checking. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
Jeff King	37806080d7	tree-walk: use size_t consistently We store and manipulate the cumulative traverse_info.pathlen as an "int", which can overflow when we are fed ridiculously long pathnames (e.g., ones at the edge of 2GB or 4GB, even if the individual tree entry names are smaller than that). The results can be confusing, though after some prodding I was not able to use this integer overflow to cause an under-allocated buffer. Let's consistently use size_t to generate and store these, and make sure our addition doesn't overflow. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
Jeff King	9055384710	tree-walk: drop oid from traverse_info As the previous commit shows, the presence of an oid in each level of the traverse_info is confusing and ultimately not necessary. Let's drop it to make it clear that it will not always be set (as well as convince us that it's unused, and let the compiler catch any merges with other branches that do add new uses). Since the oid is part of name_entry, we'll actually stop embedding a name_entry entirely, and instead just separately hold the pathname, its length, and the mode. This makes the resulting code slightly more verbose as we have to pass those elements around individually. But it also makes it more clear what each code path is going to use (and in most of the paths, we really only care about the pathname itself). A few of these conversions are noisier than they need to be, as they also take the opportunity to rename "len" to "namelen" for clarity (especially where we also have "pathlen" or "ce_len" alongside). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
Jeff King	947208b725	setup_traverse_info(): stop copying oid We assume that if setup_traverse_info() is passed a non-empty "base" string, that string is pointing into a tree object and we can read the object oid by skipping past the trailing NUL. As it turns out, this is not true for either of the two calls, and we may end up reading garbage bytes: 1. In git-merge-tree, our base string is either empty (in which case we'd never run this code), or it comes from our traverse_path() helper. The latter overallocates a buffer by the_hash_algo->rawsz bytes, but then fills it with only make_traverse_path(), leaving those extra bytes uninitialized (but part of a legitimate heap buffer). 2. In unpack_trees(), we pass o->prefix, which is some arbitrary string from the caller. In "git read-tree --prefix=foo", for instance, it will point to the command-line parameter, and we'll read 20 bytes past the end of the string. Interestingly, tools like ASan do not detect (2) because the process argv is part of a big pre-allocated buffer. So we're reading trash, but it's trash that's probably part of the next argument, or the environment. You can convince it to fail by putting something like this at the beginning of common-main.c's main() function: { int i; for (i = 0; i < argc; i++) argv[i] = xstrdup_or_null(argv[i]); } That puts the arguments into their own heap buffers, so running: make SANITIZE=address test will find problems when "read-tree --prefix" is used (e.g., in t3030). Doubly interesting, even with the hackery above, this does not fail prior to `ea82b2a085` (tree-walk: store object_id in a separate member, 2019-01-15). That commit switched setup_traverse_info() to actually copying the hash, rather than simply pointing to it. That pointer was always pointing to garbage memory, but that commit started actually dereferencing the bytes, which is what triggers ASan. That also implies that nobody actually cares about reading these oid bytes anyway (or at least no path covered by our tests). And manual inspection of the code backs that up (I'll follow this patch with some cleanups that show definitively this is the case, but they're quite invasive, so it's worth doing this fix on its own). So let's drop the bogus hashcpy(), along with the confusing oversizing in merge-tree. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
Nguyễn Thái Ngọc Duy	0dd1f0c3a6	tree-walk.c: remove the_repo from get_tree_entry_follow_symlinks() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
Nguyễn Thái Ngọc Duy	50ddb089ff	tree-walk.c: remove the_repo from get_tree_entry() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
Nguyễn Thái Ngọc Duy	5e57580733	tree-walk.c: remove the_repo from fill_tree_descriptor() While at there, clean up the_repo usage in builtin/merge-tree.c a tiny bit. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
Nguyễn Thái Ngọc Duy	d3b4705ab8	sha1-file.c: remove the_repo from read_object_with_reference() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
Elijah Newren	5ec1e72823	Use 'unsigned short' for mode, like diff_filespec does struct diff_filespec defines mode to be an 'unsigned short'. Several other places in the API which we'd like to interact with using a diff_filespec used a plain unsigned (or unsigned int). This caused problems when taking addresses, so switch to unsigned short. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
David Turner	d1dd94b308	Do not print 'dangling' for cat-file in case of ambiguity The return values -1 and -2 from get_oid could mean two different things, depending on whether they were from an enum returned by get_tree_entry_follow_symlinks, or from a different code path. This caused 'dangling' to be printed from a git cat-file in the case of an ambiguous (-2) result. Unify the results of get_oid* and get_tree_entry_follow_symlinks to be one common type, with unambiguous values. Signed-off-by: David Turner <novalis@novalis.org> Reported-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
brian m. carlson	ea82b2a085	tree-walk: store object_id in a separate member When parsing a tree, we read the object ID directly out of the tree buffer. This is normally fine, but such an object ID cannot be used with oidcpy, which copies GIT_MAX_RAWSZ bytes, because if we are using SHA-1, there may not be that many bytes to copy. Instead, store the object ID in a separate struct member. Since we can no longer efficiently compute the path length, store that information as well in struct name_entry. Ensure we only copy the object ID into the new buffer if the path length is nonzero, as some callers will pass us an empty path with no object ID following it, and we will not want to read past the end of the buffer. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
brian m. carlson	0a3faa45b1	tree-walk: copy object ID before use In a future commit, the pointer returned by tree_entry_extract will point into the struct tree_desc, causing its lifetime to be bound to that of the struct tree_desc itself. To ensure this code path keeps working, copy the object_id into a local variable so that it lives long enough. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
Nguyễn Thái Ngọc Duy	5a0b97b34c	tree-walk: support :(attr) matching This lets us use :(attr) with "git grep <tree-ish>" or "git log". :(attr) requires another round of checking before we can declare that a path is matched. This is done after path matching since we have lots of optimization to take a shortcut when things don't match. Note that if :(attr) is present, we can't return all_entries_interesting / all_entries_not_interesting anymore because we can't be certain about that. Not until match_pathspec_attrs() can tell us "yes all these paths satisfy :(attr)". Second note. Even though we walk a specific tree, we use attributes from _worktree_ (or falling back to the index), not from .gitattributes files on that tree. This by itself is not necessarily wrong, but the user just have to be aware of this. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
Nguyễn Thái Ngọc Duy	67022e0214	tree-walk.c: make tree_entry_interesting() take an index In order to support :(attr) when matching pathspec on a tree, tree_entry_interesting() needs to take an index (because git_check_attr() needs it). This is the preparation step for it. This also makes it clearer what index we fall back to when looking up attributes during an unpack-trees operation: the source index. This also fixes revs->pruning.repo initialization that should have been done in `2abf350385` (revision.c: remove implicit dependency on the_index - 2018-09-21). Without it, skip_uninteresting() will dereference a NULL pointer through this call chain get_revision(revs) get_revision_internal get_revision_1 try_to_simplify_commit rev_compare_tree diff_tree_oid(..., &revs->pruning) ll_diff_tree_oid diff_tree_paths ll_diff_tree skip_uninteresting Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
Nguyễn Thái Ngọc Duy	b7845cebc0	tree-walk.c: fix overoptimistic inclusion in :(exclude) matching tree_entry_interesting() is used for matching pathspec on a tree. The interesting thing about this function is that, because the tree entries are known to be sorted, this function can return more than just "yes, matched" and "no, not matched". It can also say "yes, this entry is matched and so is the remaining entries in the tree". This is where I made a mistake when matching exclude pathspec. For exclude pathspec, we do matching twice, one with positive patterns and one with negative ones, then a rule table is applied to determine the final "include or exclude" result. Note that "matched" does not necessarily mean include. For negative patterns, "matched" means exclude. This particular rule is too eager to include everything. Rule 8 says that "if all entries are positively matched" and the current entry is not negatively matched (i.e. not excluded), then all entries are positively matched and therefore included. But this is not true. If the _current_ entry is not negatively matched, it does not mean the next one will not be and we cannot conclude right away that all remaining entries are positively matched and can be included. Rules 8 and 18 are now updated to be less eager. We conclude that the current entry is positively matched and included. But we say nothing about remaining entries. tree_entry_interesting() will be called again for those entries where we will determine entries individually. Reported-by: Christophe Bliard <christophe.bliard@trux.info> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	6 years ago
brian m. carlson	83e4b7571c	tree-walk: replace hard-coded constants with the_hash_algo Remove the hard-coded 20-based values and replace them with uses of the_hash_algo. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	7 years ago
Stefan Beller	cbd53a2193	object-store: move object access functions to object-store.h This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	7 years ago
brian m. carlson	3b683bcf85	tree-walk: convert get_tree_entry_follow_symlinks to object_id Since the only caller of this function already uses struct object_id, update get_tree_entry_follow_symlinks to use it in parameters and internally. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	7 years ago
brian m. carlson	e84bc23cb6	tree-walk: avoid hard-coded 20 constant Use the_hash_algo to look up the length of our current hash instead of hard-coding the value 20. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	7 years ago
brian m. carlson	b4f5aca40e	sha1_file: convert read_sha1_file to struct object_id Convert read_sha1_file to take a pointer to struct object_id and rename it read_object_file. Do the same for read_sha1_file_extended. Convert one use in grep.c to use the new function without any other code change, since the pointer being passed is a void pointer that is already initialized with a pointer to struct object_id. Update the declaration and definitions of the modified functions, and apply the following semantic patch to convert the remaining callers: @@ expression E1, E2, E3; @@ - read_sha1_file(E1.hash, E2, E3) + read_object_file(&E1, E2, E3) @@ expression E1, E2, E3; @@ - read_sha1_file(E1->hash, E2, E3) + read_object_file(E1, E2, E3) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1.hash, E2, E3, E4) + read_object_file_extended(&E1, E2, E3, E4) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1->hash, E2, E3, E4) + read_object_file_extended(E1, E2, E3, E4) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	7 years ago
brian m. carlson	02f0547eaa	sha1_file: convert read_object_with_reference to object_id Convert read_object_with_reference to take pointers to struct object_id. Update the internals of the function accordingly. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	7 years ago
brian m. carlson	916bc35b29	tree-walk: convert tree entry functions to object_id Convert get_tree_entry and find_tree_entry to take pointers to struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	7 years ago
brian m. carlson	0d4a8b5b6c	tree-walk: convert get_tree_entry_follow_symlinks internals to object_id Convert the internals of this function to use struct object_id. This is one of the last remaining callers of read_sha1_file_extended that has not been converted yet. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	7 years ago
Brandon Williams	eef3df5a93	pathspec: only match across submodule boundaries when requested Commit `74ed43711f` (grep: enable recurse-submodules to work on <tree> objects, 2016-12-16) taught 'tree_entry_interesting()' to be able to match across submodule boundaries in the presence of wildcards. This is done by performing literal matching up to the first wildcard and then punting to the submodule itself to perform more accurate pattern matching. Instead of introducing a new flag to request this behavior, commit `74ed43711f` overloaded the already existing 'recursive' flag in 'struct pathspec' to request this behavior. This leads to a bug where whenever any other caller has the 'recursive' flag set as well as a pathspec with wildcards that all submodules will be indicated as matches. One simple example of this is: git init repo cd repo git init submodule git -C submodule commit -m initial --allow-empty touch "[bracket]" git add "[bracket]" git commit -m bracket git add submodule git commit -m submodule git rev-list HEAD -- "[bracket]" Fix this by introducing the new flag 'recurse_submodules' in 'struct pathspec' and using this flag to determine if matches should be allowed to cross submodule boundaries. This fixes https://github.com/git-for-windows/git/issues/1371. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	7 years ago
Ramsay Jones	071bcaab64	ALLOC_GROW: avoid -Wsign-compare warnings Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	7 years ago
René Scharfe	5c377d3d59	tree-walk: convert fill_tree_descriptor() to object_id All callers of fill_tree_descriptor() have been converted to object_id already, so convert that function as well. As a nice side-effect we get rid of NULL checks in tree-diff.c, as fill_tree_descriptor() already does them for us. Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	8 years ago
Jeff King	d72cae12b9	get_sha1_with_context: always initialize oc->symlink_path The get_sha1_with_context() function zeroes out the oc->symlink_path strbuf, but doesn't use strbuf_init() to set up the usual invariants (like pointing to the slopbuf). We don't actually write to the oc->symlink_path strbuf unless we call get_tree_entry_follow_symlinks(), and that function does initialize it. However, readers may still look at the zero'd strbuf. In practice this isn't a triggerable bug. The only caller that looks at it only does so when the mode we found is 0. This doesn't happen for non-tree-entries (where we return S_IFINVALID). A broken tree entry could have a mode of 0, but canon_mode() quietly rewrites that into S_IFGITLINK. So the "0" mode should only come up when we did indeed find a symlink. This is mostly just an accident of how the code happens to work, though. Let's future-proof ourselves to make sure the strbuf is properly initialized for all calls (it's only a few struct member assignments, not a heap allocation). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	8 years ago
Junio C Hamano	5840eb9d14	doc: replace more gmane links Signed-off-by: Junio C Hamano <gitster@pobox.com>	8 years ago
Brandon Williams	74ed43711f	grep: enable recurse-submodules to work on <tree> objects Teach grep to recursively search in submodules when provided with a <tree> object. This allows grep to search a submodule based on the state of the submodule that is present in a commit of the super project. When grep is provided with a <tree> object, the name of the object is prefixed to all output. In order to provide uniformity of output between the parent and child processes the option `--parent-basename` has been added so that the child can preface all of it's output with the name of the parent's object instead of the name of the commit SHA1 of the submodule. This changes output from the command `git grep -e. -l --recurse-submodules HEAD` from: HEAD:file <commit sha1 of submodule>:sub/file to: HEAD:file HEAD:sub/file Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	8 years ago
David Turner	8354fa3d4c	fsck: handle bad trees like other errors Instead of dying when fsck hits a malformed tree object, log the error like any other and continue. Now fsck can tell the user which tree is bad, too. Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	8 years ago
Jeff King	2edffef233	tree-walk: be more specific about corrupt tree errors When the tree-walker runs into an error, it just calls die(), and the message is always "corrupt tree file". However, we are actually covering several cases here; let's give the user a hint about what happened. Let's also avoid using the word "corrupt", which makes it seem like the data bit-rotted on disk. Our sha1 check would already have found that. These errors are ones of data that is malformed in the first place. Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	8 years ago
brian m. carlson	ce6663a9da	tree-walk: convert tree_entry_extract() to use struct object_id Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	9 years ago
brian m. carlson	7d924c9139	struct name_entry: use struct object_id instead of unsigned char sha1[20] Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	9 years ago
David Turner	d9c2bd560e	do_compare_entry: use already-computed path In traverse_trees, we generate the complete traverse path for a traverse_info. Later, in do_compare_entry, we used to go do a bunch of work to compare the traverse_info to a cache_entry's name without computing that path. But since we already have that path, we don't need to do all that work. Instead, we can just put the generated path into the traverse_info, and do the comparison more directly. We copy the path because prune_traversal might mutate `base`. This doesn't happen in any codepaths where do_compare_entry is called, but it's better to be safe. This makes git checkout much faster -- about 25% on Twitter's monorepo. Deeper directory trees are likely to benefit more than shallower ones. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	9 years ago
David Turner	275721c267	tree-walk: learn get_tree_entry_follow_symlinks Add a new function, get_tree_entry_follow_symlinks, to tree-walk.[ch]. The function is not yet used. It will be used to implement git cat-file --batch --follow-symlinks. The function locates an object by path, following symlinks in the repository. If the symlinks lead outside the repository, the function reports this to the caller. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	10 years ago
Jeremiah Mahler	ccdd4a0f3c	cleanup duplicate name_compare() functions We often represent our strings as a counted string, i.e. a pair of the pointer to the beginning of the string and its length, and the string may not be NUL terminated to that length. To compare a pair of such counted strings, unpack-trees.c and read-cache.c implement their own name_compare() functions identically. In addition, the cache_name_compare() function in read-cache.c is nearly identical. The only difference is when one string is the prefix of the other string, in which case name_compare() returns -1/+1 to show which one is longer, and cache_name_compare() returns the difference of the lengths to show the same information. Unify these three functions by using the implementation from cache_name_compare(). This does not make any difference to the existing and future callers, as they must be paying attention only to the sign of the returned value (and not the magnitude) because the original implementations of these two functions return values returned by memcmp(3) when the one string is not a prefix of the other string, and the only thing memcmp(3) guarantees its callers is the sign of the returned value, not the magnitude. Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	11 years ago
Kirill Smelkov	7146e66f08	tree-walk: finally switch over tree descriptors to contain a pre-parsed entry This continues `4651ece8` (Switch over tree descriptors to contain a pre-parsed entry) and moves the only rest computational part mode = canon_mode(mode) from tree_entry_extract() to tree entry decode phase - to decode_tree_entry(). The reason to do it, is that canon_mode() is at least 2 conditional jumps for regular files, and that could be noticeable should canon_mode() be invoked several times. That does not matter for current Git codebase, where typical tree traversal is while (t->size) { sha1 = tree_entry_extract(t, &path, &mode); ... update_tree_entry(t); } i.e. we do t -> sha1,path.mode "extraction" only once per entry. In such cases, it does not matter performance-wise, where that mode canonicalization is done - either once in tree_entry_extract(), or once in decode_tree_entry() called by update_tree_entry() - it is approximately the same. But for future code, which could need to work with several tree_desc's in parallel, it could be handy to operate on tree_desc descriptors, and do "extracts" only when needed, or at all, access only relevant part of it through structure fields directly. And for such situations, having canon_mode() be done once in decode phase is better - we won't need to pay the performance price of 2 extra conditional jumps on every t->mode access. So let's move mode canonicalization to decode_tree_entry(). That was the final bit. Now after tree entry is decoded, it is fully ready and could be accessed either directly via field, or through tree_entry_extract() which this time got really "totally trivial". Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	11 years ago
Andy Spencer	e4ddb05720	tree_entry_interesting: match against all pathspecs The current basedir compare aborts early in order to avoid futile recursive searches. However, a match may still be found by another pathspec. This can cause an error while checking out files from a branch when using multiple pathspecs: $ git checkout master -- 'a/.txt' 'b/.txt' error: pathspec 'a/*.txt' did not match any file(s) known to git. Signed-off-by: Andy Spencer <andy753421@gmail.com> Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	11 years ago
Nguyễn Thái Ngọc Duy	74b4f7f277	tree-walk.c: ignore trailing slash on submodule in tree_entry_interesting() We do ignore trailing slash on a directory, so pathspec "abc/" matches directory "abc". A submodule is also a directory. Apply the same logic to it. This makes "git log submodule-path" and "git log submodule-path/" produce the same output. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	11 years ago
Nguyễn Thái Ngọc Duy	ef79b1f870	Support pathspec magic :(exclude) and its short form :! Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	11 years ago
Stefan Beller	a04f8196a8	traverse_trees(): clarify return value of the callback The variable name "ret" sounds like the variable to be returned, but since `e6c111b4` we return error, and it is misleading. As this variable tells us which trees in t[] array were used in the callback function, so that this caller can know the entries in which of the trees need advancing, "trees_used" is a better name. Also the assignment to 0 was removed at the start of the function as well after the "if (interesting)" block. Those are unneeded as that variable is set to the callback return value any time we enter the "if (interesting)" block, so we'd overwrite old values anyway. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	93d9353716	parse_pathspec: accept :(icase)path syntax Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	bd30c2e484	pathspec: support :(glob) syntax :(glob)path differs from plain pathspec that it uses wildmatch with WM_PATHNAME while the other uses fnmatch without FNM_PATHNAME. The difference lies in how '' (and '*') is processed. With the introduction of :(glob) and :(literal) and their global options --[no]glob-pathspecs, the user can: - make everything literal by default via --noglob-pathspecs --literal-pathspecs cannot be used for this purpose as it disables _all_ pathspec magic. - individually turn on globbing with :(glob) - make everything globbing by default via --glob-pathspecs - individually turn off globbing with :(literal) The implication behind this is, there is no way to gain the default matching behavior (i.e. fnmatch without FNM_PATHNAME). You either get new globbing or literal. The old fnmatch behavior is considered deprecated and discouraged to use. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	5c6933d201	pathspec: support :(literal) syntax for noglob pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	8f4f8f4579	guard against new pathspec magic in pathspec matching code GUARD_PATHSPEC() marks pathspec-sensitive code, basically all those that touch anything in 'struct pathspec' except fields "nr" and "original". GUARD_PATHSPEC() is not supposed to fail. It's mainly to help the designers catch unsupported codepaths. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	6330a17199	parse_pathspec: add special flag for max_depth feature match_pathspec_depth() and tree_entry_interesting() check max_depth field in order to support "git grep --max-depth". The feature activation is tied to "recursive" field, which led to some unwanted activation, e.g. `5c8eeb8` (diff-index: enable recursive pathspec matching in unpack_trees - 2012-01-15). This patch decouples the activation from "recursive" field, puts it in "magic" field instead. This makes sure that only "git grep" can activate this feature. And because parse_pathspec knows when the feature is not used, it does not need to sort pathspec (required for max_depth to work correctly). A small win for non-grep cases. Even though a new magic flag is introduced, no magic syntax is. The magic can be only enabled by parse_pathspec() caller. We might someday want to support ":(maxdepth:10)src." It all depends on actual use cases. max_depth feature cannot be enabled via init_pathspec() anymore. But that's ok because init_pathspec() is on its way to /dev/null. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	64acde94ef	move struct pathspec and related functions to pathspec.[ch] Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	c904cd89e4	tree_entry_interesting: do basedir compare on wildcard patterns when possible Currently we treat ".c" and "path/to/.c" the same way. Which means we check all possible paths in repo against "path/to/.c". One could see that "path/elsewhere/foo.c" obviously cannot match "path/to/.c" and we only need to check all paths _inside_ "path/to/" against that pattern. This patch checks the leading fixed part of a pathspec against base directory and exit early if possible. We could even optimize further in "path/to/something.c" case (i.e. check the fixed part against name_entry as well) but that's more complicated and probably does not gain us much. -O2 build on linux-2.6, without and with this patch respectively: $ time git rev-list --quiet HEAD -- 'drivers/.c' real 1m9.484s user 1m9.128s sys 0m0.181s $ time ~/w/git/git rev-list --quiet HEAD -- 'drivers/*.c' real 0m15.710s user 0m15.564s sys 0m0.107s Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	8c6abbcd27	pathspec: apply ".c" optimization from exclude When a pattern contains only a single asterisk as wildcard, e.g. "foobar", after literally comparing the leading part "foo" with the string, we can compare the tail of the string and make sure it matches "bar", instead of running fnmatch() on "bar" against the remainder of the string. -O2 build on linux-2.6, without the patch: $ time git rev-list --quiet HEAD -- '.c' real 0m40.770s user 0m40.290s sys 0m0.256s With the patch $ time ~/w/git/git rev-list --quiet HEAD -- '.c' real 0m34.288s user 0m33.997s sys 0m0.205s The above command is not supposed to be widely popular. It's chosen because it exercises pathspec matching a lot. The point is it cuts down matching time for popular patterns like .c, which could be used as pathspec in other places. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago

1 2 3

107 Commits (c555caab7a303109d6c712d757bc4621a3ee0bbd)