From 88ea8112b4abca416993e2a3145616cd732771c8 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sat, 19 Apr 2008 14:22:38 -0700 Subject: [PATCH] Optimize match_pathspec() to avoid fnmatch() "git add *" is actually fundamentally different from "git add .", and yeah, you should generally use the latter. The reason? The argument list is actually something different from what you think it is. For git, it's a "pathspec", so what actualy happens is that in *both* cases, it will really traverse the whole tree, and then match every file it finds against the pathspec. So think of the arguments not as a file list, but as a random bunch of patterns to match against the files you have! Which is why the cost is actually approximately O(n*m), where "n" is the size of the working tree, and "m" is the number of pathspecs. So the reason "git add ." is fast is actually that "m" in that case is just 1 (just one trivial pattern), and then "git add *" is slow because "m" is large (lots of complicated patterns). In both cases, 'n' is the same (== the whole set of files in your working tree). Anyway, here's a trivial patch that doesn't change this fundamental fact, but that avoids doing anything *expensive* until we've done some cheap initial tests. It may or may not help your test-case, but it's pretty simple and it matches the other git optimizations in this area (ie "conceptually handle the general case, but optimize the simple cases where we can exit early") Notice how this patch doesn' actually change the fundamental O(n^2) behaviour, but it makes it much cheaper by generally avoiding the expensive 'fnmatch' and 'strlen/strncmp' when they are obviously not needed. Signed-off-by: Linus Torvalds Signed-off-by: Junio C Hamano --- dir.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/dir.c b/dir.c index d79762c7c0..9501476ecd 100644 --- a/dir.c +++ b/dir.c @@ -52,6 +52,11 @@ int common_prefix(const char **pathspec) return prefix; } +static inline int special_char(unsigned char c1) +{ + return !c1 || c1 == '*' || c1 == '[' || c1 == '?'; +} + /* * Does 'match' matches the given name? * A match is found if @@ -69,14 +74,27 @@ static int match_one(const char *match, const char *name, int namelen) int matchlen; /* If the match was just the prefix, we matched */ - matchlen = strlen(match); - if (!matchlen) + if (!*match) return MATCHED_RECURSIVELY; + for (;;) { + unsigned char c1 = *match; + unsigned char c2 = *name; + if (special_char(c1)) + break; + if (c1 != c2) + return 0; + match++; + name++; + namelen--; + } + + /* * If we don't match the matchstring exactly, * we need to match by fnmatch */ + matchlen = strlen(match); if (strncmp(match, name, matchlen)) return !fnmatch(match, name, 0) ? MATCHED_FNMATCH : 0;