This shrinks "struct object" by a small amount, by getting rid of the
"struct type *" pointer and replacing it with a 3-bit bitfield instead.
In addition, we merge the bitfields and the "flags" field, which
incidentally should also remove a useless 4-byte padding from the object
when in 64-bit mode.
Now, our "struct object" is still too damn large, but it's now less
obviously bloated, and of the remaining fields, only the "util" (which is
not used by most things) is clearly something that should be eventually
discarded.
This shrinks the "git-rev-list --all" memory use by about 2.5% on the
kernel archive (and, perhaps more importantly, on the larger mozilla
archive). That may not sound like much, but I suspect it's more on a
64-bit platform.
There are other remaining inefficiencies (the parent lists, for example,
probably have horrible malloc overhead), but this was pretty obvious.
Most of the patch is just changing the comparison of the "type" pointer
from one of the constant string pointers to the appropriate new TYPE_xxx
small integer constant.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Don't just read the --ignore-case/-i option, pass the flag on to the
external grep program.
Signed-off-by: Robert Fitzsimons <robfitz@273k.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds a "tree_entry()" function that combines the common operation of
doing a "tree_entry_extract()" + "update_tree_entry()".
It also has a simplified calling convention, designed for simple loops
that traverse over a whole tree: the arguments are pointers to the tree
descriptor and a name_entry structure to fill in, and it returns a boolean
"true" if there was an entry left to be gotten in the tree.
This allows tree traversal with
struct tree_desc desc;
struct name_entry entry;
desc.buf = tree->buffer;
desc.size = tree->size;
while (tree_entry(&desc, &entry) {
... use "entry.{path, sha1, mode, pathlen}" ...
}
which is not only shorter than writing it out in full, it's hopefully less
error prone too.
[ It's actually a tad faster too - we don't need to recalculate the entry
pathlength in both extract and update, but need to do it only once.
Also, some callers can avoid doing a "strlen()" on the result, since
it's returned as part of the name_entry structure.
However, by now we're talking just 1% speedup on "git-rev-list --objects
--all", and we're definitely at the point where tree walking is no
longer the issue any more. ]
NOTE! Not everybody wants to use this new helper function, since some of
the tree walkers very much on purpose do the descriptor update separately
from the entry extraction. So the "extract + update" sequence still
remains as the core sequence, this is just a simplified interface.
We should probably add a silly two-line inline helper function for
initializing the descriptor from the "struct tree" too, just to cut down
on the noise from that common "desc" initializer.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Of course, it still ignores the fact that not all grep's support some of
the flags like -F/-L/-A/-C etc, but for those cases, the external grep
itself will happily just say "unrecognized option -F" or similar.
So with this change, "git grep" should handle all the flags the native
grep handles, which is really quite fine. We don't _need_ to expose
anything more, and if you do want our extensions, you can get them with
"--uncached" and an up-to-date index.
No configuration necessary, and we automatically take advantage of any
native grep we have, if possible.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Some implementations do not know what to do with -H; define
NO_H_OPTION_IN_GREP when you build git if your grep lacks -H.
Most of the time, it can be worked around by prepending
/dev/null to the argument list, but that causes -L and -c to
slightly misbehave (they both expose /dev/null is given), so
when these options are given, do not run external grep that does
not understand -H.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The "-F" flag apparently got mis-translated due to some over-eager
copy-paste work into a duplicate "-H" when using the external grep.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It's not perfect, but it gets the "git grep some-random-string" down to
the good old half-a-second range for the kernel.
It should convert more of the argument flags for "grep", that should be
trivial to expand (I did a few just as an example). It should also bother
to try to return the right "hit" value (which it doesn't, right now - the
code is kind of there, but I didn't actually bother to do it _right_).
Also, right now it _just_ limits by number of arguments, but it should
also strictly speaking limit by total argument size (ie add up the length
of the filenames, and do the "exec_grep()" flush call if it's bigger than
some random value like 32kB).
But I think that it's _conceptually_ doing all the right things, and it
seems to work. So maybe somebody else can do some of the final polish.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
I mistyped
git grep next -e '"^@"' '*.c'
and got many hits that contain "next" without complaint.
Obviously what I meant to say was:
git grep -e '"^@"' next -- '*.c'
This tightens the argument parsing rule a bit:
- All "grep" parameters should come first;
- If there is no -e nor -f to specify pattern, the first non
option string is the parameter;
- After that, zero or more revs can follow.
- An optional '--' can be present, and is skipped.
- All the rest are pathspecs. If '--' was not there, they must
be paths that exist in the working tree.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The earlier code descended into Documentation/technical when
given "Documentation/how*" as the pattern, which was too loose.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Grep may want to grok multiple revisions, but it does not make
much sense to walk revisions while doing so. This stops calling
the code to parse parameters for the revision walker. The
parameter parsing for the optional "-e" option becomes a lot
simpler with it as well.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This tweaks the pathspec wildcard used in builtin-grep to match
that of ls-files. With this:
git grep -e DEBUG -- '*/Kconfig*'
would work like the shell script version, and you could even do:
git grep -e DEBUG --cached -- '*/Kconfig*' ;# from index
git grep -e DEBUG v2.6.12 -- '*/Kconfig*' ;# from rev
Signed-off-by: Junio C Hamano <junkio@cox.net>
This attempts to set up built-in "git grep" to further reduce
our dependence on the shell, while at the same time optionally
allowing to run grep against object database. You could do
funky things like these:
git grep --cached -e pattern ;# grep from index
git grep -e pattern master ;# or in a rev
git grep -e pattern master next ;# or in multiple revs
git grep -e pattern pu^@ ;# even like this with an
;# extension from another topic ;-)
git grep -e pattern master..next ;# or even from rev ranges
git grep -e pattern master~20:Documentation
;# or an arbitrary tree
git grep -e pattern next:git-commit.sh
;# or an arbitrary blob
Right now, it does not understand and/or obey many options grep
should accept, and the pattern must be given with -e option due
to the way the parameter parser is structured, both of which
obviously need to be fixed for usability.
But this is going in the right direction. The shell script
version is one of the worst Portability offender in the git
barebone Porcelainish; it uses xargs -0 to pass paths around and
shell arrays to sift flags and parameters.
Signed-off-by: Junio C Hamano <junkio@cox.net>