As far as I know, not all grep programs support coloring, so we should
rely on builtin grep. If you want external grep, set
color.grep.external to empty string.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the config variable color.grep.external, which can be used to
switch on coloring of external greps. To enable auto coloring with
GNU grep, one needs to set color.grep.external to --color=always to
defeat the pager started by git grep. The value of the config
variable will be passed to the external grep only if it would
colorize internal grep's output, so automatic terminal detected
works. The default is to not pass any option, because the external
grep command could be a program without color support.
Also set the environment variables GREP_COLOR and GREP_COLORS to
pass the configured color for matches to the external grep. This
works with GNU grep; other variables could be added as needed.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Coloring matches makes them easier to spot in the output.
Add two options and two parameters: color.grep (to turn coloring on
or off), color.grep.match (to set the color of matches), --color
and --no-color (to turn coloring on or off, respectively).
The output of external greps is not changed.
This patch is based on earlier ones by Nguyễn Thái Ngọc Duy and
Thiago Alves.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"Assume unchanged" bit means "please pretend that I have never touched
this file", so if user removes the file, we should not care.
This patch teaches "git grep" to use cache version in such
situations. External grep case has not been fixed yet. But given that
on the platform that CE_VALID bit may be used like Windows, external
grep is not available anyway, I would wait for people to raise their
hands before touching it.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Here's a trivial patch that adds "-z" and "--null" options to "git
grep". It was discussed on the mailing-list that git's "-z"
convention should be used instead of GNU grep's "-Z".
So things like 'git grep -l -z "$FOO" | xargs -0 sed -i "s/$FOO/$BOO/"'
do work now.
Signed-off-by: Raphael Zimmerer <killekulla@rdrz.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This is a mechanical conversion of all '*.c' files with:
s/((?:die|error|warning)\("git)-(\S+:)/$1 $2/;
The result was manually inspected and no false positive was found.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unless used with --cached or grepping on a tree, "git grep" will
search on working directory, so set up worktree properly
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without this patch, git-grep gives confusing usage information:
$ git grep --confused
usage: git grep <option>* <rev>* [-e] <pattern> [<path>...]
$ git grep HEAD pattern
fatal: ambiguous argument 'pattern': unknown revision or path no
t in the working tree.
Use '--' to separate paths from revisions
So put <pattern> before the <rev>s, in accordance with actual correct
usage. While we're changing the usage string, we might as well include
the "--" separating revisions and paths, too.
Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If PATH_MAX on your system is smaller than any path stored in the git
repository, that can cause memory corruption inside of the grep_tree
function used by git-grep.
Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you misuse a git command, you are shown the usage string.
But this is currently shown in the dashed form. So if you just
copy what you see, it will not work, when the dashed form
is no longer supported.
This patch makes git commands show the dash-less version.
For shell scripts that do not specify OPTIONS_SPEC, git-sh-setup.sh
generates a dash-less usage string now.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, we just chose whether to allow external grep
based on the __unix__ define. However, there are systems
which define this macro but which have an inferior group
(e.g., one that does not support all options used by t7002).
This allows users to accept the potential speed penalty to
get a more consistent grep experience (and to pass the
testsuite).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I expected git grep --name-only to give me only the file names,
much as git diff --name-only only generates filenames. Alas the
option is -l, which matches common external greps but doesn't match
other parts of the git UI.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This converts the index explicitly on read and write to its on-disk
format, allowing the in-core format to contain more flags, and be
simpler.
In particular, the in-core format is now host-endian (as opposed to the
on-disk one that is network endian in order to be able to be shared
across machines) and as a result we can dispense with all the
htonl/ntohl on accesses to the cache_entry fields.
This will make it easier to make use of various temporary flags that do
not exist in the on-disk format.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A NUL byte at beginning of file, or just after a newline
would provoke an invalid buf[-1] access in a few places.
* builtin-grep.c (cmd_grep): Don't access buf[-1].
* builtin-pack-objects.c (get_object_list): Likewise.
* builtin-rev-list.c (read_revisions_from_stdin): Likewise.
* bundle.c (read_bundle_header): Likewise.
* server-info.c (read_pack_info_file): Likewise.
* transport.c (insert_packed_refs): Likewise.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the index is unmerged, e.g.
$ git ls-files -u
100644 faf413748eb6ccb15161a212156c5e348302b1b6 1 setup.c
100644 145eca50f4 2 setup.c
100644 cb9558c49b6027bf225ba2a6154c4d2a52bcdbe2 3 setup.c
running "git grep" for work tree files repeats hits for each unmerged
stage.
$ git grep -n -e setup_work_tree -- '*.[ch]'
setup.c:209:void setup_work_tree(void)
setup.c:209:void setup_work_tree(void)
setup.c:209:void setup_work_tree(void)
This should fix it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When building command line to invoke external grep, the
arguments to -A/-B/-C options were placd in randarg[] buffer,
but the code forgot that snprintf() does not count terminating
NUL in its return value. This caused "git grep -A1 -B2" to
invoke external grep with "-B21 -A1".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We called flush_grep() every time we saw an unmerged entry in
the index. If we happen to find an unmerged entry before we saw
more than two paths, we incorrectly declared that the user had
too many non-paths options in front.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to (almost) always show the name of the file without
relying on "-H" option of GNU grep, we used to add /dev/null to
the argument list unless we are doing -l or -L. This caused
"/dev/null:0" to show up when -c is given in the output.
It is not enough to add -c to the set of options we do not pass
/dev/null for. When we have too many files, we invoke grep
multiple times and we need to avoid giving a widow filename to
the last invocation -- otherwise we will not see the name.
This keeps two filenames when the argv[] buffer is about to
overflow and we have not finished iterating over the index, so
that the last round will always have at least two paths to work
with (and not require /dev/null).
An obvious and the only exception is when there is only 1 file
that is given to the underlying grep, and in that case we avoid
passing /dev/null and let the external "grep -c" report only the
number of matches.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* builtin-grep.c (strtoul_ui): Move function definition from here, to...
* git-compat-util.h (strtoul_ui): ...here, with an added "base" parameter.
* builtin-grep.c (cmd_grep): Update use of strtoul_ui to include base, "10".
* builtin-update-index.c (read_index_info): Diagnose an invalid mode integer
that is out of range or merely larger than INT_MAX.
(cmd_update_index): Use strtoul_ui, not sscanf.
* convert-objects.c (write_subdirectory): Likewise.
Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep.c (strtoul_ui): Move function definition from here, to...
* git-compat-util.h (strtoul_ui): ...here, with an added "base" parameter.
* builtin-grep.c (cmd_grep): Update use of strtoul_ui to include base, "10".
* builtin-update-index.c (read_index_info): Diagnose an invalid mode integer
that is out of range or merely larger than INT_MAX.
(cmd_update_index): Use strtoul_ui, not sscanf.
* convert-objects.c (write_subdirectory): Likewise.
Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes slightly more lines than it adds, but the real reason for
doing this is that future optimizations will require more setup of the
tree descriptor, and so we want to do it in one place.
Also renamed the "desc.buf" field to "desc.buffer" just to trigger
compiler errors for old-style manual initializations, making sure I
didn't miss anything.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Since we have the "tree_entry_len()" helper function these days, and
don't need to do a full strlen(), there's no point in saving the path
length - it's just redundant information.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If you use scanf or sscanf to parse integers, your code probably
accepts bogus inputs. For example, builtin-grep (aka git-grep) uses
sscanf(scan, "%u", &num) to parse the integer argument to -A, -B, -C.
Currently, "-C 1,000" and "-C 4294967297" are both treated just like
"-C 1":
$ git-grep -h -C 4294967297 juggle
out and you may find it easier to switch back and forth if you
juggle multiple lines of development simultaneously. Of
course, you will pay the price of more disk usage to hold
The obvious fix is to use strtoul instead. But using a bare strtoul is
too messy, at least when done properly, so I've added a wrapper function.
The new function in the patch below belongs elsewhere if it would be
useful in replacing any of the four remaining uses of sscanf.
One final note: With this change, I get a slightly different
diagnostic depending on the context size:
$ ./git-grep -h -C 4294967296 juggle
fatal: 4294967296: invalid context length argument
[Exit 128]
$ ./git-grep -h -C 4294967295 juggle
grep: 4294967295: invalid context length argument
[Exit 1]
A common convention that makes it easy to identify the source
of a diagnostic is to include the program name before the first ":".
Whether that should be "git" or "git-grep" is another question.
Using "grep" or "fatal" is misleading.
Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4.
This implies that we are able to access and work on files whose
maximum length is around 2^63-1 bytes, but we can only malloc or
mmap somewhat less than 2^32-1 bytes of memory.
On such a system an implicit conversion of off_t to size_t can cause
the size_t to wrap, resulting in unexpected and exciting behavior.
Right now we are working around all gcc warnings generated by the
-Wshorten-64-to-32 option by passing the off_t through xsize_t().
In the future we should make xsize_t on such problematic platforms
detect the wrapping and die if such a file is accessed.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We currently have two parallel notation for dealing with object types
in the code: a string and a numerical value. One of them is obviously
redundent, and the most used one requires more stack space and a bunch
of strcmp() all over the place.
This is an initial step for the removal of the version using a char array
found in object reading code paths. The patch is unfortunately large but
there is no sane way to split it in smaller parts without breaking the
system.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Previous step converted use of strncmp() with literal string
mechanically even when the result is only used as a boolean:
if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo")))
This step manually cleans them up to read:
if (!prefixcmp(arg, "foo"))
Signed-off-by: Junio C Hamano <junkio@cox.net>
This mechanically converts strncmp() to use prefixcmp(), but only when
the parameters match specific patterns, so that they can be verified
easily. Leftover from this will be fixed in a separate step, including
idiotic conversions like
if (!strncmp("foo", arg, 3))
=>
if (!(-prefixcmp(arg, "foo")))
This was done by using this script in px.perl
#!/usr/bin/perl -i.bak -p
if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) {
s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|;
}
if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) {
s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|;
}
and running:
$ git grep -l strncmp -- '*.c' | xargs perl px.perl
Signed-off-by: Junio C Hamano <junkio@cox.net>
We have a number of badly checked read() calls. Often we are
expecting read() to read exactly the size we requested or fail, this
fails to handle interrupts or short reads. Add a read_in_full()
providing those semantics. Otherwise we at a minimum need to check
for EINTR and EAGAIN, where this is appropriate use xread().
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a mechanical clean-up of the way *.c files include
system header files.
(1) sources under compat/, platform sha-1 implementations, and
xdelta code are exempt from the following rules;
(2) the first #include must be "git-compat-util.h" or one of
our own header file that includes it first (e.g. config.h,
builtin.h, pkt-line.h);
(3) system headers that are included in "git-compat-util.h"
need not be included in individual C source files.
(4) "git-compat-util.h" does not have to include subsystem
specific header files (e.g. expat.h).
Signed-off-by: Junio C Hamano <junkio@cox.net>
We used to skip unmerged entries, which made sense for grepping
in the cached copies, but not for grepping in the working tree.
Noticed by Johannes Sixt.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This lets you say:
git grep --all-match -e A -e B -e C
to find lines that match A or B or C but limit the matches from
the files that have all of A, B and C.
This is different from
git grep -e A --and -e B --and -e C
in that the latter looks for a single line that has all of these
at the same time.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes three functions and associated option structures from
builtin-grep available from other parts of the system.
* options to drive built-in grep engine is stored in struct
grep_opt;
* pattern strings and extended grep expressions are added to
struct grep_opt with append_grep_pattern();
* when finished calling append_grep_pattern(), call
compile_grep_patterns() to prepare for execution;
* call grep_buffer() to find matches in the in-core buffer.
This also adds an internal option "status_only" to grep_opt,
which suppresses any output from grep_buffer(). Callers of the
function as library can use it to check if there is a match
without producing any output.
Signed-off-by: Junio C Hamano <junkio@cox.net>
It turns out that I actually wanted to avoid the filenames (because I
didn't care - I just wanted to see the context in which something was
used) when doing a grep. But since "git grep" didn't take the "-h"
parameter, I ended up having to do "grep -5 -h *.c" instead.
So here's a trivial patch that adds "-h" (and thus has to enable -H too)
to "git grep" parsing.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Like xmalloc and xrealloc xstrdup dies with a useful message if
the native strdup() implementation returns NULL rather than a
valid pointer.
I just tried to use xstrdup in new code and found it to be missing.
However I expected it to be present as xmalloc and xrealloc are
already commonly used throughout the code.
[jc: removed the part that deals with last_XXX, which I am
finding more and more dubious these days.]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
By default, the command shows pathnames relative to the current
directory. Use --full-name (the same flag to do so in ls-files)
if you want to see the full pathname relative to the project root.
This makes it very pleasant to run in Emacs compilation (or
"grep-find") buffer.
Signed-off-by: Junio C Hamano <junkio@cox.net>
We used to find the first match of the pattern and then if the
match is not for the entire word, declared that the whole line
does not match.
But that is wrong. The command "git grep -w -e mmap" should
find that a line "foo_mmap bar mmap baz" matches, by tring the
second instance of pattern "mmap" on the same line.
Problems an earlier round of "fix" had were pointed out by Morten
Welinder, which have been incorporated in the t7002 tests.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This changes the calling convention of built-in commands and
passes the "prefix" (i.e. pathname of $PWD relative to the
project root level) down to them.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This updates the type-enumeration constants introduced to reduce
the memory footprint of "struct object" to match the type bits
already used in the packfile format, by removing the former
(i.e. TYPE_* constant macros) and using the latter (i.e. enum
object_type) throughout the code for consistency.
Eventually we can stop passing around the "type strings"
entirely, and this will help - no confusion about two different
integer enumeration.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The only visible change is that git-blame doesn't understand
"--compability" anymore, but it does accept "--compatibility" instead,
which is already documented.
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This extends the behaviour of git-grep when multiple -e options
are given. So far, we allowed multiple -e to behave just like
regular grep with multiple -e, i.e. the patterns are OR'ed
together.
With this change, you can also have multiple patterns AND'ed
together, or form boolean expressions, like this (the
parentheses are quoted from the shell in this example):
$ git grep -e _PATTERN --and \( -e atom -e token \)
Signed-off-by: Junio C Hamano <junkio@cox.net>