kernel/git - git - PowerEL Git System

Commit Graph

Author	SHA1	Message	Date
Junio C Hamano	8f309aeb82	strbuf: introduce strbuf_getline_{lf,nul}() The strbuf_getline() interface allows a byte other than LF or NUL as the line terminator, but this is only because I wrote these codepaths anticipating that there might be a value other than NUL and LF that could be useful when I introduced line_termination long time ago. No useful caller that uses other value has emerged. By now, it is clear that the interface is overly broad without a good reason. Many codepaths have hardcoded preference to read either LF terminated or NUL terminated records from their input, and then call strbuf_getline() with LF or NUL as the third parameter. This step introduces two thin wrappers around strbuf_getline(), namely, strbuf_getline_lf() and strbuf_getline_nul(), and mechanically rewrites these call sites to call either one of them. The changes contained in this patch are: * introduction of these two functions in strbuf.[ch] * mechanical conversion of all callers to strbuf_getline() with either '\n' or '\0' as the third parameter to instead call the respective thin wrapper. After this step, output from "git grep 'strbuf_getline('" would become a lot smaller. An interim goal of this series is to make this an empty set, so that we can have strbuf_getline_crlf() take over the shorter name strbuf_getline(). Signed-off-by: Junio C Hamano <gitster@pobox.com>	9 years ago
René Scharfe	4441549995	grep: stop using PARSE_OPT_NO_INTERNAL_HELP The flag PARSE_OPT_NO_INTERNAL_HELP is set to allow overriding the option -h, except when it's the only one given. This is the default behavior now, so remove the flag and the hand-rolled --help-all handling. The internal --help-all handler now actually shows hidden options, i.e. --debug in this case. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net>	9 years ago
brian m. carlson	ed1c9977cb	Remove get_object_hash. Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	9 years ago
brian m. carlson	f2fd0760f6	Convert struct object to object_id struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	9 years ago
brian m. carlson	7999b2cf77	Add several uses of get_object_hash. Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	9 years ago
Patrick Steinhardt	5dcd1b1577	grep: correctly initialize help-all option The "help-all" option is being initialized with a wrong value. While being semantically wrong this can also cause a segmentation fault in gcc on ARMv7 hardfloat platforms with a hardened toolchain. Fix this by initializing with a NULL value. Signed-off-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	10 years ago
Wilhelm Schuermann	c2048f0b39	grep: fix "--quiet" overwriting current output When grep is called with the --quiet option, the pager is initialized despite not being used. When the pager is "less", anything output by previous commands and not ended with a newline is overwritten: $ echo -n aaa; echo bbb aaabbb $ echo -n aaa; git grep -q foo; echo bbb bbb This can be worked around, for example, by making sure STDOUT is not a TTY or more directly by setting git's pager to "cat": $ echo -n aaa; git grep -q foo > /dev/null; echo bbb aaabbb $ echo -n aaa; PAGER=cat git grep -q foo; echo bbb aaabbb But prevent calling the pager in the first place, which would also save an unnecessary fork(). Signed-off-by: Wilhelm Schuermann <wimschuermann@googlemail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	10 years ago
Nguyễn Thái Ngọc Duy	77fdb8a82c	grep: correct help string for --exclude-standard The current help string is about --no-exclude-standard. But "git grep -h" would show --exclude-standard instead. Flip the string. See `0a93fb8` (grep: teach --untracked and --exclude-standard options - 2011-09-27) for more info about these options. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	10 years ago
Alex Henrie	9c9b4f2f8b	standardize usage info string format This patch puts the usage info strings that were not already in docopt- like format into docopt-like format, which will be a litle easier for end users and a lot easier for translators. Changes include: - Placing angle brackets around fill-in-the-blank parameters - Putting dashes in multiword parameter names - Adding spaces to [-f\|--foobar] to make [-f \| --foobar] - Replacing <foobar>* with [<foobar>...] Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	10 years ago
Jeff King	9e0c3c4fcd	make add_object_array_with_context interface more sane When you resolve a sha1, you can optionally keep any context found during the resolution, including the path and mode of a tree entry (e.g., when looking up "HEAD:subdir/file.c"). The add_object_array_with_context function lets you then attach that context to an entry in a list. Unfortunately, the interface for doing so is horrible. The object_context structure is large and most object_array users do not use it. Therefore we keep a pointer to the structure to avoid burdening other users too much. But that means when we do use it that we must allocate the struct ourselves. And the struct contains a fixed PATH_MAX-sized buffer, which makes this wholly unsuitable for any large arrays. We can observe that there is only a single user of the "with_context" variant: builtin/grep.c. And in that use case, the only element we care about is the path. We can therefore store only the path as a pointer (the context's mode field was redundant with the object_array_entry itself, and nobody actually cared about the surrounding tree). This still requires a strdup of the pathname, but at least we are only consuming the minimum amount of memory for each string. We can also handle the copying ourselves in add_object_array_*, and free it as appropriate in object_array_release_entry. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	10 years ago
Johannes Schindelin	f7febbea07	git grep -O -i: if the pager is 'less', pass the '-I' option When <command> happens to be the magic string "less", today git grep -O<command> -e<pattern> helpfully passes +/<pattern> to less so you can navigate through the results within a file using the n and shift+n keystrokes. Alas, that doesn't do the right thing for a case-insensitive match, i.e. git grep -i -O<command> -e<pattern> For that case we should pass --IGNORE-CASE to "less" so that n and shift+n can move between results ignoring case in the pattern. The original patch came from msysgit and used "-i", but that was not due to lack of support for "-I" but it merely overlooked that it ought to work even when the pattern contains capital letters. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Stepan Kasal <kasal@ucw.cz> Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	11 years ago
Jeff King	26ecfe3e20	grep: use run-command's "dir" option for --open-files-in-pager Git generally changes directory to the repository root on startup. When running "grep --open-files-in-pager" from a subdirectory, we chdir back to the original directory before running the pager, so that we can feed the relative pathnames to the pager. We currently do this chdir manually, but we can ask run_command to do it for us. This is fewer lines of code, and as a bonus, the chdir is limited to the child process, which avoids any unexpected surprises for code running after the pager (there isn't any currently, but this is future-proofing). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	11 years ago
Nguyễn Thái Ngọc Duy	ebb32893ba	pathspec: convert some match_pathspec_depth() to dir_path_match() This helps reduce the number of match_pathspec_depth() call sites and show how m_p_d() is used. And it usage is: - match against an index entry (ce_path_match or match_pathspec_depth in ls-files) - match against a dir_entry from read_directory (dir_path_match and match_pathspec_depth in clean.c, which will be converted later) - resolve-undo (rerere.c and ls-files.c) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	11 years ago
Nguyễn Thái Ngọc Duy	429bb40abd	pathspec: convert some match_pathspec_depth() to ce_path_match() This helps reduce the number of match_pathspec_depth() call sites and show how match_pathspec_depth() is used. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	11 years ago
Stefan Beller	d5d09d4754	Replace deprecated OPT_BOOLEAN by OPT_BOOL This task emerged from `b04ba2bb` (parse-options: deprecate OPT_BOOLEAN, 2011-09-27). All occurrences of the respective variables have been reviewed and none of them relied on the counting up mechanism, but all of them were using the variable as a true boolean. This patch does not change semantics of any command intentionally. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	7327d3d1b7	convert {read,fill}_directory to take struct pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	6330a17199	parse_pathspec: add special flag for max_depth feature match_pathspec_depth() and tree_entry_interesting() check max_depth field in order to support "git grep --max-depth". The feature activation is tied to "recursive" field, which led to some unwanted activation, e.g. `5c8eeb8` (diff-index: enable recursive pathspec matching in unpack_trees - 2012-01-15). This patch decouples the activation from "recursive" field, puts it in "magic" field instead. This makes sure that only "git grep" can activate this feature. And because parse_pathspec knows when the feature is not used, it does not need to sort pathspec (required for max_depth to work correctly). A small win for non-grep cases. Even though a new magic flag is introduced, no magic syntax is. The magic can be only enabled by parse_pathspec() caller. We might someday want to support ":(maxdepth:10)src." It all depends on actual use cases. max_depth feature cannot be enabled via init_pathspec() anymore. But that's ok because init_pathspec() is on its way to /dev/null. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	0fdc2ae512	convert some get_pathspec() calls to parse_pathspec() These call sites follow the pattern: paths = get_pathspec(prefix, argv); init_pathspec(&pathspec, paths); which can be converted into a single parse_pathspec() call. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	64acde94ef	move struct pathspec and related functions to pathspec.[ch] Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	9c5e6c802c	Convert "struct cache_entry " to "const ..." wherever possible I attempted to make index_state->cache[] a "const struct cache_entry " to find out how existing entries in index are modified and where. The question I have is what do we do if we really need to keep track of on-disk changes in the index. The result is - diff-lib.c: setting CE_UPTODATE - name-hash.c: setting CE_HASHED - preload-index.c, read-cache.c, unpack-trees.c and builtin/update-index: obvious - entry.c: write_entry() may refresh the checked out entry via fill_stat_cache_info(). This causes "non-const struct cache_entry " in builtin/apply.c, builtin/checkout-index.c and builtin/checkout.c - builtin/ls-files.c: --with-tree changes stagemask and may set CE_UPDATE Of these, write_entry() and its call sites are probably most interesting because it modifies on-disk info. But this is stat info and can be retrieved via refresh, at least for porcelain commands. Other just uses ce_flags for local purposes. So, keeping track of "dirty" entries is just a matter of setting a flag in index modification functions exposed by read-cache.c. Except unpack-trees, the rest of the code base does not do anything funny behind read-cache's back. The actual patch is less valueable than the summary above. But if anyone wants to re-identify the above sites. Applying this patch, then this: diff --git a/cache.h b/cache.h index 430d021..1692891 100644 --- a/cache.h +++ b/cache.h @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode) #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1) struct index_state { - struct cache_entry cache; + const struct cache_entry cache; unsigned int version; unsigned int cache_nr, cache_alloc, cache_changed; struct string_list *resolve_undo; will help quickly identify them without bogus warnings. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Jiang Xin	39598f9983	quote_path_relative(): remove redundant parameter quote_path_relative() used to take a counted string as its parameter (the string to be quoted). With an earlier change, it now uses relative_path() that does not take a counted string, and we have been passing only the pointer to the string since then. Remove the length parameter from quote_path_relative() to show that this parameter was redundant. All the changed lines show that the caller passed either -1 (to ask the function run strlen() on the string), or the length of the string, so the earlier conversion was safe. All the callers of quote_path_relative() that used to take counted string have been audited to make sure that they are passing length of the actual string (or -1 to ask the callee run strlen()) Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Michael J Gruber	afa15f3cd8	grep: honor --textconv for the case rev:path Make "grep" honor the "--textconv" option also for the object case, i.e. when used with an argument "rev:path". Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Jeff King	335ec3bf41	grep: allow to use textconv filters Recently and not so recently, we made sure that log/grep type operations use textconv filters when a userfacing diff would do the same: `ef90ab6` (pickaxe: use textconv for -S counting, 2012-10-28) `b1c2f57` (diff_grep: use textconv buffers for add/deleted files, 2012-10-28) `0508fe5` (combine-diff: respect textconv attributes, 2011-05-23) "git grep" currently does not use textconv filters at all, that is neither for displaying the match and context nor for the actual grepping, even when requested by --textconv. Introduce an option "--textconv" which makes git grep use any configured textconv filters for grepping and output purposes. It is off by default. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Jeff King	f7892d1817	use parse_object_or_die instead of die("bad object") Some call-sites do: o = parse_object(sha1); if (!o) die("bad object %s", some_name); We can now handle that as a one-liner, and get more consistent output. In the third case of this patch, it looks like we are losing information, as the existing message also outputs the sha1 hex; however, parse_object will already have written a more specific complaint about the sha1, so there is no point in repeating it here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	0b0ecaac2a	grep: avoid accepting ambiguous revision Unlike other commands that take both revs and pathspecs without "--" disamiguators only when the boundary is clear, "git grep" treated what can be interpreted as a rev as-is, without making sure that it could also have meant a pathspec. E.g. $ git grep -e foo master when 'master' is in the working tree, should have triggered an ambiguity error, but it didn't, and searched in the tree of the commit named by 'master'. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Nguyễn Thái Ngọc Duy	55c61688ea	grep: stop looking at random places for .gitattributes grep searches for .gitattributes using "name" field in struct grep_source but that field is not real on-disk path name. For example, "grep pattern rev" fills the field with "rev:path", and Git looks for .gitattributes in the (non-existent but exploitable) path "rev:path" instead of "path". This patch passes real paths down to grep_source_load_driver() when: - grep on work tree - grep on the index - grep a commit (or a tag if it points to a commit) so that these cases look up .gitattributes at proper paths. .gitattributes lookup is disabled in all other cases. Initial-work-by: Jeff King <peff@peff.net> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Junio C Hamano	c5c31d3381	grep: move pattern-type bits support to top-level grep.[ch] Switching between -E/-G/-P/-F correctly needs a lot more than just flipping opt->regflags bit these days, and we have a nice helper function buried in builtin/grep.c for the sole use of "git grep". Extract it so that "log --grep" family can also use it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Junio C Hamano	7687a0541e	grep: move the configuration parsing logic to grep.[ch] The configuration handling is a library-ish part of this program, that is not specific to "git grep" command. It should be reusable by "log" and others. Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Junio C Hamano	15fabd1bbd	builtin/grep.c: make configuration callback more reusable The grep_config() function takes one instance of grep_opt as its callback parameter, and populates it by running git_config(). This has three practical implications: - You have to have an instance of grep_opt already when you call the configuration, but that is not necessarily always true. You may be trying to initialize the grep_filter member of rev_info, but are not ready to call init_revisions() on it yet. - It is not easy to enhance grep_config() in such a way to make it cascade to other callback functions to grab other variables in one call of git_config(); grep_config() can be cascaded into from other callbacks, but it has to be at the leaf level of a cascade. - If you ever need to use more than one instance of grep_opt, you will have to open and read the configuration file(s) every time you initialize them. Rearrange the configuration mechanism and model it after how diff configuration variables are handled. An early call to git_config() reads and remembers the values taken from the configuration in the default "template", and a separate call to grep_init() uses this template to instantiate a grep_opt. The next step will be to move some of this out of this file so that the other user of the grep machinery (i.e. "log") can use it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	12 years ago
Michael J Gruber	208f5aa426	grep: show --debug output only once When threaded grep is in effect, the patterns are duplicated and recompiled for each thread. Avoid "--debug" output during the recompilation so that the output is given once instead of "1+nthreads" times. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Junio C Hamano	17bf35a3c7	grep: teach --debug option to dump the parse tree Our "grep" allows complex boolean expressions to be formed to match each individual line with operators like --and, '(', ')' and --not. Introduce the "--debug" option to show the parse tree to help people who want to debug and enhance it. Also "log" learns "--grep-debug" option to do the same. The command line parser to the log family is a lot more limited than the general "git grep" parser, but it has special handling for header matching (e.g. "--author"), and a parse tree is valuable when working on it. Note that "--all-match" is not any individual node in the parse tree. It is an instruction to the evaluator to check all the nodes in the top-level backbone have matched and reject a document as non-matching otherwise. Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Nguyễn Thái Ngọc Duy	f63cf8c9fb	Use imperative form in help usage to describe an action Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Nguyễn Thái Ngọc Duy	4b407bc506	i18n: grep: mark parseopt strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
J Smith	84befcd0a4	grep: add a grep.patternType configuration setting The grep.extendedRegexp configuration setting enables the -E flag on grep by default but there are no equivalents for the -G, -F and -P flags. Rather than adding an additional setting for grep.fooRegexp for current and future pattern matching options, add a grep.patternType setting that can accept appropriate values for modifying the default grep pattern matching behavior. The current values are "basic", "extended", "fixed", "perl" and "default" for setting -G, -E, -F, -P and the default behavior respectively. When grep.patternType is set to a value other than "default", the grep.extendedRegexp setting is ignored. The value of "default" restores the current default behavior, including the grep.extendedRegexp behavior. Signed-off-by: J Smith <dark.panda@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Matthieu Moy	023e37c377	verify_filename(): ask the caller to chose the kind of diagnosis verify_filename() can be called in two different contexts. Either we just tried to interpret a string as an object name, and it fails, so we try looking for a working tree file (i.e. we finished looking at revs that come earlier on the command line, and the next argument must be a pathname), or we _know_ that we are looking for a pathname, and shouldn't even try interpreting the string as an object name. For example, with this change, we get: $ git log COPYING HEAD:inexistant fatal: HEAD:inexistant: no such path in the working tree. Use '-- <path>...' to specify paths that do not exist locally. $ git log HEAD:inexistant fatal: Path 'inexistant' does not exist in 'HEAD' Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
René Scharfe	ec83061156	grep: stop leaking line strings with -f When reading patterns from a file, we pass the lines as allocated string buffers to append_grep_pat() and never free them. That's not a problem because they are needed until the program ends anyway. However, now that the function duplicates the pattern string, we can reuse the strbuf after calling that function. This simplifies the code a bit and plugs a minor memory leak. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
René Scharfe	cbb08c2e0b	parse-options: remove PARSE_OPT_NEGHELP PARSE_OPT_NEGHELP is confusing because short options defined with that flag do the opposite of what the helptext says. It is also not needed anymore now that options starting with no- can be negated by removing that prefix. Convert its only two users to OPT_NEGBIT() and OPT_BOOL() and then remove support for PARSE_OPT_NEGHELP. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Jeff King	6680a0874f	drop odd return value semantics from userdiff_config When the userdiff_config function was introduced in `be58e70` (diff: unify external diff and funcname parsing code, 2008-10-05), it used a return value convention unlike any other config callback. Like other callbacks, it used "-1" to signal error. But it returned "1" to indicate that it found something, and "0" otherwise; other callbacks simply returned "0" to indicate that no error occurred. This distinction was necessary at the time, because the userdiff namespace overlapped slightly with the color configuration namespace. So "diff.color.foo" could mean "the 'foo' slot of diff coloring" or "the 'foo' component of the "color" userdiff driver". Because the color-parsing code would die on an unknown color slot, we needed the userdiff code to indicate that it had matched the variable, letting us bypass the color-parsing code entirely. Later, in `8b8e862` (ignore unknown color configuration, 2009-12-12), the color-parsing code learned to silently ignore unknown slots. This means we no longer need to protect userdiff-matched variables from reaching the color-parsing code. We can therefore change the userdiff_config calling convention to a more normal one. This drops some code from each caller, which is nice. But more importantly, it reduces the cognitive load for readers who may wonder why userdiff_config is unlike every other config callback. There's no need to add a new test confirming that this works; t4020 already contains a test that sets diff.color.external. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Jeff King	9dd5245c10	grep: pre-load userdiff drivers when threaded The low-level grep_source code will automatically load the userdiff driver to see whether a file is binary. However, when we are threaded, it will load the drivers in a non-deterministic order, handling each one as its assigned thread happens to be scheduled. Meanwhile, the attribute lookup code (which underlies the userdiff driver lookup) is optimized to handle paths in sequential order (because they tend to share the same gitattributes files). Multi-threading the lookups destroys the locality and makes this optimization less effective. We can fix this by pre-loading the userdiff driver in the main thread, before we hand off the file to a worker thread. My best-of-five for "git grep foo" on the linux-2.6 repository went from: real 0m0.391s user 0m1.708s sys 0m0.584s to: real 0m0.360s user 0m1.576s sys 0m0.572s Not a huge speedup, but it's quite easy to do. The only trick is that we shouldn't perform this optimization if "-a" was used, in which case we won't bother checking whether the files are binary at all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Jeff King	8f24a6323e	convert git-grep to use grep_source interface The grep_source interface (as opposed to grep_buffer) will eventually gives us a richer interface for telling the low-level grep code about our buffers. Eventually this will lead to things like better binary-file handling. For now, it lets us drop a lot of now-redundant code. The conversion is mostly straight-forward. One thing to note is that the memory ownership rules for "struct grep_source" are different than the "struct work_item" found here (the former will copy things like the filename, rather than taking ownership). Therefore you will also see some slight tweaking of when filename buffers are released. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Jeff King	b3aeb285d0	grep: move sha1-reading mutex into low-level code The multi-threaded git-grep code needs to serialize access to the thread-unsafe read_sha1_file call. It does this with a mutex that is local to builtin/grep.c. Let's instead push this down into grep.c, where it can be used by both builtin/grep.c and grep.c. This will let us safely teach the low-level grep.c code tricks that involve reading from the object db. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Jeff King	78db6ea9dc	grep: make locking flag global The low-level grep code traditionally didn't care about threading, as it doesn't do any threading itself and didn't call out to other non-thread-safe code. That changed with `0579f91` (grep: enable threading with -p and -W using lazy attribute lookup, 2011-12-12), which pushed the lookup of funcname attributes (which is not thread-safe) into the low-level grep code. As a result, the low-level code learned about a new global "grep_attr_mutex" to serialize access to the attribute code. A multi-threaded caller (e.g., builtin/grep.c) is expected to initialize the mutex and set "use_threads" in the grep_opt structure. The low-level code only uses the lock if use_threads is set. However, putting the use_threads flag into the grep_opt struct is not the most logical place. Whether threading is in use is not something that matters for each call to grep_buffer, but is instead global to the whole program (i.e., if any thread is doing multi-threaded grep, every other thread, even if it thinks it is doing its own single-threaded grep, would need to use the locking). In practice, this distinction isn't a problem for us, because the only user of multi-threaded grep is "git-grep", which does nothing except call grep. This patch turns the opt->use_threads flag into a global flag. More important than the nit-picking semantic argument above is that this means that the locking functions don't need to actually have access to a grep_opt to know whether to lock. Which in turn can make adding new locks simpler, as we don't need to pass around a grep_opt. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Albert Yale	50dd0f2fd9	grep: fix -l/-L interaction with decoration lines In threaded mode, git-grep emits file breaks (enabled with context, -W and --break) into the accumulation buffers even if they are not required. The output collection thread then uses skip_first_line to skip the first such line in the output, which would otherwise be at the very top. This is wrong when the user also specified -l/-L/-c, in which case every line is relevant. While arguably giving these options together doesn't make any sense, git-grep has always quietly accepted it. So do not skip anything in these cases. Signed-off-by: Albert Yale <surfingalbert@gmail.com> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Thomas Rast	53b8d931b6	grep: disable threading in non-worktree case Measurements by various people have shown that grepping in parallel is not beneficial when the object store is involved. For example, with a simple regex: Threads \| --cached case \| worktree case ---------------------------------------------------------------- 8 (default) \| 2.88u 0.21s 0:02.94real \| 0.19u 0.32s 0:00.16real 4 \| 2.89u 0.29s 0:02.99real \| 0.16u 0.34s 0:00.17real 2 \| 2.83u 0.36s 0:02.87real \| 0.18u 0.32s 0:00.26real NO_PTHREADS \| 2.16u 0.08s 0:02.25real \| 0.12u 0.17s 0:00.31real This happens because all the threads contend on read_sha1_mutex almost all of the time. A more complex regex allows the threads to do more work in parallel, but as Jeff King found out, the "super boost" (much higher clock when only one core is active) feature of recent CPUs still causes the unthreaded case to win by a large margin. So until the pack machinery allows unthreaded access, we disable grep's threading in all but the worktree case. Helped-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Thomas Rast	0579f91dd7	grep: enable threading with -p and -W using lazy attribute lookup Lazily load the userdiff attributes in match_funcname(). Use a separate mutex around this loading to protect the (not thread-safe) attributes machinery. This lets us re-enable threading with -p and -W while reducing the overhead caused by looking up attributes. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Nguyễn Thái Ngọc Duy	d688cf07b1	tree_entry_interesting(): give meaningful names to return values It is a basic code hygiene to avoid magic constants that are unnamed. Besides, this helps extending the value later on for "interesting, but cannot decide if the entry truely matches yet" (ie. prefix matches) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Nguyễn Thái Ngọc Duy	0de1633783	tree-walk.c: do not leak internal structure in tree_entry_len() tree_entry_len() does not simply take two random arguments and return a tree length. The two pointers must point to a tree item structure, or struct name_entry. Passing random pointers will return incorrect value. Force callers to pass struct name_entry instead of two pointers (with hope that they don't manually construct struct name_entry themselves) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Junio C Hamano	764161391f	builtin/grep: simplify lock_and_read_sha1_file() As read_sha1_lock/unlock have been made aware of use_threads, this caller can be made a lot simpler. Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Junio C Hamano	1487a12ba2	builtin/grep: make lock/unlock into static inline functions Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago
Johannes Schindelin	cdf0553769	git grep: be careful to use mutexes only when they are initialized Rather nasty things happen when a mutex is not initialized but locked nevertheless. Now, when we're not running in a threaded manner, the mutex is not initialized, which is correct. But then we went and used the mutex anyway, which -- at least on Windows -- leads to a hard crash (ordinarily it would be called a segmentation fault, but in Windows speak it is an access violation). This problem was identified by our faithful tests when run in the msysGit environment. To avoid having to wrap the line due to the 80 column limit, we use the name "WHEN_THREADED" instead of "IF_USE_THREADS" because it is one character shorter. Which is all we need in this case. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	13 years ago

1 2 3

133 Commits (9096ee162b7e4e426b1719bb43f9e42e84c95816)