xdiff_set_find_func() is used to set user defined regular expressions
for finding function signatures. Add xdiff_clear_find_func(), which
frees the memory allocated by the former, making the API complete.
Also, use the new function in diff.c (the only call site of
xdiff_set_find_func()) to clean up after ourselves.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These variables were unused and can be removed safely:
builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote
builtin-fetch-pack.c::find_common(): len
builtin-remote.c::mv(): symref
diff.c::show_stats():show_stats(): total
diffcore-break.c::should_break(): base_size
fast-import.c::validate_raw_date(): date, sign
fsck.c::fsck_tree(): o_sha1, sha1
xdiff-interface.c::parse_num(): read_some
Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Based on a patch by Brian Downing, this uses the xdiff emit_func feature
to implement xdi_diff_hunks(). It's a function that calls a callback for
each hunk of a diff, passing its lengths.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
POSIX doth sayeth:
"In the regular expression processing described in IEEE Std 1003.1-2001,
the <newline> is regarded as an ordinary character and both a period and
a non-matching list can match one. ... Those utilities (like grep) that
do not allow <newline>s to match are responsible for eliminating any
<newline> from strings before matching against the RE."
Thus far git has not been removing the trailing newline from strings matched
against regular expression patterns. This has the effect that (quoting
Jonathan del Strother) "... a line containing just 'FUNCNAME' (terminated by
a newline) will be matched by the pattern '^(FUNCNAME.$)' but not
'^(FUNCNAME$)'", and more simply not '^FUNCNAME$'.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
POSIX doth sayeth:
"In the regular expression processing described in IEEE Std 1003.1-2001,
the <newline> is regarded as an ordinary character and both a period and
a non-matching list can match one. ... Those utilities (like grep) that
do not allow <newline>s to match are responsible for eliminating any
<newline> from strings before matching against the RE."
Thus far git has not been removing the trailing newline from strings matched
against regular expression patterns. This has the effect that (quoting
Jonathan del Strother) "... a line containing just 'FUNCNAME' (terminated by
a newline) will be matched by the pattern '^(FUNCNAME.$)' but not
'^(FUNCNAME$)'", and more simply not '^FUNCNAME$'.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When multiple regular expressions are concatenated with "\n", they were
traditionally AND'ed together, and only a line that matches _all_ of them
is taken as a match. This however is unwieldy when multiple regexp
feature is used to specify alternatives.
This fixes the semantics to take the first match. A nagative pattern, if
matches, makes the line to fail as before. A match with a positive
pattern will be the final match, and what it captures in $1 is used as the
hunk header comment.
We could write alternatives using "|" in ERE, but the machinery can only
use captured $1 as the hunk header comment (or $0 if there is no match in
$1), so you cannot write:
"junk ( A | B ) | garbage ( C | D )"
and expect both "junk" and "garbage" to get stripped with the existing
code. With this fix, you can write it as:
"junk ( A | B ) \n garbage ( C | D )"
and the way capture works would match the user expectation more
naturally.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is in preparation for allowing extended regular expression patterns.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This teaches "git merge-file" to honor merge.conflictstyle configuration
variable, whose value can be "merge" (default) or "diff3".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This further enhances xdi_diff_outf() interface so that it takes two
common parameters: the callback function that processes one line at a
time, and a pointer to its application specific callback data structure.
xdi_diff_outf() creates its own "xdiff_emit_state" structure and stashes
these two away inside it, which is used by the lowest level output
function in the xdiff_outf() callchain, consume_one(), to call back to the
application layer. With this restructuring, we lift the requirement that
the caller supplied callback data structure embeds xdiff_emit_state
structure as its first member.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continually xreallocing and freeing the remainder member of struct
xdiff_emit_state was a noticeable performance hit. Use a strbuf
instead.
This yields a decent performance improvement on "git blame" on certain
repositories. For example, before this commit:
$ time git blame -M -C -C -p --incremental server.c >/dev/null
101.52user 0.17system 1:41.73elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+39561minor)pagefaults 0swaps
With this commit:
$ time git blame -M -C -C -p --incremental server.c >/dev/null
80.38user 0.30system 1:20.81elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+50979minor)pagefaults 0swaps
Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To prepare for the need to initialize and release resources for an
xdi_diff with the xdiff_outf output function, make a new function to
wrap this usage.
Old:
ecb.outf = xdiff_outf;
ecb.priv = &state;
...
xdi_diff(file_p, file_o, &xpp, &xecfg, &ecb);
New:
xdi_diff_outf(file_p, file_o, &state.xm, &xpp, &xecfg, &ecb);
Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier, it would error out while trying to read and/or writing them.
Now, calling merge-file with empty files is neither interesting nor
useful, but it is a bug that needed fixing.
Noticed by Clemens Buchacher.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This change removes all obvious useless if-before-free tests.
E.g., it replaces code like this:
if (some_expression)
free (some_expression);
with the now-equivalent:
free (some_expression);
It is equivalent not just because POSIX has required free(NULL)
to work for a long time, but simply because it has worked for
so long that no reasonable porting target fails the test.
Here's some evidence from nearly 1.5 years ago:
http://www.winehq.org/pipermail/wine-patches/2006-October/031544.html
FYI, the change below was prepared by running the following:
git ls-files -z | xargs -0 \
perl -0x3b -pi -e \
's/\bif\s*\(\s*(\S+?)(?:\s*!=\s*NULL)?\s*\)\s+(free\s*\(\s*\1\s*\))/$2/s'
Note however, that it doesn't handle brace-enclosed blocks like
"if (x) { free (x); }". But that's ok, since there were none like
that in git sources.
Beware: if you do use the above snippet, note that it can
produce syntactically invalid C code. That happens when the
affected "if"-statement has a matching "else".
E.g., it would transform this
if (x)
free (x);
else
foo ();
into this:
free (x);
else
foo ();
There were none of those here, either.
If you're interested in automating detection of the useless
tests, you might like the useless-if-before-free script in gnulib:
[it *does* detect brace-enclosed free statements, and has a --name=S
option to make it detect free-like functions with different names]
http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=build-aux/useless-if-before-free
Addendum:
Remove one more (in imap-send.c), spotted by Jean-Luc Herren <jlh@gmx.ch>.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tar-ball and the git archive itself is fine, but yes, the diff from
2.6.23 to 2.6.24-rc6 is bad. It's the "trim_common_tail()" optimization
that has caused way too much pain.
Very interesting breakage. The patch was actually "correct" in a (rather
limited) technical sense, but the context at the end was missing because
while the trim_common_tail() code made sure to keep enough common context
to allow a valid diff to be generated, the diff machinery itself could
decide that it could generate the diff differently than the "obvious"
solution.
Thee sad fact is that the git optimization (which is very important for
"git blame", which needs no context), is only really valid for that one
case where we really don't need any context.
[jc: since this is shared with "git diff -U0" codepath, context recovery
to the end of line needs to be done even for zero context case.]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We need to be extra careful recovering the removed common section, so
that we do not break context nor the changed incomplete line (i.e. the
last line that does not end with LF).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The recovered context lines were not LF terminated due to off-by-one
error, which also caused the outer loop to count the number of recovered
lines to terminate after running only once.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Inside xdiff library, the number of context lines is represented in
long, not int.
Noticed by Peter Baumann.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This implements earlier Linus's optimization to trim common lines at the
end before passing them down to low level xdiff interface for all of our
xdiff users.
We could later enhance this to also trim common leading lines, but that
would need tweaking the output function to add the number of lines
trimmed at the beginning to line numbers that appear in the hunk
headers.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This inserts a new function xdi_diff() that currently does not
do anything other than calling the underlying xdl_diff() to the
callchain of current callers of xdl_diff() function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes"diff -p" hunk headers customizable via gitattributes mechanism.
It is based on Johannes's earlier patch that allowed to define a single
regexp to be used for everything.
The mechanism to arrive at the regexp that is used to define hunk header
is the same as other use of gitattributes. You assign an attribute, funcname
(because "diff -p" typically uses the name of the function the patch is about
as the hunk header), a simple string value. This can be one of the names of
built-in pattern (currently, "java" is defined) or a custom pattern name, to
be looked up from the configuration file.
(in .gitattributes)
*.java funcname=java
*.perl funcname=perl
(in .git/config)
[funcname]
java = ... # ugly and complicated regexp to override the built-in one.
perl = ... # another ugly and complicated regexp to define a new one.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have two instances where we want to determine if a buffer
contains binary data as opposed to text.
[jc: cherry-picked 6bfce93e from 'master']
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time. There are a few files that need
to have trailing whitespaces (most notably, test vectors). The results
still passes the test, and build result in Documentation/ area is unchanged.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have two instances where we want to determine if a buffer
contains binary data as opposed to text.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4.
This implies that we are able to access and work on files whose
maximum length is around 2^63-1 bytes, but we can only malloc or
mmap somewhat less than 2^32-1 bytes of memory.
On such a system an implicit conversion of off_t to size_t can cause
the size_t to wrap, resulting in unexpected and exciting behavior.
Right now we are working around all gcc warnings generated by the
-Wshorten-64-to-32 option by passing the off_t through xsize_t().
In the future we should make xsize_t on such problematic platforms
detect the wrapping and die if such a file is accessed.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
read_file() was a useful function if you want to work with the xdiff stuff,
so it was renamed and put into a more central place.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Change places that use realloc, without a proper error path, to instead use
xrealloc. Drop an erroneous error path in the daemon code that used errno
in the die message in favour of the simpler xrealloc.
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This refactors the line-by-line callback mechanism used in
combine-diff so that other programs can reuse it more easily.
Signed-off-by: Junio C Hamano <junkio@cox.net>