Few of us use git to compare or even version-control 2GB files,
but when we do, we'll want it to work.
Reading a recent patch, I noticed two lines like this:
int len = st.st_size;
Instead of "int", that should be "size_t". Otherwise, in the
non-symlink case, with 64-bit size_t, if the file's size is 2GB,
the following xmalloc will fail:
result = xmalloc(len + 1);
trying to allocate 2^64 - 2^31 + 1 bytes (assuming sign-extension
in the int-to-size_t promotion). And even if it didn't fail, the
subsequent "result[len] = 0;" would be equivalent to an unpleasant
"result[-2147483648] = 0;"
The other nearby "int"-declared size variable, sz, should also be of
type size_t, for the same reason. If sz ever wraps around and becomes
negative, xread will corrupt memory _before_ the "result" buffer.
Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
"git-diff-files --cc" to show conflicts during merge did not pass
the correct mode information for the working tree down, and showed
bogus combined diff.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Even when --unified=0 is given, the main loop to show the
combined textual diff needs to handle a line that is unchanged
but has lines that were deleted relative to a parent before it
(because that is where the lost lines hang). However, such a
line should not be emitted in the final output.
Signed-off-by: Junio C Hamano <junkio@cox.net>
"new file" and "deleted file" were already reported in the
original code, but the logic was not as transparent as it could
have. This uses a few variables and more comments to clarify
the flow. The rule is: (1) if a path exists in the merge result
when no parent had it, we report "new" (otherwise it came from
the parents, as opposed to have added by the evil merge). (2) if
the path does not exist in the merge result, it is "deleted".
Since we can say "new" and "deleted", there is no reason not to
follow the /dev/null convention. This fixes it.
Appending function name after @@@ ... @@@ is trivial, so
implement it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This abstracts away the size of the hash values when copying them
from memory location to memory location, much as the introduction
of hashcmp abstracted away hash value comparsion.
A few call sites were using char* rather than unsigned char* so
I added the cast rather than open hashcpy to be void*. This is a
reasonable tradeoff as most call sites already use unsigned char*
and the existing hashcmp is also declared to be unsigned char*.
[jc: Splitted the patch to "master" part, to be followed by a
patch for merge-recursive.c which is not in "master" yet.
Fixed the cast in the latter hunk to combine-diff.c which was
wrong in the original.
Also converted ones left-over in combine-diff.c, diff-lib.c and
upload-pack.c ]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Introduces global inline:
hashcmp(const unsigned char *sha1, const unsigned char *sha2)
Uses memcmp for comparison and returns the result based on the length of
the hash name (a future runtime decision).
Acked-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Replace sha1 comparisons to null_sha1 with a global inline (which previously an
unused static inline in builtin-apply.c)
[jc: with a fix from Jonas Fonseca.]
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
A patch from David Rientjes made me realize we do not have to have
this function -- just call diff_unmodified_pair() directly.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The only visible change is that git-blame doesn't understand
"--compability" anymore, but it does accept "--compatibility" instead,
which is already documented.
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
In diff_tree_combined(), show_log_first boolean is initialized with
rev->loginfo (pointer to a string); the intention is that if we have
some string to be emitted we would want to remember that fact. Picky
compilers are offended by this, so make the expression a bit type-safer.
Signed-off-by: Junio C Hamano <junkio@cox.net>
- combine_diff() took cnt (count) which is unsigned in nature but the
parameter type was declared as "int";
- find_next() took "uninteresting" parameter, which masked a static
function of the same name;
- show_parent_lno() took an unused parameter "cnt";
- show_patch_diff() used a local variable in nested inner scope with
the same name with different type, masking the one in the outer scope;
- the last loop in show_patch_diff iterated over lines so it should use
the local variable "lno"
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes various problems in the new diff options code.
- Fix --cc/-c --patch; it showed two-tree diff used internally.
- Use "---\n" only where it matters -- that is, use it
immediately after the commit log text when we show a
commit log and something else before the patch text.
- Do not output spurious extra "\n"; have an extra newline
after the commit log text always when we have diff output and
we are not doing oneline.
- When running a pickaxe you need to go recursive.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add msg_sep variable to struct diff_options. msg_sep is printed after
commit message. Default is "\n", format-patch sets it to "---\n".
This also removes the second argument from show_log() because all
callers derived it from the first argument:
show_log(rev, rev->loginfo, ...
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
DIFF_FORMAT_* are now bit-flags instead of enumerated values.
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We used to parse "-U" and "--unified" as part of the GIT_DIFF_OPTS
environment variable, but strangely enough we would _not_ parse them as
part of the normal diff command line (where we only accepted "-u").
This adds parsing of -U and --unified, both with an optional numeric
argument. So now you can just say
git diff --unified=5
to get a unified diff with a five-line context, instead of having to do
something silly like
GIT_DIFF_OPTS="--unified=5" git diff -u
(that silly format does continue to still work, of course).
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
"git diff(n)" without --base, --ours, etc. defaults to --cc,
which usually is the same as -p unless you are in the middle of
a conflicted merge, just like the shell script version.
"git diff(n) blobA blobB path" complains and dies.
"git diff(n) tree0 tree1 tree2...treeN" does combined diff that
shows a merge of tree1..treeN to result in tree0.
Giving "-c" option to any command that defaults to "--cc" turns
off dense-combined flag.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Asking for stat (either with --stat or --patch-with-stat) gives
you diffstat for the first parent, even under combine-diff.
While the combined patch is useful to highlight the complexity
and interaction of the parts touched by all branches when
reviewing a merge commit, diffstat is a tool to assess the
extent of damage the merge brings in, and showing stat with the
first parent is more sensible than clever per-parent diffstat.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Here's a further patch on top of the previous one with cosmetic
improvements (no "real" code changes, just trivial updates):
- it gets the "---" before a diffstat right, including for the combined
merge case. Righ now the logic is that we always use "---" when we have
a diffstat, and an empty line otherwise. That's how I visually prefer
it, but hey, it can be tweaked later.
- I made "diff --cc/combined" add the "---/+++" header lines too. The
thing won't be mistaken for a valid diff, since the "@@" lines have too
many "@" characters (three or more), but it just makes it visually
match a real diff, which at least to me makes a big difference in
readability. Without them, it just looks very "wrong".
I guess I should have taken the filename from each individual entry
(and had one "---" file per parent), but I didn't even bother to try to
see how that works, so this was the simple thing.
With this, doing a
git log --cc --patch-with-stat
looks quite readable, I think. The only nagging issue - as far as I'm
concerned - is that diffstats for merges are pretty questionable the way
they are done now. I suspect it would be better to just have the _first_
diffstat, and always make the merge diffstat be the one for "result
against first parent".
Signed-off-by: Junio C Hamano <junkio@cox.net>
On Sun, 16 Apr 2006, Junio C Hamano wrote:
>
> In the mid-term, I am hoping we can drop the generate_header()
> callchain _and_ the custom code that formats commit log in-core,
> found in cmd_log_wc().
Ok, this was nastier than expected, just because the dependencies between
the different log-printing stuff were absolutely _everywhere_, but here's
a patch that does exactly that.
The patch is not very easy to read, and the "--patch-with-stat" thing is
still broken (it does not call the "show_log()" thing properly for
merges). That's not a new bug. In the new world order it _should_ do
something like
if (rev->logopt)
show_log(rev, rev->logopt, "---\n");
but it doesn't. I haven't looked at the --with-stat logic, so I left it
alone.
That said, this patch removes more lines than it adds, and in particular,
the "cmd_log_wc()" loop is now a very clean:
while ((commit = get_revision(rev)) != NULL) {
log_tree_commit(rev, commit);
free(commit->buffer);
commit->buffer = NULL;
}
so it doesn't get much prettier than this. All the complexity is entirely
hidden in log-tree.c, and any code that needs to flush the log literally
just needs to do the "if (rev->logopt) show_log(...)" incantation.
I had to make the combined_diff() logic take a "struct rev_info" instead
of just a "struct diff_options", but that part is pretty clean.
This does change "git whatchanged" from using "diff-tree" as the commit
descriptor to "commit", and I changed one of the tests to reflect that new
reality. Otherwise everything still passes, and my other tests look fine
too.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Mod_type in particular sure looks like it wants to be used, but isn't.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The variable hunk_end points at a line number, which is
represented as unsigned long by all the other variables.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The previous round showed the delete-only hunks at the end, but
forgot to mark them interesting when they were.
Signed-off-by: Junio C Hamano <junkio@cox.net>
We used to lose hunks that appear at the end and have only
deletion. This makes sure that the record beyond the end of
file (which holds such deletions) is examined.
Signed-off-by: Junio C Hamano <junkio@cox.net>
More friendly for human reading I believe, and possibly friendlier to some
parsers (although only by an epsilon).
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This refactors the line-by-line callback mechanism used in
combine-diff so that other programs can reuse it more easily.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This replaces occurences of "blob", "commit", "tag", and "tree",
where they're really used as type specifiers, which we already
have defined global constants for.
Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Introduce tree-walk.[ch] and move "struct tree_desc" and
associated functions from various places.
Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and
move it to cache.h. This macro returns the canonicalized
st_mode value in the host byte order for files, symlinks and
directories -- to be compared with a tree_desc entry.
create_ce_mode(mode) in cache.h is similar but is intended to be
used for index entries (so it does not work for directories) and
returns the value in the network byte order.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Combined diffs don't null terminate things in the same way as standard
diffs. This is presumably wrong.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from 6baf0484ef commit)
For some reason, combined diffs don't honour the --full-index flag when
emitting patches. Fix this.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from e70c6b3574 commit)
Combined diffs don't null terminate things in the same way as standard
diffs. This is presumably wrong.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
For some reason, combined diffs don't honour the --full-index flag when
emitting patches. Fix this.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When showing a conflicted merge from index stages and working
tree file, we did not fetch the mode from the working tree,
and mistook that as a deleted file. Also if the manual
resolution (or automated resolution by git rerere) ended up
taking either parent's version, we did not show _anything_ for
that path. Either was quite bad and confusing.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This shows "new file mode XXXX" and "deleted file mode XXXX"
lines like two-way diff-patch output does, by checking the
status from each parent.
The diff-raw output for combined diff is made a bit uglier by
showing diff status letters with each parent. While most of the
case you would see "MM" in the output, an Evil Merge that
touches a path that was added by inheriting from one parent is
possible and it would be shown like these:
$ git-diff-tree --abbrev -c HEAD
2d7ca89675eb8888b0b88a91102f096d4471f09f
::000000 000000 100644 0000000... 0000000... 31dd686... AA b
::000000 100644 100644 0000000... 6c884ae... c6d4fa8... AM d
::100644 100644 100644 4f7cbe7... f8c295c... 19d5d80... RR e
Signed-off-by: Junio C Hamano <junkio@cox.net>
Earlier it did not grok the 0{40} SHA1 very well, but what it
needed to do was to find the shortest 0{N} that is not used as a
valid object name to be consistent with the way names of valid
objects are abbreviated. This makes some users simpler.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This way, diff-files can make use of it. Also implement the
full suite of what diff_flush_raw() supports just for
consistency. With this, 'diff-tree -c -r --name-status' would
show what is expected.
There is no way to get the historical output (useful for
debugging and low-level Plumbing work) anymore, so tentatively
it makes '-m' to mean "do not combine and show individual diffs
with parents".
diff-files matches diff-tree to produce raw output for -c. For
textual combined diff, use -p -c.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is needed to make "diff-tree -c -M" to work semi-sensibly.
Otherwise rename detection, pickaxe and friends would never be
invoked.
Signed-off-by: Junio C Hamano <junkio@cox.net>
NOTE! This makes "-c" be the default, which effectively means that merges
are never ignored any more, and "-m" is a no-op. So it changes semantics.
I would also like to make "--cc" the default if you do patches, but didn't
actually do that.
The raw output format is not wonderfully pretty, but it's distinguishable
from a "normal patch" in that a normal patch with just one parent has just
one colon at the beginning, while a multi-parent raw diff has <n> colons
for <n> parents.
So now, in the kernel, when you do
git-diff-tree cce0cac125623f9b68f25dd1350f6d616220a8dd
(to see the manual ARM merge that had a conflict in arch/arm/Kconfig), you
get
cce0cac125623f9b68f25dd1350f6d616220a8dd
::100644 100644 100644 4a63a8e2e45247a11c068c6ed66c6e7aba29ddd9 77eee38762d69d3de95ae45dd9278df9b8225e2c 2f61726d2f4b636f6e66696700dbf71a59dad287 arch/arm/Kconfig
ie you see two colons (two parents), then three modes (parent modes
followed by result mode), then three sha1s (parent sha1s followed by
result sha1).
Which is pretty close to the normal raw diff output.
Cool/stupid exercise:
$ git-whatchanged | grep '^::' | cut -f2- | sort |
uniq -c | sort -n | less -S
will show which files have needed the most file-level merge conflict
resolution. Useful? Probably not. But kind of interesting.
For the kernel, it's
....
10 arch/ia64/Kconfig
11 drivers/scsi/Kconfig
12 drivers/net/Makefile
17 include/linux/libata.h
18 include/linux/pci_ids.h
23 drivers/net/Kconfig
24 drivers/scsi/libata-scsi.c
28 drivers/scsi/libata-core.c
43 MAINTAINERS
Signed-off-by: Junio C Hamano <junkio@cox.net>
When we remove a file, the parents' contents are all removed so
it is not that interesting to show all of them, but the fact it
was removed when all parents had it *is* unusual. When we add a
file, similarly the fact it was added when no parent wanted it
*is* unusual, and in addition the result matters, so show it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When we run combined diff from working tree (diff-files --cc),
we sent NULL to printf that is returned by find_unique_abbrev().
Signed-off-by: Junio C Hamano <junkio@cox.net>