This introduces the diff-core, the layer between the diff-tree
family and the external diff interface engine. The interface calls
that the diff-tree family uses (diff_change and diff_addremove)
have not changed and will not change. The purpose of the
diff-core layer is to provide an infrastructure to transform the
set of differences sent from the applications, before sending
them to the external diff interface.
The recently introduced rename detection code has been rewritten
to use the diff-core facility. When applications send in
separate creates and deletes, matching ones are transformed into
a single rename-and-edit diff, and sent out to the external diff
interface as such.
This patch also enhances the rename detection code further to be
able to detect copies. Currently this happens only as long as
copy sources appear as part of the modified files, but there
already is enough provision for callers to report unmodified
files to diff-core, so that they can also be used as copy source
candidates. Extending the callers this way will be done in a
separate patch.
Please see and marvel at how well this works by trying out the
newly added t/t4003-diff-rename-1.sh test script.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The version of wc I have (GNU textutils-2.1) puts spaces at the beginning
of lines. This patch should work for any version of wc.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Acked-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds the ability to actually create delta objects using a new tool:
git-mkdelta. It uses an ordered list of potential objects to deltafy
against earlier objects in the list. A cap on the depth of delta
references can be provided as well, otherwise the default is to not have
any limit. A limit of 0 will also undeltafy any given object.
Also provided is the beginning of a script to deltafy an entire
repository.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds knowledge of delta objects to fsck-cache and various object
parsing code. A new switch to git-fsck-cache is provided to display the
maximum delta depth found in a repository.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This makes the core code aware of delta objects and undeltafies them as
needed. The convention is to use read_sha1_file() to have
undeltafication done automatically (most users do that already so this
is transparent).
If the delta object itself has to be accessed then it must be done
through map_sha1_file() and unpack_sha1_file().
In that context mktag.c has been switched to read_sha1_file() as there
is no reason to do the full map+unpack manually.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix various things that sparse complains about:
- use NULL instead of 0
- make sure we declare everything properly, or mark it static
- use proper function declarations ("fn(void)" instead of "fn()")
Sparse is always right.
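
As an illustration of the three kinds of fixes listed above, here is a
minimal sketch. The function and variable names are made up for the
example and do not come from the patch.

```c
#include <stddef.h>

/* File-local helper: marked static, so it needs no external declaration. */
static int count_entries(const char **list)
{
	int n = 0;

	/* Compare pointers against NULL rather than a bare 0. */
	while (list != NULL && list[n] != NULL)
		n++;
	return n;
}

/* "(void)" declares a function taking no arguments; "()" leaves them unspecified. */
int example_entry_count(void);

int example_entry_count(void)
{
	/* Hypothetical data, terminated by a NULL pointer. */
	static const char *names[] = { "diff-tree", "diff-cache", NULL };

	return count_entries(names);
}
```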
Instead of swapping the arguments just before output, this patch
makes the swapping happen on the input side of the diff core,
when "reverse-diff" is in effect. This greatly simplifies the
logic, but more importantly it is necessary for upcoming "copy
detection" work.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The same check we added earlier to update-cache to catch ENOTDIR
turns out to be missing from diff-files. This causes a
difference to go unreported when you have DF/DF (a file in a
subdirectory) in the cache and DF is a file on the filesystem.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This one compares two pathnames that may be partial basenames, not
full paths. We need to get the path sorting right, since a directory
name will sort as if it had the final '/' at the end.
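
The sorting rule can be sketched as follows. This is only an
illustration of the idea, not the function from the patch:
name_compare_sketch() and its is_dir flags are hypothetical names.

```c
#include <string.h>

/*
 * Hypothetical helper: compare two entry names, where is_dir1/is_dir2
 * say whether each name refers to a directory. Returns <0, 0 or >0.
 */
static int name_compare_sketch(const char *name1, int is_dir1,
			       const char *name2, int is_dir2)
{
	size_t len1 = strlen(name1), len2 = strlen(name2);
	size_t len = len1 < len2 ? len1 : len2;
	unsigned char c1, c2;
	int cmp = memcmp(name1, name2, len);

	if (cmp)
		return cmp;
	/*
	 * One name is a prefix of the other (or they are equal).
	 * Compare the next byte, pretending a directory name ends in '/'.
	 */
	c1 = (unsigned char)name1[len];
	c2 = (unsigned char)name2[len];
	if (!c1 && is_dir1)
		c1 = '/';
	if (!c2 && is_dir2)
		c2 = '/';
	return (c1 < c2) ? -1 : (c1 > c2) ? 1 : 0;
}
```

With this rule a directory "foo" sorts after a file "foo.c" (because
'/' is greater than '.'), while a regular file "foo" sorts before it.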
Add '-R' flag to diff-tree, and change the test subdirectory
shell files to be executable (something that Junio couldn't
get me to do through the pure patch with my current patch
handling infrastructure).
This cleans up the way calls are made into the diff core from diff-tree
family and diff-helper. Earlier, these programs had "if
(generating_patch)" sprinkled all over the place, but that ugliness is
gone; the diff core now handles this uniformly, even when not generating
patch format.
This also allowed diff-cache and diff-files to acquire a -R
(reverse) option to generate diffs in reverse. Users of
diff-tree can swap two trees easily so I did not add -R there.
[ Linus' note: I'll add -R to "diff-tree" too, since a "commit
diff" doesn't have another tree to switch around: the other
tree is always the parent(s) of the commit ]
Also -M<digits-as-mantissa> suggestion made by Linus has been
implemented.
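
One way to read "digits as mantissa" is sketched below, assuming the
digits are taken as the fraction after a decimal point (so -M5 means
0.5 and -M25 means 0.25). The function name and the SKETCH_MAX_SCORE
scale are assumptions for illustration, not the names used in the
patch.

```c
#include <string.h>

#define SKETCH_MAX_SCORE 10000	/* hypothetical internal scale for 1.0 */

/*
 * Returns the scaled score for "-M<digits>", 0 when no usable digits
 * follow "-M" (the caller would then fall back to its default), and
 * -1 when the argument is not a -M option at all.
 */
static int parse_rename_score_sketch(const char *opt)
{
	size_t diglen, i;
	long long num = 0, scale = 1;

	if (strncmp(opt, "-M", 2) != 0)
		return -1;
	diglen = strspn(opt + 2, "0123456789");
	/* Reject empty, overlong, or trailing-garbage digit strings. */
	if (diglen == 0 || diglen > 9 || opt[2 + diglen] != '\0')
		return 0;
	for (i = 0; i < diglen; i++) {
		num = num * 10 + (opt[2 + i] - '0');
		scale *= 10;
	}
	/* The user said num/scale; internally that is MAX_SCORE * num / scale. */
	return (int)(SKETCH_MAX_SCORE * num / scale);
}
```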
Documentation updates are also included.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fixes all in-code names that were left behind during the "big name change".
Signed-off-by: Alexey Nezhdanov <snake@penza-gsm.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A bit of clean-up of diff.c which fixes up some comments and removes a
memory leak.
This also re-introduces the rename score debugging fprintf(), but leaves
it #ifdef'ed out for normal use.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This moves the git manpage to man7, since "git" isn't a direct command
per se. It also does two other things:
* Sort of works around the asciidoc 6.0.3 bug where the manpages all
get called "git.1". It just renames them to what they should have
been called.
* Fixes a cut-n-paste bug in git-diff-helper.txt that was making
asciidoc choke.
With -u flag, git-checkout-cache picks up the stat information
from newly created file and updates the cache. This removes the
need to run git-update-cache --refresh immediately after running
git-checkout-cache.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This rips out the rename detection engine from diff-helper and moves it
to the diff core, and updates the internal calling convention used by
diff-tree family into the diff core. In order to give the same option
name to diff-tree family as well as to diff-helper, I've changed the
earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the
natural abbreviation 'r' for 'rename' is already taken for 'recursive').
Although I did a fair amount of testing with git-diff-tree on
existing rename commits in the core GIT repository, this should still be
considered a beta (preview) release. This patch depends on the diff-delta
infrastructure just committed.
This implements almost everything I wanted to see in this series of
patches, except a few minor cleanups in the calling convention into diff
core, but that will be a separate cleanup patch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds the basic library functions to create and replay delta
information. Also included is a test-delta utility to validate the
code.
diff-delta was based on LibXDiff written by Davide Libenzi.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This test would have caught the strbuf eof condition gotcha,
hopefully fixed with my previous patch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I just remembered why I placed that bogus "sb->len ==0 implies
sb->eof" condition there. We need at least something like this
to catch the normal EOF (that is, line termination immediately
followed by EOF) case. "if (feof(fp))" fires when we have
already read the eof, not when we are about to read it.
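
The feof() behaviour described here can be observed with a small
standalone sketch. It does not use the strbuf code; feof_demo() is a
made-up helper that works on a scratch file from tmpfile().

```c
#include <stdio.h>

/*
 * Writes "line\n" to a scratch file, reads back exactly those bytes,
 * and records feof() before and after one further read attempt.
 * Returns 0 on success, -1 if the scratch file could not be set up.
 */
static int feof_demo(int *before, int *after)
{
	static const char text[] = "line\n";
	FILE *fp = tmpfile();
	size_t i;

	if (!fp)
		return -1;
	if (fputs(text, fp) == EOF) {
		fclose(fp);
		return -1;
	}
	rewind(fp);
	for (i = 0; i < sizeof(text) - 1; i++) {
		if (fgetc(fp) == EOF) {
			fclose(fp);
			return -1;
		}
	}
	*before = feof(fp) != 0;	/* every byte consumed, EOF not yet seen */
	(void)fgetc(fp);		/* this is the read that actually hits EOF */
	*after = feof(fp) != 0;
	fclose(fp);
	return 0;
}
```

Here *before should come out as 0 and *after as 1: the end-of-file
indicator is set only once a read has run into the end of the stream.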
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This normally doesn't matter, but if you have a filename that is
sometimes a directory and sometimes a regular file (or symlink),
we don't want the regular file case to trigger a "partial match".
We used to trigger the "interesting subdirectory" check for any
matching name that started with the same character series, regardless
of whether it had the matching slash or not.
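
The stricter check amounts to something like the following sketch.
in_directory_sketch() is a hypothetical name used only to show the
idea; it is not the code from the patch.

```c
#include <string.h>

/* Hypothetical check: does "path" lie inside the directory named "dir"? */
static int in_directory_sketch(const char *dir, const char *path)
{
	size_t len = strlen(dir);

	/* The path must start with the directory name... */
	if (strncmp(dir, path, len) != 0)
		return 0;
	/* ...and the very next byte must be the separating slash. */
	return path[len] == '/';
}
```

With this, "foo/bar" is inside "foo", but "foobar" and a regular file
named "foo" are not.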
We can't just do the "sha1_to_hex()" thing directly, since the
buffer in question will be overwritten by the name of the parent.
So teach diff_tree_commit() to generate the proper hex name itself.
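
The underlying problem is the usual one with a converter that returns
a pointer into a single static buffer. A minimal sketch, using made-up
stand-ins (to_hex_static() and to_hex_copy()) rather than the real
functions:

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_RAWSZ 20	/* raw hash size assumed for the example */

/* Stand-in for a converter that hands back one shared static buffer. */
static const char *to_hex_static(const unsigned char *raw)
{
	static char buf[2 * SKETCH_RAWSZ + 1];
	int i;

	for (i = 0; i < SKETCH_RAWSZ; i++)
		sprintf(buf + 2 * i, "%02x", raw[i]);
	return buf;
}

/* Copy the hex name into caller-owned storage so later calls cannot clobber it. */
static void to_hex_copy(const unsigned char *raw, char out[2 * SKETCH_RAWSZ + 1])
{
	strcpy(out, to_hex_static(raw));
}
```

A pointer kept from the first to_hex_static() call shows the second
object's name after the next call, while the copy made by
to_hex_copy() is unaffected.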
We use "--" to mark end of command line switches, not "-". Also,
allow more flexibility in the passed-in sha1 names, in that a
single sha1 uses the "commit-diff" logic that compares against
its parent(s).
This patch adds a framework and a stub implementation of rename
detection to diff-helper program.
The current stub code is just enough to detect pure renames in
diff-tree output and nothing fancier. The plan is perhaps to use
the same delta code for similarity evaluation once Nico's delta
storage patch is merged.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
explicit references for reachability analysis.
We already had that as separate logic in git-prune-script, so this
is not a new special case - it's an old special case moved into
fsck, making normal usage be much simpler.
This implements the output format suggested by Linus in
<Pine.LNX.4.58.0505161556260.18337@ppc970.osdl.org>, except the
imaginary diff option is spelled "diff --git" with double dashes as
suggested by Matthias Urlichs.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Raw hashes should be unsigned char.
- String functions want signed char.
- Hash and compress functions want unsigned char.
Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The documentation of the test harness still refers to old
numbering and also contains an obvious typo.
Also "make test" should be run after making sure we have built
all binaries, since test is designed to test the newly built
ones.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Petr Baudis <pasky@ucw.cz>
xmalloc() and xrealloc() now take their sizes as size_t-type arguments.
Introduced complementary xcalloc().
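
For readers unfamiliar with the x*alloc() pattern, here is a rough
sketch of what such wrappers look like with size_t arguments. The
_sketch names, the error messages and the exit code are assumptions
for illustration, not the code from this patch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error exit used by the wrappers below. */
static void die_sketch(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	exit(128);
}

static void *xmalloc_sketch(size_t size)
{
	void *ret = malloc(size);

	/* malloc(0) may legitimately return NULL, so only fail for size > 0. */
	if (!ret && size != 0)
		die_sketch("Out of memory, malloc failed");
	return ret;
}

static void *xrealloc_sketch(void *ptr, size_t size)
{
	void *ret = realloc(ptr, size);

	if (!ret && size != 0)
		die_sketch("Out of memory, realloc failed");
	return ret;
}

static void *xcalloc_sketch(size_t nmemb, size_t size)
{
	void *ret = calloc(nmemb, size);

	if (!ret && nmemb != 0 && size != 0)
		die_sketch("Out of memory, calloc failed");
	return ret;
}
```

Callers can then use the returned pointer without checking it for
NULL, as long as they asked for a non-zero size.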
Signed-off-by: Brad Roberts <braddr@puremagic.com>
Signed-off-by: Petr Baudis <pasky@ucw.cz>