Commit Graph

146 Commits (b2fe06572226a6056776e3a4e5cd4ad273e60c08)

Author SHA1 Message Date
Linus Torvalds d1f2d7e8ca Make run_diff_index() use unpack_trees(), not read_tree()
A plain "git commit" would still run lstat() a lot more than necessary,
because wt_status_print() would cause the index to be repeatedly flushed
and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit
to be lost, resulting in the files in the index being lstat'ed three
times each.

The reason why wt-status.c ended up invalidating and re-reading the
cache multiple times was that it uses "run_diff_index()", which in turn
uses "read_tree()" to populate the index with *both* the old index and
the tree we want to compare against.

So this patch re-writes run_diff_index() to not use read_tree(), but
instead use "unpack_trees()" to diff the index to a tree.  That, in
turn, means that we don't need to modify the index itself, which then
means that we don't need to invalidate it and re-read it!

This, together with the lstat() optimizations, means that "git commit"
on the kernel tree really only needs to lstat() the index entries once.
That noticeably cuts down on the cached timings.

Best time before:

	[torvalds@woody linux]$ time git commit > /dev/null
	real    0m0.399s
	user    0m0.232s
	sys     0m0.164s

Best time after:

	[torvalds@woody linux]$ time git commit > /dev/null
	real    0m0.254s
	user    0m0.140s
	sys     0m0.112s

so it's a noticeable improvement in addition to being a nice conceptual
cleanup (it's really not that pretty that "run_diff_index()" dirties the
index!)

Doing an "strace -c" on it also shows that as it cuts the number of
lstat() calls by two thirds, it goes from being lstat()-limited to being
limited by getdents() (which is the readdir system call):

Before:
	% time     seconds  usecs/call     calls    errors syscall
	------ ----------- ----------- --------- --------- ----------------
	 60.69    0.000704           0     69230        31 lstat
	 23.62    0.000274           0      5522           getdents
	  8.36    0.000097           0      5508      2638 open
	  2.59    0.000030           0      2869           close
	  2.50    0.000029           0       274           write
	  1.47    0.000017           0      2844           fstat

After:
	% time     seconds  usecs/call     calls    errors syscall
	------ ----------- ----------- --------- --------- ----------------
	 45.17    0.000276           0      5522           getdents
	 26.51    0.000162           0     23112        31 lstat
	 19.80    0.000121           0      5503      2638 open
	  4.91    0.000030           0      2864           close
	  1.48    0.000020           0       274           write
	  1.34    0.000018           0      2844           fstat
	...

It passes the test-suite for me, but this is another of one of those
really core functions, and certainly pretty subtle, so..

NOTE! The Linux lstat() system call is really quite cheap when everything
is cached, so the fact that this is quite noticeable on Linux is likely to
mean that it is *much* more noticeable on other operating systems. I bet
you'll see a much bigger performance improvement from this on Windows in
particular.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 13:05:27 -08:00
Linus Torvalds 7a51ed66f6 Make on-disk index representation separate from in-core one
This converts the index explicitly on read and write to its on-disk
format, allowing the in-core format to contain more flags, and be
simpler.

In particular, the in-core format is now host-endian (as opposed to the
on-disk one that is network endian in order to be able to be shared
across machines) and as a result we can dispense with all the
htonl/ntohl on accesses to the cache_entry fields.

This will make it easier to make use of various temporary flags that do
not exist in the on-disk format.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Steffen Prohaska ecf4831d89 Use is_absolute_path() in diff-lib.c, lockfile.c, setup.c, trace.c
Using the helper function to test for absolute paths makes porting easier.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-26 12:32:05 -08:00
Junio C Hamano e6cb314c08 Merge branch 'ph/diffopts'
* ph/diffopts:
  Reorder diff_opt_parse options more logically per topics.
  Make the diff_options bitfields be an unsigned with explicit masks.
  Use OPT_BIT in builtin-pack-refs
  Use OPT_BIT in builtin-for-each-ref
  Use OPT_SET_INT and OPT_BIT in builtin-branch
  parse-options new features.
2007-11-18 15:50:16 -08:00
Pierre Habouzit 8f67f8aefb Make the diff_options bitfields be an unsigned with explicit masks.
reverse_diff was a bit-value in disguise, it's merged in the flags now.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-11 16:54:15 -08:00
Junio C Hamano fb63d7f889 git-add: make the entry stat-clean after re-adding the same contents
Earlier in commit 0781b8a9b2
(add_file_to_index: skip rehashing if the cached stat already
matches), add_file_to_index() were taught not to re-add the path
if it already matches the index.

The change meant well, but was not executed quite right.  It
used ie_modified() to see if the file on the work tree is really
different from the index, and skipped adding the contents if the
function says "not modified".

This was wrong.  There are three possible comparison results
between the index and the file in the work tree:

 - with lstat(2) we _know_ they are different.  E.g. if the
   length or the owner in the cached stat information is
   different from the length we just obtained from lstat(2), we
   can tell the file is modified without looking at the actual
   contents.

 - with lstat(2) we _know_ they are the same.  The same length,
   the same owner, the same everything (but this has a twist, as
   described below).

 - we cannot tell from lstat(2) information alone and need to go
   to the filesystem to actually compare.

The last case arises from what we call 'racy git' situation,
that can be caused with this sequence:

    $ echo hello >file
    $ git add file
    $ echo aeiou >file ;# the same length

If the second "echo" is done within the same filesystem
timestamp granularity as the first "echo", then the timestamp
recorded by "git add" and the timestamp we get from lstat(2)
will be the same, and we can mistakenly say the file is not
modified.  The path is called 'racily clean'.  We need to
reliably detect racily clean paths are in fact modified.

To solve this problem, when we write out the index, we mark the
index entry that has the same timestamp as the index file itself
(that is the time from the point of view of the filesystem) to
tell any later code that does the lstat(2) comparison not to
trust the cached stat info, and ie_modified() then actually goes
to the filesystem to compare the contents for such a path.

That's all good, but it should not be used for this "git add"
optimization, as the goal of "git add" is to actually update the
path in the index and make it stat-clean.  With the false
optimization, we did _not_ cause any data loss (after all, what
we failed to do was only to update the cached stat information),
but it made the following sequence leave the file stat dirty:

    $ echo hello >file
    $ git add file
    $ echo hello >file ;# the same contents
    $ git add file

The solution is not to use ie_modified() which goes to the
filesystem to see if it is really clean, but instead use
ie_match_stat() with "assume racily clean paths are dirty"
option, to force re-adding of such a path.

There was another problem with "git add -u".  The codepath
shares the same issue when adding the paths that are found to be
modified, but in addition, it asked "git diff-files" machinery
run_diff_files() function (which is "git diff-files") to list
the paths that are modified.  But "git diff-files" machinery
uses the same ie_modified() call so that it does not report
racily clean _and_ actually clean paths as modified, which is
not what we want.

The patch allows the callers of run_diff_files() to pass the
same "assume racily clean paths are dirty" option, and makes
"git-add -u" codepath to use that option, to discover and re-add
racily clean _and_ actually clean paths.

We could further optimize on top of this patch to differentiate
the case where the path really needs re-adding (i.e. the content
of the racily clean entry was indeed different) and the case
where only the cached stat information needs to be refreshed
(i.e. the racily clean entry was actually clean), but I do not
think it is worth it.

This patch applies to maint and all the way up.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10 00:37:39 -08:00
Junio C Hamano 4bd5b7dacc ce_match_stat, run_diff_files: use symbolic constants for readability
ce_match_stat() can be told:

 (1) to ignore CE_VALID bit (used under "assume unchanged" mode)
     and perform the stat comparison anyway;

 (2) not to perform the contents comparison for racily clean
     entries and report mismatch of cached stat information;

using its "option" parameter.  Give them symbolic constants.

Similarly, run_diff_files() can be told not to report anything
on removed paths.  Also give it a symbolic constant for that.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10 00:24:51 -08:00
Junio C Hamano b78281f721 diff --no-index: do not forget to run diff_setup_done()
Code inspection by Linus found this.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-14 12:12:32 -07:00
René Scharfe 6d2d9e8666 diff: squelch empty diffs even more
When we compare two non-tracked files, or explicitly
specify --no-index, the suggestion to run git-status
is not helpful.

The patch adds a new diff_options bitfield member, no_index, that
is used instead of the special value of -2 of the rev_info field
max_count to indicate that the index is not to be used.  This makes
it possible to pass that flag down to diffcore_skip_stat_unmatch(),
which only has one diff_options parameter.

This could even become a cleanup if we removed all assignments of
max_count to a value of -2 (viz. replacement of a magic value with
a self-documenting field name) but I didn't dare to do that so late
in the rc game..

The no_index bit, if set, then tells diffcore_skip_stat_unmatch()
to not account for any skipped stat-mismatches, which avoids the
suggestion to run git-status.

Signed-off-by: Rene Scharfe <rene.scharfe@lsfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-14 22:34:58 -07:00
René Scharfe 4d3f4b80e4 diff-lib.c: don't strdup twice
The static function read_directory in diff-lib.c is only ever called
with struct path_list lists with .strdup_paths turned on, i.e.
path_list_insert will strdup the paths for us (again).  Let's take
advantage of that and stop doing it twice.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-07 11:53:49 -07:00
Junio C Hamano a6080a0a44 War on whitespace
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time.  There are a few files that need
to have trailing whitespaces (most notably, test vectors).  The results
still passes the test, and build result in Documentation/ area is unchanged.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07 00:04:01 -07:00
Junio C Hamano afb5b6a24b Merge branch 'lt/gitlink'
* lt/gitlink:
  Tests for core subproject support
  Expose subprojects as special files to "git diff" machinery
  Fix some "git ls-files -o" fallout from gitlinks
  Teach "git-read-tree -u" to check out submodules as a directory
  Teach git list-objects logic to not follow gitlinks
  Fix gitlink index entry filesystem matching
  Teach "git-read-tree -u" to check out submodules as a directory
  Teach git list-objects logic not to follow gitlinks
  Don't show gitlink directories when we want "other" files
  Teach git-update-index about gitlinks
  Teach directory traversal about subprojects
  Fix thinko in subproject entry sorting
  Teach core object handling functions about gitlinks
  Teach "fsck" not to follow subproject links
  Add "S_IFDIRLNK" file mode infrastructure for git links
  Add 'resolve_gitlink_ref()' helper function
  Avoid overflowing name buffer in deep directory structures
  diff-lib: use ce_mode_from_stat() rather than messing with modes manually
2007-04-21 17:21:10 -07:00
Junio C Hamano 1ad029b6a1 Do not default to --no-index when given two directories.
git-diff -- a/ b/ always defaulted to --no-index, primarily
because the function is_in_index() was implemented quite
incorrectly.

Noticed by Patrick Maaß and Simon Schubert independently,
initial patch was provided by Patrick but I fixed it
differently.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-13 19:34:35 -07:00
Linus Torvalds 844c11ae25 diff-lib: use ce_mode_from_stat() rather than messing with modes manually
The diff helpers used to do the magic mode canonicalization and all the
other special mode handling by hand ("trust executable bit" and "has
symlink support" handling).

That's bogus. Use "ce_mode_from_stat()" that does this all for us.

This is also going to be required when we add support for links to other
git repositories.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-09 22:30:04 -07:00
Junio C Hamano 822cac0155 Teach --quiet to diff backends.
This teaches git-diff-files, git-diff-index and git-diff-tree
backends to exit early under --quiet option.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-14 16:21:19 -07:00
Alex Riesen 41bbf9d585 Allow git-diff exit with codes similar to diff(1)
This introduces a new command-line option: --exit-code. The diff
programs will return 1 for differences, return 0 for equality, and
something else for errors.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-14 16:21:19 -07:00
Junio C Hamano cf6981d493 Merge branch 'js/diff-ni'
* js/diff-ni:
  Get rid of the dependency to GNU diff in the tests
  diff --no-index: support /dev/null as filename
  diff-ni: fix the diff with standard input
  diff: support reading a file from stdin via "-"
2007-03-10 23:26:33 -08:00
Junio C Hamano e6f9511343 Merge branch 'js/symlink'
* js/symlink:
  Tell multi-parent diff about core.symlinks.
  Handle core.symlinks=false case in merge-recursive.
  Add core.symlinks to mark filesystems that do not support symbolic links.
2007-03-04 17:31:09 -08:00
Johannes Schindelin 0c725f1bd9 diff --no-index: support /dev/null as filename
This allows us to create "new file" and "delete file" patches.
It also cleans up the code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-04 00:20:31 -08:00
Junio C Hamano 3afaa72d7d diff-ni: fix the diff with standard input
The earlier commit to read from stdin was full of problems, and
this corrects them.

 - The mode bits should have been set to satisify S_ISREG(); we
   forgot to the S_IFREG bits and hardcoded 0644;
 - We did not give escape hatch to name a path whose name is
   really "-".  Allow users to say "./-" for that;
 - Use of xread() was not prepared to see short read (e.g. reading
   from tty) nor handing read errors.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-04 00:17:27 -08:00
Johannes Schindelin 5332b2af10 diff: support reading a file from stdin via "-"
This allows you to say

	echo Hello World | git diff x -

to compare the contents of file "x" with the line "Hello World".
This automatically switches to --no-index mode.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-03 23:45:47 -08:00
Junio C Hamano ae792aa52b diff-ni: allow running from a subdirectory.
When run from a subdirectory of a repository, the command forgot
to adjust paths given to it with prefix.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-03 23:45:14 -08:00
Johannes Sixt 78a8d641c1 Add core.symlinks to mark filesystems that do not support symbolic links.
Some file systems that can host git repositories and their working copies
do not support symbolic links. But then if the repository contains a symbolic
link, it is impossible to check out the working copy.

This patch enables partial support of symbolic links so that it is possible
to check out a working copy on such a file system.  A new flag
core.symlinks (which is true by default) can be set to false to indicate
that the filesystem does not support symbolic links. In this case, symbolic
links that exist in the trees are checked out as small plain files, and
checking in modifications of these files preserve the symlink property in
the database (as long as an entry exists in the index).

Of course, this does not magically make symbolic links work on such defective
file systems; hence, this solution does not help if the working copy relies
on that an entry is a real symbolic link.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-02 16:58:05 -08:00
Johannes Schindelin fcfa33ec90 diff: make more cases implicit --no-index
When specifying an absolute path, or a relative path pointing outside
the working tree, do not fail, but roll your own diffopt parsing,
and execute a --no-index diff.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-28 16:32:31 -08:00
Johannes Schindelin 34a5e1a2d9 diff --no-index: also imitate the exit status of diff(1)
diff sets the exit status to 0 when no changes were found, to 1
when changes were found, and 2 means error.

We imitate this to be able to use "git diff" in the test scripts.
(Actually, keeping in line with the rest of git, -1 is returned
on error, which corresponds to an exit status 255).

To find out if the diff is not empty, a member called
"found_changes" was introduced in struct diff_options, which is
set in builtin_diff() and fn_out_consume().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-26 01:20:55 -08:00
Junio C Hamano 048f48a2fd Merge branch 'master' into js/diff-ni
* master: (201 commits)
  Documentation: link in 1.5.0.2 material to the top documentation page.
  Documentation: document remote.<name>.tagopt
  GIT 1.5.0.2
  git-remote: support remotes with a dot in the name
  Documentation: describe "-f/-t/-m" options to "git-remote add"
  diff --cc: fix display of symlink conflicts during a merge.
  merge-recursive: fix longstanding bug in merging symlinks
  merge-index: fix longstanding bug in merging symlinks
  diff --cached: give more sensible error message when HEAD is yet to be created.
  Update tests to use test-chmtime
  Add test-chmtime: a utility to change mtime on files
  Add Release Notes to prepare for 1.5.0.2
  Allow arbitrary number of arguments to git-pack-objects
  rerere: do not deal with symlinks.
  rerere: do not skip two conflicted paths next to each other.
  Don't modify CREDITS-FILE if it hasn't changed.
  diff-patch: Avoid emitting double-slashes in textual patch.
  Reword git-am 3-way fallback failure message.
  Limit filename for format-patch
  core.legacyheaders: Use the description used in RelNotes-1.5.0
  ...
2007-02-26 01:20:42 -08:00
Junio C Hamano 4fc970c438 diff --cc: fix display of symlink conflicts during a merge.
"git-diff-files --cc" to show conflicts during merge did not pass
the correct mode information for the working tree down, and showed
bogus combined diff.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-25 22:25:30 -08:00
Johannes Schindelin 646b329961 Fix typo: do not show name1 when name2 fails
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-25 14:42:23 -08:00
Junio C Hamano efdfd6c8d4 Evil Merge branch 'jc/status' (early part) into js/diff-ni
* 'jc/status' (early part):
  run_diff_{files,index}(): update calling convention.
  update-index: do not die too early in a read-only repository.
  git-status: do not be totally useless in a read-only repository.

This is to resolve semantic conflict (which is not textual) that
changes the calling convention of run_diff_files() early.
2007-02-24 02:20:13 -08:00
Johannes Schindelin d516c2d119 Teach git-diff-files the new option `--no-index`
With this flag and given two paths, git-diff-files behaves as a GNU diff
lookalike (plus the git goodies like --check, colour, etc.).  This flag
is also available in git-diff.  It also works outside of a git repository.

In addition, if git-diff{,-files} is called without revision or stage
parameter, and with exactly two paths at least one of which is not tracked,
the default is --no-index.

So, you can now say

	git diff /etc/inittab /etc/fstab

and it actually works!

This also unifies the duplicated argument parsing between cmd_diff_files()
and builtin_diff_files().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-22 20:59:55 -08:00
Junio C Hamano b4e1e4a787 run_diff_{files,index}(): update calling convention.
They used to open and read index themselves, but they now expect
their callers to do so.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-22 02:02:15 -08:00
Junio C Hamano 185c975faa Do not take mode bits from index after type change.
When we do not trust executable bit from lstat(2), we copied
existing ce_mode bits without checking if the filesystem object
is a regular file (which is the only thing we apply the "trust
executable bit" business) nor if the blob in the index is a
regular file (otherwise, we should do the same as registering a
new regular file, which is to default non-executable).

Noticed by Johannes Sixt.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-16 22:56:06 -08:00
Junio C Hamano 1cfe77333f git-blame: no rev means start from the working tree file.
Warning: this changes the semantics.

This makes "git blame" without any positive rev to start digging
from the working tree copy, which is made into a fake commit
whose sole parent is the HEAD.

It also adds --contents <file> option to pretend as if the
working tree copy has the contents of the named file.  You can
use '-' to make the command read from the standard input.

If you want the command to start annotating from the HEAD
commit, you need to explicitly give HEAD parameter.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-05 14:55:11 -08:00
Junio C Hamano e9c8409900 diff-index --cached --raw: show tree entry on the LHS for unmerged entries.
This updates the way diffcore represents an unmerged pair
somewhat.  It used to be that entries with mode=0 on both sides
were used to represent an unmerged pair, but now it has an
explicit flag.  This is to allow diff-index --cached to report
the entry from the tree when the path is unmerged in the index.

This is used in updating "git reset <tree> -- <path>" to restore
absense of the path in the index from the tree.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-06 22:57:42 -08:00
Paul Mackerras cb2b9f5ee1 diff-index --cc shows a 3-way diff between HEAD, index and working tree.
This implements a 3-way diff between the HEAD commit, the state in the
index, and the working directory.  This is like the n-way diff for a
merge, and uses much of the same code.  It is invoked with the -c flag
to git-diff-index, which it already accepted and did nothing with.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-05 00:13:25 -07:00
Junio C Hamano a8e0d16d85 Convert memset(hash,0,20) to hashclr(hash).
In the same spirit as hashcmp() and hashcpy().

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-23 13:57:23 -07:00
Shawn Pearce e702496e43 Convert memcpy(a,b,20) to hashcpy(a,b).
This abstracts away the size of the hash values when copying them
from memory location to memory location, much as the introduction
of hashcmp abstracted away hash value comparsion.

A few call sites were using char* rather than unsigned char* so
I added the cast rather than open hashcpy to be void*.  This is a
reasonable tradeoff as most call sites already use unsigned char*
and the existing hashcmp is also declared to be unsigned char*.

[jc: Splitted the patch to "master" part, to be followed by a
 patch for merge-recursive.c which is not in "master" yet.

 Fixed the cast in the latter hunk to combine-diff.c which was
 wrong in the original.

 Also converted ones left-over in combine-diff.c, diff-lib.c and
 upload-pack.c ]

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-23 13:53:10 -07:00
David Rientjes a89fccd281 Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length.
Introduces global inline:

	hashcmp(const unsigned char *sha1, const unsigned char *sha2)

Uses memcmp for comparison and returns the result based on the length of
the hash name (a future runtime decision).

Acked-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-17 14:23:53 -07:00
Junio C Hamano b19beecd94 Merge branch 'lt/objlist' into next
* lt/objlist:
  Add "named object array" concept
  xdiff: minor changes to match libxdiff-0.21
  fix rfc2047 formatter.
  Fix t8001-annotate and t8002-blame for ActiveState Perl
  Add specialized object allocator
2006-06-19 18:47:29 -07:00
Linus Torvalds 1f1e895fcc Add "named object array" concept
We've had this notion of a "object_list" for a long time, which eventually
grew a "name" member because some users (notably git-rev-list) wanted to
name each object as it is generated.

That object_list is great for some things, but it isn't all that wonderful
for others, and the "name" member is generally not used by everybody.

This patch splits the users of the object_list array up into two: the
traditional list users, who want the list-like format, and who don't
actually use or want the name. And another class of users that really used
the list as an extensible array, and generally wanted to name the objects.

The patch is fairly straightforward, but it's also biggish. Most of it
really just cleans things up: switching the revision parsing and listing
over to the array makes things like the builtin-diff usage much simpler
(we now see exactly how many members the array has, and we don't get the
objects reversed from the order they were on the command line).

One of the main reasons for doing this at all is that the malloc overhead
of the simple object list was actually pretty high, and the array is just
a lot denser. So this patch brings down memory usage by git-rev-list by
just under 3% (on top of all the other memory use optimizations) on the
mozilla archive.

It does add more lines than it removes, and more importantly, it adds a
whole new infrastructure for maintaining lists of objects, but on the
other hand, the new dynamic array code is pretty obvious. The change to
builtin-diff-tree.c shows a fairly good example of why an array interface
is sometimes more natural, and just much simpler for everybody.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-19 18:45:48 -07:00
Florian Forster b4b1550315 Don't instantiate structures with FAMs.
Since structures with `flexible array members' are an incomplete datatype ANSI
C99 forbids creating instances of them. This patch removes such an instance
from `diff-lib.c' and replaces it with a pointer to a `struct
combine_diff_path'. Since all neccessary memory is allocated at once the number
of calls to `xmalloc' is not increased.

Signed-off-by: Florian Forster <octo@verplant.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-18 21:19:09 -07:00
Junio C Hamano 5c21ac0e7c Libified diff-index: backward compatibility fix.
"diff-index -m" does not mean "do not ignore merges", but means
"pretend missing files match the index".

The previous round tried to address this, but failed because
setup_revisions() ate "-m" flag before the caller had a chance
to intervene.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-22 04:03:32 -07:00
Junio C Hamano e09ad6e1e3 Libify diff-index.
The second installment to libify diff brothers.  The pathname
arguments are checked more strictly than before because we now
use the revision.c::setup_revisions() infrastructure.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-22 02:43:00 -07:00
Junio C Hamano 6973dcaee7 Libify diff-files.
This is the first installment to libify diff brothers.

The updated diff-files uses revision.c::setup_revisions()
infrastructure to parse its command line arguments, which means
the pathname arguments are checked more strictly than before.
The tests are adjusted to separate possibly missing paths from
the rest of arguments with double-dashes, to show the kosher
way.

As Linus pointed out, renaming diff.c to diff-lib.c was simply
stupid, so I am renaming it back.  The new diff-lib.c is to
contain pieces extracted from diff brothers.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-22 02:37:45 -07:00
Junio C Hamano 44aad15f0d diff --stat: do not drop rename information.
When a verbatim rename or copy is detected, we did not show
anything on the "diff --stat" for the filepair.  This makes it
to show the rename information.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-19 23:22:19 -07:00
Junio C Hamano ba580aeafb diff: move diff.c to diff-lib.c to make room.
Now I am not doing any real "git-diff in C" yet, but this would
help before doing so.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-19 15:38:14 -07:00