kernel/git - git - PowerEL Git System

Commit Graph

Author	SHA1	Message	Date
Junio C Hamano	b65982b608	Optimize "diff-index --cached" using cache-tree When running "diff-index --cached" after making a change to only a small portion of the index, there is no point unpacking unchanged subtrees into the index recursively, only to find that all entries match anyway. Tweak unpack_trees() logic that is used to read in the tree object to catch the case where the tree entry we are looking at matches the index as a whole by looking at the cache-tree. As an exercise, after modifying a few paths in the kernel tree, here are a few numbers on my Athlon 64X2 3800+: (without patch, hot cache) $ /usr/bin/time git diff --cached --raw :100644 100644 b57e1f5... e69de29... M Makefile :100644 000000 8c86b72... 0000000... D arch/x86/Makefile :000000 100644 0000000... e69de29... A arche 0.07user 0.02system 0:00.09elapsed 102%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+9407minor)pagefaults 0swaps (with patch, hot cache) $ /usr/bin/time ../git.git/git-diff --cached --raw :100644 100644 b57e1f5... e69de29... M Makefile :100644 000000 8c86b72... 0000000... D arch/x86/Makefile :000000 100644 0000000... e69de29... A arche 0.02user 0.00system 0:00.02elapsed 103%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+2446minor)pagefaults 0swaps Cold cache numbers are very impressive, but it does not matter very much in practice: (without patch, cold cache) $ su root sh -c 'echo 3 >/proc/sys/vm/drop_caches' $ /usr/bin/time git diff --cached --raw :100644 100644 b57e1f5... e69de29... M Makefile :100644 000000 8c86b72... 0000000... D arch/x86/Makefile :000000 100644 0000000... e69de29... A arche 0.06user 0.17system 0:10.26elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k 247032inputs+0outputs (1172major+8237minor)pagefaults 0swaps (with patch, cold cache) $ su root sh -c 'echo 3 >/proc/sys/vm/drop_caches' $ /usr/bin/time ../git.git/git-diff --cached --raw :100644 100644 b57e1f5... e69de29... M Makefile :100644 000000 8c86b72... 0000000... D arch/x86/Makefile :000000 100644 0000000... e69de29... A arche 0.02user 0.01system 0:01.01elapsed 3%CPU (0avgtext+0avgdata 0maxresident)k 18440inputs+0outputs (79major+2369minor)pagefaults 0swaps This of course helps "git status" as well. (without patch, hot cache) $ /usr/bin/time ../git.git/git-status >/dev/null 0.17user 0.18system 0:00.35elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+5336outputs (0major+10970minor)pagefaults 0swaps (with patch, hot cache) $ /usr/bin/time ../git.git/git-status >/dev/null 0.10user 0.16system 0:00.27elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+5336outputs (0major+3921minor)pagefaults 0swaps Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Junio C Hamano	b87fc96476	cache-tree.c::cache_tree_find(): simplify internal API Earlier cache_tree_find() needs to be called with a valid cache_tree, but repeated look-up may find an invalid or missing cache_tree in between. Help simplify the callers by returning NULL to mean "nothing appropriate found" when the input is NULL. Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Junio C Hamano	d11b8d3425	write-tree --ignore-cache-tree This allows you to discard the cache-tree information before writing the tree out of the index (i.e. it always recomputes the tree object names for all the subtrees). This is only useful as a debug option, so I did not bother documenting it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Junio C Hamano	b9d37a5420	Move prime_cache_tree() to cache-tree.c The interface to build cache-tree belongs there. Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Junio C Hamano	331fcb598e	git add --intent-to-add: do not let an empty blob be committed by accident Writing a tree out of an index with an "intent to add" entry is blocked. This implies that you cannot "git commit" from such a state; however you can still do "git commit -a" or "git commit $that_path". Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Nanako Shiraishi	7ba04d9f37	cache-tree.c: make cache_tree_find() static This function is not used by any other file. Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Junio C Hamano	edae5f0c20	write-tree: properly detect failure to write tree objects Tomasz Fortuna reported that "git commit" does not error out properly when it cannot write tree objects out. "git write-tree" shares the same issue, as the failure to notice the error is deep in the logic to write tree objects out recursively. Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Junio C Hamano	45525bd022	Make error messages from cherry-pick/revert more sensible The original "rewrite in C" did somewhat a sloppy job while stealing code from git-write-tree. The caller pretends as if the write_tree() function would return an error code and being able to issue a sensible error message itself, but write_tree() function just calls die() and never returns an error. Worse yet, the function claims that it was running git-write-tree (which is no longer true after cherry-pick stole it). Tested-by: Björn Steinbrink <B.Steinbrink@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Linus Torvalds	7a51ed66f6	Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	17 years ago
Pierre Habouzit	1dffb8fa80	Small cache_tree_write refactor. This function cannot fail, make it void. Also make write_one act on a const char* instead of a char*. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	18 years ago
Pierre Habouzit	ba3ed09728	Now that cache.h needs strbuf.h, remove useless includes. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	18 years ago
Pierre Habouzit	f1696ee398	Strbuf API extensions and fixes. * Add strbuf_rtrim to remove trailing spaces. * Add strbuf_insert to insert data at a given position. * Off-by one fix in strbuf_addf: strbuf_avail() does not counts the final \0 so the overflow test for snprintf is the strict comparison. This is not critical as the growth mechanism chosen will always allocate _more_ memory than asked, so the second test will not fail. It's some kind of miracle though. * Add size extension hints for strbuf_init and strbuf_read. If 0, default applies, else: + initial buffer has the given size for strbuf_init. + first growth checks it has at least this size rather than the default 8192. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	18 years ago
Pierre Habouzit	5242bcbb63	Use strbuf API in cache-tree.c Should even be marginally faster. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	18 years ago
Junio C Hamano	63daae47e5	Two trivial -Wcast-qual fixes Luiz Fernando N. Capitulino noticed the one in tree-walk.h where we cast away constness while computing the legnth of a tree entry. Signed-off-by: Junio C Hamano <gitster@pobox.com>	18 years ago
Martin Waitz	302b9282c9	rename dirlink to gitlink. Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' \| xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Linus Torvalds	f35a6d3bce	Teach core object handling functions about gitlinks This teaches the really fundamental core SHA1 object handling routines about gitlinks. We can compare trees with gitlinks in them (although we can not actually generate patches for them yet - just raw git diffs), and they show up as commits in "git ls-tree". We also know to compare gitlinks as if they were directories (ie the normal "sort as trees" rules apply). [jc: amended a cut&paste error] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Johannes Sixt	3d12d0cfbb	Catch errors when writing an index that contains invalid objects. If git-write-index is called without --missing-ok, it reports invalid objects that it finds in the index. But without this patch it dies right away or may run into an infinite loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	4903161fb8	Surround "#define DEBUG 0" with "#ifndef DEBUG..#endif" Otherwise "make CFLAGS=-DDEBUG=1" is cumbersome to run. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Rene Scharfe	abdc3fc842	Add hash_sha1_file() Most callers of write_sha1_file_prepare() are only interested in the resulting hash but don't care about the returned file name or the header. This patch adds a simple wrapper named hash_sha1_file() which does just that, and converts potential callers. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Shawn Pearce	e702496e43	Convert memcpy(a,b,20) to hashcpy(a,b). This abstracts away the size of the hash values when copying them from memory location to memory location, much as the introduction of hashcmp abstracted away hash value comparsion. A few call sites were using char* rather than unsigned char* so I added the cast rather than open hashcpy to be void. This is a reasonable tradeoff as most call sites already use unsigned char and the existing hashcmp is also declared to be unsigned char*. [jc: Splitted the patch to "master" part, to be followed by a patch for merge-recursive.c which is not in "master" yet. Fixed the cast in the latter hunk to combine-diff.c which was wrong in the original. Also converted ones left-over in combine-diff.c, diff-lib.c and upload-pack.c ] Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	19 years ago
Junio C Hamano	00703e6d68	cache-tree: a bit more debugging support. Signed-off-by: Junio C Hamano <junkio@cox.net>	19 years ago
Junio C Hamano	6bd20358a9	write-tree: --prefix=<path> The "bind" commit can express an aggregation of multiple projects into a single commit. In such an organization, there would be one project, root of whose tree object is at the same level of the root of the aggregated projects, and other projects have their toplevel in separate subdirectories. Let's call that root level project the "primary project", and call other ones just "subprojects". You would first read-tree the primary project, and then graft the subprojects under their appropriate location using read-tree --prefix=<subdir>/ repeatedly. To write out a tree object from such an index for a subproject, write-tree --prefix=<subdir>/ is used. Signed-off-by: Junio C Hamano <junkio@cox.net>	19 years ago
Johannes Schindelin	0111ea38cb	cache-tree: replace a sscanf() by two strtol() calls On one of my systems, sscanf() first calls strlen() on the buffer. But this buffer is not terminated by NUL. So git crashed. strtol() does not share that problem, as it stops reading after the first non-digit. [jc: original patch was wrong and did not read the cache-tree structure correctly; this has been fixed up and tested minimally with fsck-objects. ] Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>	19 years ago
Junio C Hamano	7bc70a590d	cache-tree.c: typefix Signed-off-by: Junio C Hamano <junkio@cox.net>	19 years ago
Junio C Hamano	2956dd3bd7	cache_tree_update: give an option to update cache-tree only. When the extra "dryrun" parameter is true, cache_tree_update() recomputes the invalid entry but does not actually creates new tree object. Signed-off-by: Junio C Hamano <junkio@cox.net>	19 years ago
Junio C Hamano	7927a55d5b	read-tree: teach 1-way merege and plain read to prime cache-tree. This teaches read-tree to fully populate valid cache-tree when reading a tree from scratch, or reading a single tree into an existing index, reusing only the cached stat information (i.e. one-way merge). We have already taught update-index about cache-tree, so "git checkout" followed by updates to a few path followed by a "git commit" would become very efficient. Signed-off-by: Junio C Hamano <junkio@cox.net>	19 years ago
Junio C Hamano	61fa30972c	cache-tree: sort the subtree entries. Not that this makes practical performance difference; the kernel tree for example has 200 or so directories that have subdirectory, and the largest ones have 57 of them (fs and drivers). With a test to apply 600 patches with git-apply and git-write-tree, this did not make more than one per-cent of a difference, but it is a good cleanup. Signed-off-by: Junio C Hamano <junkio@cox.net>	19 years ago
Junio C Hamano	bad68ec924	index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net>	19 years ago
Junio C Hamano	dd0c34c46b	cache-tree: protect against "git prune". We reused the cache-tree data without verifying the tree object still exists. Recompute in cache_tree_update() an otherwise valid cache-tree entry when the tree object disappeared. This is not usually a problem, but theoretically without this fix things can break when the user does something like this: - read-index from a side branch - write-tree the result - remove the side branch with "git branch -D" - remove the unreachable objects with "git prune" - write-tree what is in the index. Signed-off-by: Junio C Hamano <junkio@cox.net>	19 years ago
Junio C Hamano	749864627c	Add cache-tree. The cache_tree data structure is to cache tree object names that would result from the current index file. The idea is to have an optional file to record each tree object name that corresponds to a directory path in the cache when we run write_cache(), and read it back when we run read_cache(). During various index manupulations, we selectively invalidate the parts so that the next write-tree can bypass regenerating tree objects for unchanged parts of the directory hierarchy. We could perhaps make the cache-tree data an optional part of the index file, but that would involve the index format updates, so unless we need it for performance reasons, the current plan is to use a separate file, $GIT_DIR/index.aux to store this information and link it with the index file with the checksum that is already used for index file integrity check. Signed-off-by: Junio C Hamano <junkio@cox.net>	19 years ago

33 Commits (ed24e401e0e6ab860475b8575e28a2c6ea99cc69)