kernel/git - git - PowerEL Git System

Commit Graph

Author	SHA1	Message	Date
Jens Lehmann	3bfc450476	git status: ignoring untracked files must apply to submodules too Since 1.7.0 submodules are considered dirty when they contain untracked files. But when git status is called with the "-uno" option, the user asked to ignore untracked files, so they must be ignored in submodules too. To achieve this, the new flag DIFF_OPT_IGNORE_UNTRACKED_IN_SUBMODULES is introduced. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Jens Lehmann	85adbf2f75	git status: Fix false positive "new commits" output for dirty submodules Testing if the output "new commits" should appear in the long format of "git status" is done by comparing the hashes of the diffpair. This always resulted in printing "new commits" for submodules that contained untracked or modified content, even if they did not contain new commits. The reason was that match_stat_with_submodule() did set the "changed" flag for dirty submodules, resulting in two->sha1 being set to the null_sha1 at the call sites, which indicates that new commits are present. This is changed so that when no new commits are present, the same object names are in the sha1 field for both sides of the filepair, and the working tree side will have the "dirty_submodule" flag set when appropriate. For a submodule to be seen as modified even when it just has a dirty work tree, some conditions had to be extended to also check for the "dirty_submodule" flag. Unfortunately the test case that should have found this bug had been changed incorrectly too. It is fixed and extended to test for other combinations too. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Jens Lehmann	ae6d5c1b6f	Refactor dirty submodule detection in diff-lib.c Moving duplicated code into the new function match_stat_with_submodule(). Replacing the implicit activation of detailed checks for the dirtiness of submodules when DIFF_FORMAT_PATCH was selected with explicitly setting the recently added DIFF_OPT_DIRTY_SUBMODULES option in diff_setup_done(). Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Junio C Hamano	32962c9bd5	revision: introduce setup_revision_opt So far the last parameter to setup_revisions() was to specify the default ref when the command line did not give any (typically "HEAD"). This changes it to take a pointer to a structure so that we can add other information without touching too many codepaths in later patches. There is no functionality change. Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Jens Lehmann	9297f77e6d	git status: Show detailed dirty status of submodules in long format Since 1.7.0 there are three reasons a submodule is considered modified against the work tree: It contains new commits, modified content or untracked content. Lets show all reasons in the long format of git status, so the user can better asses the nature of the modification. This change does not affect the short and porcelain formats. Two new members are added to "struct wt_status_change_data" to store the information gathered by run_diff_files(). wt-status.c uses the new flag DIFF_OPT_DIRTY_SUBMODULES to tell diff-lib.c it wants to get detailed dirty information about submodules. A hint line for submodules is printed in the dirty header when dirty submodules are present. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Jens Lehmann	c7e1a73641	git diff --submodule: Show detailed dirty status of submodules When encountering a dirty submodule while doing "git diff --submodule" print an extra line for new untracked content and another for modified but already tracked content. And if the HEAD of the submodule is equal to the ref diffed against in the superproject, drop the output which would just show the same SHA1s and no commit message headlines. To achieve that, the dirty_submodule bitfield is expanded to two bits. The output of "git status" inside the submodule is parsed to set the according bits. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Jens Lehmann	4d34477f4c	git diff: Don't test submodule dirtiness with --ignore-submodules The diff family suppresses the output of submodule changes when requested but checks them nonetheless. But since recently submodules get examined for their dirtiness, which is rather expensive. There is no need to do that when the --ignore-submodules option is used, as the gathered information is never used anyway. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Junio C Hamano	125fd98434	Make ce_uptodate() trustworthy again The rule has always been that a cache entry that is ce_uptodate(ce) means that we already have checked the work tree entity and we know there is no change in the work tree compared to the index, and nobody should have to double check. Note that false ce_uptodate(ce) does not mean it is known to be dirty---it only means we don't know if it is clean. There are a few codepaths (refresh-index and preload-index are among them) that mark a cache entry as up-to-date based solely on the return value from ie_match_stat(); this function uses lstat() to see if the work tree entity has been touched, and for a submodule entry, if its HEAD points at the same commit as the commit recorded in the index of the superproject (a submodule that is not even cloned is considered clean). A submodule is no longer considered unmodified merely because its HEAD matches the index of the superproject these days, in order to prevent people from forgetting to commit in the submodule and updating the superproject index with the new submodule commit, before commiting the state in the superproject. However, the patch to do so didn't update the codepath that marks cache entries up-to-date based on the updated definition and instead worked it around by saying "we don't trust the return value of ce_uptodate() for submodules." This makes ce_uptodate() trustworthy again by not marking submodule entries up-to-date. The next step _could_ be to introduce a few "in-core" flag bits to cache_entry structure to record "this entry is _known_ to be dirty", call is_submodule_modified() from ie_match_stat(), and use these new bits to avoid running this rather expensive check more than once, but that can be a separate patch. Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Jens Lehmann	e3d42c4773	Performance optimization for detection of modified submodules In the worst case is_submodule_modified() got called three times for each submodule. The information we got from scanning the whole submodule tree the first time can be reused instead. New parameters have been added to diff_change() and diff_addremove(), the information is stored in a new member of struct diff_filespec. Its value is then reused instead of calling is_submodule_modified() again. When no explicit "-dirty" is needed in the output the call to is_submodule_modified() is not necessary when the submodules HEAD already disagrees with the ref of the superproject, as this alone marks it as modified. To achieve that, get_stat_data() got an extra argument. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Jens Lehmann	ee6fc514f2	Show submodules as modified when they contain a dirty work tree Until now a submodule only then showed up as modified in the supermodule when the last commit in the submodule differed from the one in the index or the diffed against commit of the superproject. A dirty work tree containing new untracked or modified files in a submodule was undetectable when looking at it from the superproject. Now git status and git diff (against the work tree) in the superproject will also display submodules as modified when they contain untracked or modified files, even if the compared ref matches the HEAD of the submodule. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Junio C Hamano	730f72840c	unpack-trees.c: look ahead in the index This makes the traversal of index be in sync with the tree traversal. When unpack_callback() is fed a set of tree entries from trees, it inspects the name of the entry and checks if the an index entry with the same name could be hiding behind the current index entry, and (1) if the name appears in the index as a leaf node, it is also fed to the n_way_merge() callback function; (2) if the name is a directory in the index, i.e. there are entries in that are underneath it, then nothing is fed to the n_way_merge() callback function; (3) otherwise, if the name comes before the first eligible entry in the index, the index entry is first unpacked alone. When traverse_trees_recursive() descends into a subdirectory, the cache_bottom pointer is moved to walk index entries within that directory. All of these are omitted for diff-index, which does not even want to be fed an index entry and a tree entry with D/F conflicts. This fixes 3-way read-tree and exposes a bug in other parts of the system in t6035, test #5. The test prepares these three trees: O = HEAD^ 100644 blob `e69de29bb2` a/b-2/c/d 100644 blob `e69de29bb2` a/b/c/d 100644 blob `e69de29bb2` a/x A = HEAD 100644 blob `e69de29bb2` a/b-2/c/d 100644 blob `e69de29bb2` a/b/c/d 100644 blob 587be6b4c3f93f93c489c0111bba5596147a26cb a/x B = master 120000 blob a36b77384451ea1de7bd340ffca868249626bc52 a/b 100644 blob `e69de29bb2` a/b-2/c/d 100644 blob `e69de29bb2` a/x With a clean index that matches HEAD, running git read-tree -m -u --aggressive $O $A $B now yields 120000 a36b77384451ea1de7bd340ffca868249626bc52 3 a/b 100644 `e69de29bb2` 0 a/b-2/c/d 100644 `e69de29bb2` 1 a/b/c/d 100644 `e69de29bb2` 2 a/b/c/d 100644 587be6b4c3f93f93c489c0111bba5596147a26cb 0 a/x which is correct. "master" created "a/b" symlink that did not exist, and removed "a/b/c/d" while HEAD did not do touch either path. Before this series, read-tree did not notice the situation and resolved addition of "a/b" and removal of "a/b/c/d" independently. If A = HEAD had another path "a/b/c/e" added, this merge should conflict but instead it silently resolved "a/b" and then immediately overwrote it to add "a/b/c/e", which was quite bogus. Tests in t1012 start to work with this. Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Junio C Hamano	da165f470e	unpack-trees.c: prepare for looking ahead in the index This prepares but does not yet implement a look-ahead in the index entries when traverse-trees.c decides to give us tree entries in an order that does not match what is in the index. A case where a look-ahead in the index is necessary happens when merging branch B into branch A while the index matches the current branch A, using a tree O as their common ancestor, and these three trees looks like this: O A B t t t-i t-i t-i t-j t-j t/1 t/2 The traverse_trees() function gets "t", "t-i" and "t" from trees O, A and B first, and notices that A may have a matching "t" behind "t-i" and "t-j" (indeed it does), and tells A to give that entry instead. After unpacking blob "t" from tree B (as it hasn't changed since O in B and A removed it, it will result in its removal), it descends into directory "t/". The side that walked index in parallel to the tree traversal used to be implemented with one pointer, o->pos, that points at the next index entry to be processed. When this happens, the pointer o->pos still points at "t-i" that is the first entry. We should be able to skip "t-i" and "t-j" and locate "t/1" from the index while the recursive invocation of traverse_trees() walks and match entries found there, and later come back to process "t-i". While that look-ahead is not implemented yet, this adds a flag bit, CE_UNPACKED, to mark the entries in the index that has already been processed. o->pos pointer has been renamed to o->cache_bottom and it points at the first entry that may still need to be processed. Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Junio C Hamano	da8ba5e7da	diff-lib.c: fix misleading comments on oneway_diff() `20a16eb` (unpack_trees(): fix diff-index regression., 2008-03-10) adjusted diff-index to the new world order since `34110cd` (Make 'unpack_trees()' have a separate source and destination index, 2008-03-06). Callbacks are expected to return anything non-negative as "success", and instead of reporting how many index entries they have processed, they are expected to advance o->pos themselves. The code did so, but a stale comment was left behind. Signed-off-by: Junio C Hamano <gitster@pobox.com>	15 years ago
Nguyễn Thái Ngọc Duy	b4d1690df1	Teach Git to respect skip-worktree bit (reading part) grep: turn on --cached for files that is marked skip-worktree ls-files: do not check for deleted file that is marked skip-worktree update-index: ignore update request if it's skip-worktree, while still allows removing diff*: skip worktree version Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Nguyễn Thái Ngọc Duy	540e694b13	Prevent diff machinery from examining assume-unchanged entries on worktree Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Junio C Hamano	26da1d7867	diff-index: keep the original index intact When comparing the index and a tree, we used to read the contents of the tree into stage #1 of the index and compared them with stage #0. In order not to lose sight of entries originally unmerged in the index, we hoisted them to stage #3 before reading the tree. Commit `d1f2d7e` (Make run_diff_index() use unpack_trees(), not read_tree(), 2008-01-19) changed all this. These days, we instead use unpack_trees() API to traverse the tree and compare the contents with the index, without modifying the index at all. There is no reason to hoist the unmerged entries to stage #3 anymore. Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Junio C Hamano	29796c6ccf	diff-index: report unmerged new entries Since an earlier change to diff-index by `d1f2d7e` (Make run_diff_index() use unpack_trees(), not read_tree(), 2008-01-19), we stopped reporting an unmerged path that does not exist in the tree, but we should. Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Junio C Hamano	90b1994170	diff: Rename QUIET internal option to QUICK The option "QUIET" primarily meant "find if we have _any_ difference as quick as possible and report", which means we often do not even have to look at blobs if we know the trees are different by looking at the higher level (e.g. "diff-tree A B"). As a side effect, because there is no point showing one change that we happened to have found first, it also enables NO_OUTPUT and EXIT_WITH_STATUS options, making the end result look quiet. Rename the internal option to QUICK to reflect this better; it also makes grepping the source tree much easier, as there are other kinds of QUIET option everywhere. Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Junio C Hamano	a0919ced8a	Avoid "diff-index --cached" optimization under --find-copies-harder When find-copies-harder is in effect, the diff frontends are expected to feed all paths, not just changed paths, to the diffcore, so that copy sources can be picked up. In such a case, not descending into subtrees using the cache-tree information is simply wrong. Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Junio C Hamano	b65982b608	Optimize "diff-index --cached" using cache-tree When running "diff-index --cached" after making a change to only a small portion of the index, there is no point unpacking unchanged subtrees into the index recursively, only to find that all entries match anyway. Tweak unpack_trees() logic that is used to read in the tree object to catch the case where the tree entry we are looking at matches the index as a whole by looking at the cache-tree. As an exercise, after modifying a few paths in the kernel tree, here are a few numbers on my Athlon 64X2 3800+: (without patch, hot cache) $ /usr/bin/time git diff --cached --raw :100644 100644 b57e1f5... e69de29... M Makefile :100644 000000 8c86b72... 0000000... D arch/x86/Makefile :000000 100644 0000000... e69de29... A arche 0.07user 0.02system 0:00.09elapsed 102%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+9407minor)pagefaults 0swaps (with patch, hot cache) $ /usr/bin/time ../git.git/git-diff --cached --raw :100644 100644 b57e1f5... e69de29... M Makefile :100644 000000 8c86b72... 0000000... D arch/x86/Makefile :000000 100644 0000000... e69de29... A arche 0.02user 0.00system 0:00.02elapsed 103%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+2446minor)pagefaults 0swaps Cold cache numbers are very impressive, but it does not matter very much in practice: (without patch, cold cache) $ su root sh -c 'echo 3 >/proc/sys/vm/drop_caches' $ /usr/bin/time git diff --cached --raw :100644 100644 b57e1f5... e69de29... M Makefile :100644 000000 8c86b72... 0000000... D arch/x86/Makefile :000000 100644 0000000... e69de29... A arche 0.06user 0.17system 0:10.26elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k 247032inputs+0outputs (1172major+8237minor)pagefaults 0swaps (with patch, cold cache) $ su root sh -c 'echo 3 >/proc/sys/vm/drop_caches' $ /usr/bin/time ../git.git/git-diff --cached --raw :100644 100644 b57e1f5... e69de29... M Makefile :100644 000000 8c86b72... 0000000... D arch/x86/Makefile :000000 100644 0000000... e69de29... A arche 0.02user 0.01system 0:01.01elapsed 3%CPU (0avgtext+0avgdata 0maxresident)k 18440inputs+0outputs (79major+2369minor)pagefaults 0swaps This of course helps "git status" as well. (without patch, hot cache) $ /usr/bin/time ../git.git/git-status >/dev/null 0.17user 0.18system 0:00.35elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+5336outputs (0major+10970minor)pagefaults 0swaps (with patch, hot cache) $ /usr/bin/time ../git.git/git-status >/dev/null 0.10user 0.16system 0:00.27elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+5336outputs (0major+3921minor)pagefaults 0swaps Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Linus Torvalds	658dd48c85	Avoid unnecessary 'lstat()' calls in 'get_stat_data()' When we ask get_stat_data() to get the mode and size of an index entry, we can avoid the lstat() call if we have marked the index entry as being uptodate due to earlier lstat() calls. This avoids a lot of unnecessary lstat() calls in eg 'git checkout', where the last phase shows the differences to the working tree (requiring a diff), but earlier phases have already verified the index. On the kernel repo (with a fast machine and everything cached), this changes timings of a nul 'git checkout' from - Before (best of ten): 0.14user 0.05system 0:00.19elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+13237minor)pagefaults 0swaps - After 0.11user 0.03system 0:00.15elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+13235minor)pagefaults 0swaps so it can obviously be noticeable, although equally obviously it's not a show-stopper on this particular machine. The difference is likely larger on slower machines, or with operating systems that don't do as good a job of name caching. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Stephan Beyer	75f3ff2eea	Generalize and libify index_is_dirty() to index_differs_from(...) index_is_dirty() in builtin-revert.c checks if the index is dirty. This patch generalizes this function to check if the index differs from a revision, i.e. the former index_is_dirty() behavior can now be achieved by index_differs_from("HEAD", 0). The second argument "diff_flags" allows to set further diff option flags like DIFF_OPT_IGNORE_SUBMODULES. See DIFF_OPT_* macros in diff.h for a list. index_differs_from() seems to be useful for more than builtin-revert.c, so it is moved into diff-lib.c and also used in builtin-commit.c. Yet to mention: - "rev.abbrev = 0;" can be safely removed. This has no impact on performance or functioning of neither setup_revisions() nor run_diff_index(). - rev.pending.objects is free()d because this fixes a leak. (Also see `295dd2ad` "Fix memory leak in traverse_commit_list") Mentored-by: Daniel Barkalow <barkalow@iabervon.org> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Stephan Beyer <s-beyer@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Kjetil Barvik	571998921d	lstat_cache(): swap func(length, string) into func(string, length) Swap function argument pair (length, string) into (string, length) to conform with the commonly used order inside the GIT source code. Also, add a note about this fact into the coding guidelines. Signed-off-by: Kjetil Barvik <barvik@broadpark.no> Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Kjetil Barvik	ff7e6aad6d	Cleanup of unused symcache variable inside diff-lib.c Commit `c40641b77b`, 'Optimize symlink/directory detection' by Linus Torvalds, removed the 'char symcache' parameter to the has_symlink_leading_path() function. This made all variables currently named 'symcache' inside diff-lib.c unnecessary. This also let us throw away the 'struct oneway_unpack_data', and instead directly use the 'struct rev_info revs' member, which was the only member left after removal of the 'symcache[] array' member. The 'struct oneway_unpack_data' was introduced by the following commit: `948dd346` "diff-files: careful when inspecting work tree items" Impact: cleanup PATH_MAX bytes less memory stack usage in some cases Signed-off-by: Kjetil Barvik <barvik@broadpark.no> Signed-off-by: Junio C Hamano <gitster@pobox.com>	16 years ago
Junio C Hamano	a5a818ee48	diff: vary default prefix depending on what are compared With a new configuration "diff.mnemonicprefix", "git diff" shows the differences between various combinations of preimage and postimage trees with prefixes different from the standard "a/" and "b/". Hopefully this will make the distinction stand out for some people. "git diff" compares the (i)ndex and the (w)ork tree; "git diff HEAD" compares a (c)ommit and the (w)ork tree; "git diff --cached" compares a (c)ommit and the (i)ndex; "git-diff HEAD:file1 file2" compares an (o)bject and a (w)ork tree entity; "git diff --no-index a b" compares two non-git things (1) and (2). Because these mnemonics now have meanings, they are swapped when reverse diff is in effect and this feature is enabled. Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Dmitry Potapov	fd55a19eb1	Fix buffer overflow in git diff If PATH_MAX on your system is smaller than a path stored, it may cause buffer overflow and stack corruption in diff_addremove() and diff_change() functions when running git-diff Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Junio C Hamano	0569e9b8ce	"git diff": do not ignore index without --no-index Even if "foo" and/or "bar" does not exist in index, "git diff foo bar" should not change behaviour drastically from "git diff foo bar baz" or "git diff foo". A feature that "sometimes works and is handy" is an unreliable cute hack. "git diff foo bar" outside a git repository continues to work as a more colourful alternative to "diff -u" as before. Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Linus Torvalds	c40641b77b	Optimize symlink/directory detection This is the base for making symlink detection in the middle fo a pathname saner and (much) more efficient. Under various loads, we want to verify that the full path leading up to a filename is a real directory tree, and that when we successfully do an 'lstat()' on a filename, we don't get a false positive due to a symlink in the middle of the path that git should have seen as a symlink, not as a normal path component. The 'has_symlink_leading_path()' function already did this, and cached a single level of symlink information, but didn't cache the _lack_ of a symlink, so the normal behaviour was actually the wrong way around, and we ended up doing an 'lstat()' on each path component to check that it was a real directory. This caches the last detected full directory and symlink entries, and speeds up especially deep directory structures a lot by avoiding to lstat() all the directories leading up to each entry in the index. [ This can - and should - probably be extended upon so that we eventually never do a bare 'lstat()' on any path entries at all when checking the index, but always check the full path carefully. Right now we do not generally check the whole path for all our normal quick index revalidation. We should also make sure that we're careful about all the invalidation, ie when we remove a link and replace it by a directory we should invalidate the symlink cache if it matches (and vice versa for the directory cache). But regardless, the basic function needs to be sane to do that. The old 'has_symlink_leading_path()' was not capable enough - or indeed the code readable enough - to really do that sanely. So I'm pushing this as not just an optimization, but as a base for further work. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Junio C Hamano	451244d724	diff-lib.c: rename check_work_tree_entity() The function is about checking for removed work tree item, so name it accordingly to avoid future confusion. Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Junio C Hamano	1392a37721	diff: a submodule not checked out is not modified `948dd34` (diff-index: careful when inspecting work tree items, 2008-03-30) made the work tree check careful not to be fooled by a new directory that exists at a place the index expects a blob. For such a change to be a typechange from blob to submodule, the new directory has to be a repository. However, if the index expects a submodule there, we should not insist the work tree entity to be a repository --- a simple directory that is not a full fledged repository (even an empty directory would do) should be considered an unmodified subproject, because that is how a superproject with a submodule is checked out sparsely by default. This makes the function check_work_tree_entity() even more careful not to report a submodule that is not checked out as removed. It fixes the recently added test in t4027. Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Matthieu Moy	59b0c24daa	git-svn: detect and fail gracefully when dcommitting to a void The command git svn clone (URL of an empty SVN repo here) works, creates an empty git repository. I can perform the initial commit there, but then, "git svn dcommit" says : Use of uninitialized value in concatenation (.) or string at .../git-svn line 414. Committing to ... Unable to determine upstream SVN information from HEAD history I guess a correct management of the initial commit in git-svn would be hard to implement, but at least, the error message can be improved. First step is something like the patch below, and better would be for "git svn clone" to warn that it won't be able to do much with the cloned repo. Acked-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Junio C Hamano	8fa29602d4	diff-files: mark an index entry we know is up-to-date as such This does not make any difference when running diff-files alone, but if you internally run run_diff_files() and then run other operations further on the index, we do not have to run lstat(2) again on entries we already have checked. Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Junio C Hamano	f58dbf23c3	diff-files: careful when inspecting work tree items This fixes the same breakage in diff-files. Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Junio C Hamano	948dd346fd	diff-index: careful when inspecting work tree items Earlier, if you changed a staged path into a directory in the work tree, we happily ran lstat(2) on it and found that it exists, and declared that the user changed it to a gitlink. This is wrong for two reasons: (1) It may be a directory, but it may not be a submodule, and in the latter case, the change we need to report is "the blob at the path has disappeared". We need to check with resolve_gitlink_ref() to be consistent with what "git add" and "git update-index --add" does. (2) lstat(2) may have succeeded only because a leading component of the path was turned into a symbolic link that points at something that exists in the work tree. In such a case, the path itself does not exist anymore, as far as the index is concerned. This fixes these breakages in diff-index that the previous patch has exposed. Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Linus Torvalds	20a16eb33e	unpack_trees(): fix diff-index regression. When skip_unmerged option is not given, unpack_trees() should not just skip unmerged cache entries but keep them in the result for the caller to sort them out. For callers other than diff-index, the incoming index should never be unmerged, but diff-index is a special case caller. Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Linus Torvalds	34110cd4e3	Make 'unpack_trees()' have a separate source and destination index We will always unpack into our own internal index, but we will take the source from wherever specified, and we will optionally write the result to a specified index (optionally, because not everybody even _wants_ any result: the index diffing really wants to just walk the tree and index in parallel). This ends up removing a fair number more lines than it adds, for the simple reason that we can now skip all the crud that tried to be oh-so-careful about maintaining our position in the index as we were traversing and modifying it. Since we don't actually modify the source index any more, we can just update the 'o->pos' pointer without worrying about whether an index entry got removed or replaced or added to. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Linus Torvalds	bc052d7f43	Make 'unpack_trees()' take the index to work on as an argument This is just a very mechanical conversion, and makes everybody set it to '&the_index' before calling, but at least it makes it more explicit where we work with the index. The next stage would be to split that index usage up into a 'source' and a 'destination' index, so that we can unpack into a different index than we started out from. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Junio C Hamano	c8c16f2865	diff-lib.c: constness strengthening The internal implementation of diff-index codepath used to use non const pointer to pass sha1 around, but it did not have to. With this, we can also lose the private no_sha1[] array, as we can use the public null_sha1[] array that exists exactly for the same purpose. Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Daniel Barkalow	203a2fe117	Allow callers of unpack_trees() to handle failure Return an error from unpack_trees() instead of calling die(), and exit with an error in read-tree, builtin-commit, and diff-lib. merge-recursive already expected an error return from unpack_trees, so it doesn't need to be changed. The merge function can return negative to abort. This will be used in builtin-checkout -m. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>	17 years ago
Johannes Schindelin	204ce979a5	Also use unpack_trees() in do_diff_cache() As in run_diff_index(), we call unpack_trees() with the oneway_diff() function in do_diff_cache() now. This makes the function diff_cache() obsolete. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	17 years ago
Linus Torvalds	d1f2d7e8ca	Make run_diff_index() use unpack_trees(), not read_tree() A plain "git commit" would still run lstat() a lot more than necessary, because wt_status_print() would cause the index to be repeatedly flushed and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit to be lost, resulting in the files in the index being lstat'ed three times each. The reason why wt-status.c ended up invalidating and re-reading the cache multiple times was that it uses "run_diff_index()", which in turn uses "read_tree()" to populate the index with both the old index and the tree we want to compare against. So this patch re-writes run_diff_index() to not use read_tree(), but instead use "unpack_trees()" to diff the index to a tree. That, in turn, means that we don't need to modify the index itself, which then means that we don't need to invalidate it and re-read it! This, together with the lstat() optimizations, means that "git commit" on the kernel tree really only needs to lstat() the index entries once. That noticeably cuts down on the cached timings. Best time before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.399s user 0m0.232s sys 0m0.164s Best time after: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.254s user 0m0.140s sys 0m0.112s so it's a noticeable improvement in addition to being a nice conceptual cleanup (it's really not that pretty that "run_diff_index()" dirties the index!) Doing an "strace -c" on it also shows that as it cuts the number of lstat() calls by two thirds, it goes from being lstat()-limited to being limited by getdents() (which is the readdir system call): Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 60.69 0.000704 0 69230 31 lstat 23.62 0.000274 0 5522 getdents 8.36 0.000097 0 5508 2638 open 2.59 0.000030 0 2869 close 2.50 0.000029 0 274 write 1.47 0.000017 0 2844 fstat After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 45.17 0.000276 0 5522 getdents 26.51 0.000162 0 23112 31 lstat 19.80 0.000121 0 5503 2638 open 4.91 0.000030 0 2864 close 1.48 0.000020 0 274 write 1.34 0.000018 0 2844 fstat ... It passes the test-suite for me, but this is another of one of those really core functions, and certainly pretty subtle, so.. NOTE! The Linux lstat() system call is really quite cheap when everything is cached, so the fact that this is quite noticeable on Linux is likely to mean that it is much more noticeable on other operating systems. I bet you'll see a much bigger performance improvement from this on Windows in particular. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	17 years ago
Linus Torvalds	7a51ed66f6	Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	17 years ago
Steffen Prohaska	ecf4831d89	Use is_absolute_path() in diff-lib.c, lockfile.c, setup.c, trace.c Using the helper function to test for absolute paths makes porting easier. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Pierre Habouzit	8f67f8aefb	Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Junio C Hamano	fb63d7f889	git-add: make the entry stat-clean after re-adding the same contents Earlier in commit `0781b8a9b2` (add_file_to_index: skip rehashing if the cached stat already matches), add_file_to_index() were taught not to re-add the path if it already matches the index. The change meant well, but was not executed quite right. It used ie_modified() to see if the file on the work tree is really different from the index, and skipped adding the contents if the function says "not modified". This was wrong. There are three possible comparison results between the index and the file in the work tree: - with lstat(2) we _know_ they are different. E.g. if the length or the owner in the cached stat information is different from the length we just obtained from lstat(2), we can tell the file is modified without looking at the actual contents. - with lstat(2) we _know_ they are the same. The same length, the same owner, the same everything (but this has a twist, as described below). - we cannot tell from lstat(2) information alone and need to go to the filesystem to actually compare. The last case arises from what we call 'racy git' situation, that can be caused with this sequence: $ echo hello >file $ git add file $ echo aeiou >file ;# the same length If the second "echo" is done within the same filesystem timestamp granularity as the first "echo", then the timestamp recorded by "git add" and the timestamp we get from lstat(2) will be the same, and we can mistakenly say the file is not modified. The path is called 'racily clean'. We need to reliably detect racily clean paths are in fact modified. To solve this problem, when we write out the index, we mark the index entry that has the same timestamp as the index file itself (that is the time from the point of view of the filesystem) to tell any later code that does the lstat(2) comparison not to trust the cached stat info, and ie_modified() then actually goes to the filesystem to compare the contents for such a path. That's all good, but it should not be used for this "git add" optimization, as the goal of "git add" is to actually update the path in the index and make it stat-clean. With the false optimization, we did _not_ cause any data loss (after all, what we failed to do was only to update the cached stat information), but it made the following sequence leave the file stat dirty: $ echo hello >file $ git add file $ echo hello >file ;# the same contents $ git add file The solution is not to use ie_modified() which goes to the filesystem to see if it is really clean, but instead use ie_match_stat() with "assume racily clean paths are dirty" option, to force re-adding of such a path. There was another problem with "git add -u". The codepath shares the same issue when adding the paths that are found to be modified, but in addition, it asked "git diff-files" machinery run_diff_files() function (which is "git diff-files") to list the paths that are modified. But "git diff-files" machinery uses the same ie_modified() call so that it does not report racily clean _and_ actually clean paths as modified, which is not what we want. The patch allows the callers of run_diff_files() to pass the same "assume racily clean paths are dirty" option, and makes "git-add -u" codepath to use that option, to discover and re-add racily clean _and_ actually clean paths. We could further optimize on top of this patch to differentiate the case where the path really needs re-adding (i.e. the content of the racily clean entry was indeed different) and the case where only the cached stat information needs to be refreshed (i.e. the racily clean entry was actually clean), but I do not think it is worth it. This patch applies to maint and all the way up. Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Junio C Hamano	4bd5b7dacc	ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com>	17 years ago
Junio C Hamano	b78281f721	diff --no-index: do not forget to run diff_setup_done() Code inspection by Linus found this. Signed-off-by: Junio C Hamano <gitster@pobox.com>	18 years ago
René Scharfe	6d2d9e8666	diff: squelch empty diffs even more When we compare two non-tracked files, or explicitly specify --no-index, the suggestion to run git-status is not helpful. The patch adds a new diff_options bitfield member, no_index, that is used instead of the special value of -2 of the rev_info field max_count to indicate that the index is not to be used. This makes it possible to pass that flag down to diffcore_skip_stat_unmatch(), which only has one diff_options parameter. This could even become a cleanup if we removed all assignments of max_count to a value of -2 (viz. replacement of a magic value with a self-documenting field name) but I didn't dare to do that so late in the rc game.. The no_index bit, if set, then tells diffcore_skip_stat_unmatch() to not account for any skipped stat-mismatches, which avoids the suggestion to run git-status. Signed-off-by: Rene Scharfe <rene.scharfe@lsfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>	18 years ago
René Scharfe	4d3f4b80e4	diff-lib.c: don't strdup twice The static function read_directory in diff-lib.c is only ever called with struct path_list lists with .strdup_paths turned on, i.e. path_list_insert will strdup the paths for us (again). Let's take advantage of that and stop doing it twice. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>	18 years ago
Junio C Hamano	a6080a0a44	War on whitespace This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com>	18 years ago

1 2

97 Commits (67687feae52ff2d3ab7edf3ac42a39c5e44be299)