Jeff King 2548183bad fix phantom untracked files when core.ignorecase is set
When core.ignorecase is turned on and there are stale index
entries, "git commit" can sometimes report directories as
untracked, even though they contain tracked files.

You can see an example of this with:

    # make a case-insensitive repo
    git init repo && cd repo &&
    git config core.ignorecase true &&

    # with some tracked files in a subdir
    mkdir subdir &&
    > subdir/one &&
    > subdir/two &&
    git add . &&
    git commit -m base &&

    # now make the index entries stale
    touch subdir/* &&

    # and then ask commit to update those entries and show
    # us the status template
    git commit -a

which will report "subdir/" as untracked, even though it
clearly contains two tracked files. What is happening in the
commit program is this:

  1. We load the index, and for each entry, insert it into the index's
     name_hash. In addition, if ignorecase is turned on, we make an
     entry in the name_hash for the directory (e.g., "contrib/"), which
     uses the following code from 5102c61's hash_index_entry_directories:

        hash = hash_name(ce->name, ptr - ce->name);
        if (!lookup_hash(hash, &istate->name_hash)) {
                pos = insert_hash(hash, &istate->name_hash);
                if (pos) {
                        ce->next = *pos;
                        *pos = ce;
                }
        }

     Note that we only add the directory entry if there is not already an
     entry.

  2. We run add_files_to_cache, which gets updated information for each
     cache entry. It helpfully inserts this information into the cache,
     which calls replace_index_entry. This in turn calls
     remove_name_hash() on the old entry, and add_name_hash() on the new
     one. But remove_name_hash doesn't actually remove from the hash, it
     only marks it as "no longer interesting" (from cache.h):

      /*
       * We don't actually *remove* it, we can just mark it invalid so that
       * we won't find it in lookups.
       *
       * Not only would we have to search the lists (simple enough), but
       * we'd also have to rehash other hash buckets in case this makes the
       * hash bucket empty (common). So it's much better to just mark
       * it.
       */
      static inline void remove_name_hash(struct cache_entry *ce)
      {
              ce->ce_flags |= CE_UNHASHED;
      }

     This is OK in the specific-file case, since the entries in the hash
     form a linked list, and we can just skip the "not here anymore"
     entries during lookup.

     But for the directory hash entry, we will _not_ write a new entry,
     because there is already one there: the old one that is actually no
     longer interesting!

  3. While traversing the directories, we end up in the
     directory_exists_in_index_icase function to see if a directory is
     interesting. This in turn checks index_name_exists, which will
     look up the directory in the index's name_hash. We see the old,
     deleted record, and assume there is nothing interesting. The
     directory gets marked as untracked, even though there are index
     entries in it.

The problem is in the code I showed above:

        hash = hash_name(ce->name, ptr - ce->name);
        if (!lookup_hash(hash, &istate->name_hash)) {
                pos = insert_hash(hash, &istate->name_hash);
                if (pos) {
                        ce->next = *pos;
                        *pos = ce;
                }
        }

Having a single cache entry that represents the directory is
not enough; that entry may go away if the index is changed.
It may be tempting to say that the problem is in our removal
method; if we removed the entry entirely instead of simply
marking it as "not here anymore", then we would know we need
to insert a new entry. But that only covers this particular
case of remove-replace. In the more general case, consider
something like this:

  1. We add "foo/bar" and "foo/baz" to the index. Each gets
     its own entry in name_hash, plus we make a "foo/"
     entry that points to "foo/bar".

  2. We remove the "foo/bar" entry from the index, and from
     the name_hash.

  3. We ask if "foo/" exists, and see no entry, even though
     "foo/baz" exists.

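The three steps above can be reproduced in a toy model. This is only
a sketch, not git's code: the toy_* names and the one-slot "bucket"
are made up for illustration, and TOY_UNHASHED merely stands in for
CE_UNHASHED. It shows that when the directory remembers a single
representative entry, removing that entry hides the directory even
though another tracked file is still in it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_UNHASHED 1	/* stand-in for CE_UNHASHED (assumption) */

/* Hypothetical, stripped-down stand-in for struct cache_entry. */
struct toy_entry {
	const char *name;
	unsigned flags;
};

/* Toy directory bucket: it only ever holds one representative entry. */
struct toy_dir_bucket {
	struct toy_entry *rep;
};

/* Mirrors the "only insert if nothing is there yet" check. */
static void toy_add_dir(struct toy_dir_bucket *b, struct toy_entry *ce)
{
	if (!b->rep)
		b->rep = ce;
}

/* Lazy removal: mark the entry instead of unlinking it. */
static void toy_remove(struct toy_entry *ce)
{
	ce->flags |= TOY_UNHASHED;
}

/* Lookup that ignores entries marked as removed. */
static int toy_dir_exists(const struct toy_dir_bucket *b)
{
	return b->rep && !(b->rep->flags & TOY_UNHASHED);
}

/* Returns 1 if "foo/" is wrongly reported as missing. */
static int toy_show_bug(void)
{
	struct toy_entry bar = { "foo/bar", 0 };
	struct toy_entry baz = { "foo/baz", 0 };
	struct toy_dir_bucket foo = { NULL };

	toy_add_dir(&foo, &bar);	/* "foo/" now points at foo/bar */
	toy_add_dir(&foo, &baz);	/* ignored: bucket already occupied */
	assert(toy_dir_exists(&foo));

	toy_remove(&bar);		/* foo/bar goes away, foo/baz stays */
	return !toy_dir_exists(&foo);
}
```

Calling toy_show_bug() from a test program returns 1: the directory
looks untracked although "foo/baz" is still present.
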
So we need that directory entry to have the list of _all_
cache entries that indicate that the directory is tracked.
So that implies making a linked list as we do for other
entries, like:

  hash = hash_name(ce->name, ptr - ce->name);
  pos = insert_hash(hash, &istate->name_hash);
  if (pos) {
          ce->next = *pos;
          *pos = ce;
  }

But that's not right either. In fact, it shows a second bug
in the current code, which is that the "ce->next" pointer is
supposed to be linking entries for a specific filename
entry, but here we are overwriting it for the directory
entry. So the same cache entry ends up in two linked lists,
but they share the same "next" pointer.

As it turns out, this second bug can't be triggered in the
current code. The "if (pos)" conditional is totally dead
code; pos will only be non-NULL if there was an existing
hash entry, and we already checked that there wasn't one
through our call to lookup_hash.

But fixing the first bug means taking out that call to
lookup_hash, which is going to activate the buggy dead code,
and we'll end up splicing the two linked lists together.

So we need to have a separate next pointer for the list in
the directory bucket, and we need to traverse that list in
index_name_exists when we are looking up a directory.
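
A minimal sketch of that idea, under the same toy assumptions as
before (the toy_* names are hypothetical and the bucket is reduced
to a list head; this is not the actual patch): each entry carries a
second pointer used only for the directory chain, and the directory
lookup walks that chain, skipping entries marked as removed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_UNHASHED 1	/* stand-in for CE_UNHASHED (assumption) */

/* Hypothetical stand-in for struct cache_entry with two chains. */
struct toy_entry {
	const char *name;
	unsigned flags;
	struct toy_entry *next;		/* chain in the file's own bucket */
	struct toy_entry *dir_next;	/* separate chain in the directory bucket */
};

struct toy_dir_bucket {
	struct toy_entry *head;
};

/* Always link the entry into the directory chain; no existence check. */
static void toy_add_dir(struct toy_dir_bucket *b, struct toy_entry *ce)
{
	ce->dir_next = b->head;
	b->head = ce;
}

static void toy_remove(struct toy_entry *ce)
{
	ce->flags |= TOY_UNHASHED;
}

/* Walk the directory chain and return the first live entry, if any. */
static struct toy_entry *toy_dir_lookup(const struct toy_dir_bucket *b)
{
	struct toy_entry *ce;

	for (ce = b->head; ce; ce = ce->dir_next)
		if (!(ce->flags & TOY_UNHASHED))
			return ce;
	return NULL;
}

/* Returns 1 if the directory stays visible until its last entry is gone. */
static int toy_check_fix(void)
{
	struct toy_entry bar = { "foo/bar", 0, NULL, NULL };
	struct toy_entry baz = { "foo/baz", 0, NULL, NULL };
	struct toy_dir_bucket foo = { NULL };

	toy_add_dir(&foo, &baz);
	toy_add_dir(&foo, &bar);	/* chain is now bar -> baz */

	toy_remove(&bar);		/* head of the chain is stale */
	if (toy_dir_lookup(&foo) != &baz)
		return 0;

	toy_remove(&baz);		/* no live entries remain */
	return toy_dir_lookup(&foo) == NULL;
}
```

Because the directory chain never touches "next", the per-file chain
is left alone and the two lists cannot be spliced together.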

This bloats "struct cache_entry" by a few bytes, which is
annoying because it's only necessary when core.ignorecase
is enabled. There's not an easy way around it, short of
separating out the "next" pointers from cache_entry entirely
(i.e., having a separate "cache_entry_list" struct that gets
stored in the name_hash). In practice, it probably doesn't
matter; we have thousands of cache entries, compared to the
millions of objects (where adding 4 bytes to the struct
actually does impact performance).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-07 17:54:04 -07:00
Documentation Git 1.7.6 2011-06-26 12:41:16 -07:00
block-sha1
builtin plug a few coverity-spotted leaks 2011-06-20 14:27:36 -07:00
compat compat/fnmatch/fnmatch.c: give a fall-back definition for NULL 2011-05-26 09:25:47 -07:00
contrib Merge branch 'maint' 2011-06-26 12:09:11 -07:00
git-gui Merge git-gui 0.14.0 2011-03-26 10:42:35 -07:00
git_remote_helpers
gitk-git Merge git://git.kernel.org/pub/scm/gitk/gitk 2011-04-11 09:33:06 -07:00
gitweb Merge branch 'maint' 2011-06-21 14:56:59 -07:00
perl
po
ppc
t Merge branch 'mk/grep-pcre' 2011-06-20 14:49:44 -07:00
templates
vcs-svn Merge branch 'rj/sparse' 2011-04-27 11:36:42 -07:00
xdiff
.gitattributes
.gitignore Merge branch 'jn/gitweb-js' 2011-05-26 10:31:57 -07:00
.mailmap
COPYING
GIT-VERSION-GEN Git 1.7.6 2011-06-26 12:41:16 -07:00
INSTALL
LGPL-2.1 provide a copy of the LGPLv2.1 2011-05-19 18:23:17 -07:00
Makefile Merge branch 'mk/grep-pcre' 2011-05-30 00:00:07 -07:00
README
RelNotes Start 1.7.5.4 draft release notes 2011-05-31 12:06:40 -07:00
abspath.c Name make_*_path functions more accurately 2011-03-17 16:08:30 -07:00
aclocal.m4
advice.c
advice.h
alias.c
alloc.c unbreak and eliminate NO_C99_FORMAT 2011-03-17 15:30:49 -07:00
archive-tar.c
archive-zip.c
archive.c Convert read_tree{,_recursive} to support struct pathspec 2011-03-25 09:20:33 -07:00
archive.h
attr.c sparse: Fix some "symbol not declared" warnings 2011-04-22 10:04:27 -07:00
attr.h
base85.c
bisect.c bisect: refactor sha1_array into a generic sha1 list 2011-05-19 20:02:10 -07:00
bisect.h
blob.c
blob.h
branch.c
branch.h
builtin.h
bundle.c
bundle.h
cache-tree.c
cache-tree.h
cache.h fix phantom untracked files when core.ignorecase is set 2011-10-07 17:54:04 -07:00
check-builtins.sh
check-racy.c
check_bindir
color.c Share color list between graph and show-branch 2011-04-04 23:20:39 -07:00
color.h Share color list between graph and show-branch 2011-04-04 23:20:39 -07:00
combine-diff.c
command-list.txt
commit.c
commit.h Merge branch 'jk/format-patch-am' 2011-05-31 12:19:11 -07:00
config.c Merge branch 'jk/maint-config-alias-fix' into maint 2011-06-01 14:05:22 -07:00
config.mak.in Merge branch 'kk/maint-prefix-in-config-mak' into maint 2011-06-01 14:02:39 -07:00
configure.ac configure: Check for libpcre 2011-05-09 16:29:46 -07:00
connect.c Merge branch 'jk/git-connection-deadlock-fix' into maint-1.7.4 2011-05-26 10:28:10 -07:00
convert.c convert: make it harder to screw up adding a conversion attribute 2011-05-09 14:59:09 -07:00
copy.c
csum-file.c sparse: Fix errors and silence warnings 2011-04-03 10:14:53 -07:00
csum-file.h
ctype.c magic pathspec: futureproof shorthand form 2011-04-08 16:19:48 -07:00
daemon.c Fix sparse warnings 2011-03-22 10:16:54 -07:00
date.c date: avoid "X years, 12 months" in relative dates 2011-04-20 19:23:16 -07:00
decorate.c
decorate.h
delta.h
diff-delta.c
diff-lib.c Merge branch 'jk/diff-not-so-quick' 2011-06-06 11:40:14 -07:00
diff-no-index.c
diff.c Merge branch 'jk/diff-not-so-quick' 2011-06-06 11:40:14 -07:00
diff.h Merge branch 'jk/diff-not-so-quick' 2011-06-06 11:40:14 -07:00
diffcore-break.c
diffcore-delta.c
diffcore-order.c
diffcore-pickaxe.c
diffcore-rename.c diffcore-rename.c: avoid set-but-not-used warning 2011-06-01 13:54:17 -07:00
diffcore.h
dir.c Merge branch 'nd/struct-pathspec' 2011-05-06 10:50:06 -07:00
dir.h Merge branch 'nd/maint-setup' 2011-05-02 15:58:30 -07:00
editor.c
entry.c
environment.c Merge branch 'jc/replacing' 2011-05-19 20:37:21 -07:00
exec_cmd.c Name make_*_path functions more accurately 2011-03-17 16:08:30 -07:00
exec_cmd.h
fast-import.c fast-import: fix option parser for no-arg options 2011-05-05 21:21:24 -07:00
fetch-pack.h standardize brace placement in struct definitions 2011-03-16 12:49:02 -07:00
fixup-builtins
fsck.c Merge branch 'jm/maint-misc-fix' into maint 2011-05-30 00:09:41 -07:00
fsck.h
generate-cmdlist.sh standardize brace placement in struct definitions 2011-03-16 12:49:02 -07:00
gettext.c
gettext.h i18n: avoid parenthesized string as array initializer 2011-04-11 10:33:51 -07:00
git-add--interactive.perl add -i: ignore terminal escape sequences 2011-05-17 20:44:17 -07:00
git-am.sh
git-archimport.perl
git-bisect.sh bisect: visualize with git-log if gitk is unavailable 2011-03-21 10:23:45 -07:00
git-compat-util.h Merge branch 'jc/magic-pathspec' 2011-05-23 09:58:35 -07:00
git-cvsexportcommit.perl
git-cvsimport.perl Merge branch 'gr/cvsimport-alternative-cvspass-location' into maint 2011-05-13 10:44:54 -07:00
git-cvsserver.perl
git-difftool--helper.sh
git-difftool.perl
git-filter-branch.sh
git-instaweb.sh
git-lost-found.sh
git-merge-octopus.sh
git-merge-one-file.sh Merge branch 'jk/merge-one-file-working-tree' into maint 2011-05-13 10:44:19 -07:00
git-merge-resolve.sh
git-mergetool--lib.sh Pass empty file to p4merge where no base is suitable. 2011-05-01 15:56:05 -07:00
git-mergetool.sh mergetool: Teach about submodules 2011-04-13 12:21:45 -07:00
git-parse-remote.sh Merge branch 'mz/rebase' 2011-04-28 14:11:39 -07:00
git-pull.sh Merge branch 'mz/rebase' 2011-04-28 14:11:39 -07:00
git-quiltimport.sh
git-rebase--am.sh
git-rebase--interactive.sh rebase: write a reflog entry when finishing 2011-05-27 15:52:03 -07:00
git-rebase--merge.sh
git-rebase.sh rebase: write a reflog entry when finishing 2011-05-27 15:52:03 -07:00
git-relink.perl
git-remote-testgit.py
git-repack.sh
git-request-pull.sh
git-send-email.perl git-send-email: fix missing space in error message 2011-04-29 11:34:32 -07:00
git-sh-i18n.sh git-sh-i18n.sh: add GIT_GETTEXT_POISON support 2011-05-14 20:29:11 -07:00
git-sh-setup.sh require-work-tree wants more than what its name says 2011-05-24 11:34:40 -07:00
git-stash.sh Merge branch 'jk/maint-stash-oob' into maint 2011-05-04 14:58:42 -07:00
git-submodule.sh Merge branch 'maint' 2011-05-30 00:09:55 -07:00
git-svn.perl Merge branch 'maint' 2011-05-20 18:50:29 -07:00
git-web--browse.sh
git.c Merge branch 'jk/maint-config-alias-fix' into maint 2011-06-01 14:05:22 -07:00
git.spec.in
graph.c Share color list between graph and show-branch 2011-04-04 23:20:39 -07:00
graph.h
grep.c git-grep: Learn PCRE 2011-05-09 16:29:33 -07:00
grep.h git-grep: Learn PCRE 2011-05-09 16:29:33 -07:00
hash.c
hash.h
help.c
help.h
hex.c
http-backend.c
http-fetch.c Fix two unused variable warnings in gcc 4.6 2011-04-03 10:59:40 -07:00
http-push.c http-push: refactor curl_easy_setup madness 2011-05-04 13:30:28 -07:00
http-walker.c http: make curl callbacks match contracts from curl header 2011-05-04 13:30:28 -07:00
http.c Merge branch 'sp/maint-clear-postfields' into maint 2011-05-04 14:58:56 -07:00
http.h http: make curl callbacks match contracts from curl header 2011-05-04 13:30:28 -07:00
ident.c Merge branch 'rg/no-gecos-in-pwent' 2011-05-26 10:32:19 -07:00
imap-send.c sparse: Fix some "Using plain integer as NULL pointer" warnings 2011-04-11 10:35:25 -07:00
levenshtein.c
levenshtein.h
list-objects.c Merge branch 'nd/struct-pathspec' 2011-05-06 10:50:06 -07:00
list-objects.h
ll-merge.c
ll-merge.h
lockfile.c Name make_*_path functions more accurately 2011-03-17 16:08:30 -07:00
log-tree.c Merge branch 'jk/format-patch-am' 2011-05-31 12:19:11 -07:00
log-tree.h
mailmap.c
mailmap.h
match-trees.c
merge-file.c sparse: Fix an "symbol 'merge_file' not decared" warning 2011-04-11 10:35:25 -07:00
merge-file.h sparse: Fix an "symbol 'merge_file' not decared" warning 2011-04-11 10:35:25 -07:00
merge-recursive.c Merge branch 'jc/rename-degrade-cc-to-c' into maint 2011-05-31 12:00:02 -07:00
merge-recursive.h Merge branch 'jk/merge-rename-ux' 2011-03-19 23:23:56 -07:00
name-hash.c fix phantom untracked files when core.ignorecase is set 2011-10-07 17:54:04 -07:00
notes-cache.c
notes-cache.h
notes-merge.c index_fd(): turn write_object and format_check arguments into one flag 2011-05-09 11:58:19 -07:00
notes-merge.h
notes.c notes: refactor display notes default handling 2011-03-29 14:31:59 -07:00
notes.h notes: refactor display notes default handling 2011-03-29 14:31:59 -07:00
object.c read_sha1_file(): get rid of read_sha1_file_repl() madness 2011-05-15 15:23:33 -07:00
object.h
pack-check.c sparse: Fix errors and silence warnings 2011-04-03 10:14:53 -07:00
pack-refs.c
pack-refs.h
pack-revindex.c
pack-revindex.h
pack-write.c
pack.h
pager.c
parse-options.c Fix sparse warnings 2011-03-22 10:16:54 -07:00
parse-options.h
patch-delta.c
patch-ids.c
patch-ids.h
path.c Name make_*_path functions more accurately 2011-03-17 16:08:30 -07:00
pkt-line.c sparse: Fix errors and silence warnings 2011-04-03 10:14:53 -07:00
pkt-line.h
preload-index.c
pretty.c Merge branch 'jk/format-patch-am' 2011-05-31 12:19:11 -07:00
progress.c
progress.h
quote.c
quote.h
reachable.c Remove unused variables 2011-03-22 11:43:27 -07:00
reachable.h
read-cache.c index_fd(): turn write_object and format_check arguments into one flag 2011-05-09 11:58:19 -07:00
reflog-walk.c
reflog-walk.h
refs.c Fix typo: existant->existent 2011-06-16 10:33:50 -07:00
refs.h
remote-curl.c plug a few coverity-spotted leaks 2011-06-20 14:27:36 -07:00
remote.c
remote.h
replace_object.c inline lookup_replace_object() calls 2011-05-15 15:23:33 -07:00
rerere.c Merge branch 'maint' 2011-05-30 00:09:55 -07:00
rerere.h rerere: libify rerere_clear() and rerere_gc() 2011-05-08 12:55:34 -07:00
resolve-undo.c
resolve-undo.h
revision.c Merge branch 'jc/notes-batch-removal' 2011-05-29 23:51:26 -07:00
revision.h Merge branch 'jk/format-patch-am' 2011-05-31 12:19:11 -07:00
run-command.c run-command: handle short writes and EINTR in die_child 2011-04-20 10:09:26 -07:00
run-command.h
send-pack.h
server-info.c
setup.c Merge branch 'maint' 2011-05-30 00:09:55 -07:00
sh-i18n--envsubst.c Merge branch 'ab/i18n-scripts-basic' 2011-06-17 11:40:32 -07:00
sha1-array.c receive-pack: eliminate duplicate .have refs 2011-05-19 20:02:31 -07:00
sha1-array.h receive-pack: eliminate duplicate .have refs 2011-05-19 20:02:31 -07:00
sha1-lookup.c
sha1-lookup.h
sha1_file.c Merge branch 'jc/bigfile' 2011-05-25 16:23:26 -07:00
sha1_name.c Merge branch 'jc/magic-pathspec' 2011-05-23 09:58:35 -07:00
shallow.c
shell.c shell: add missing initialization of argv0_path 2011-05-05 09:32:28 -07:00
shortlog.h
show-index.c
sideband.c
sideband.h
sigchain.c
sigchain.h
strbuf.c Merge branch 'ef/maint-strbuf-init' 2011-04-27 11:36:43 -07:00
strbuf.h strbuf: clarify assertion in strbuf_setlen() 2011-04-27 10:52:15 -07:00
string-list.c
string-list.h standardize brace placement in struct definitions 2011-03-16 12:49:02 -07:00
submodule.c Submodules: Don't parse .gitmodules when it contains, merge conflicts 2011-05-14 10:57:56 -07:00
submodule.h
symlinks.c
tag.c
tag.h
tar.h
test-chmtime.c
test-ctype.c
test-date.c
test-delta.c
test-dump-cache-tree.c
test-genrandom.c
test-index-version.c
test-line-buffer.c vcs-svn: remove buffer_read_string 2011-03-26 00:17:35 -05:00
test-match-trees.c
test-mktemp.c
test-obj-pool.c
test-parse-options.c
test-path-utils.c Name make_*_path functions more accurately 2011-03-17 16:08:30 -07:00
test-run-command.c tests: check error message from run_command 2011-04-20 10:08:54 -07:00
test-sha1.c
test-sha1.sh
test-sigchain.c
test-string-pool.c
test-subprocess.c Remove unused variables 2011-03-22 11:43:27 -07:00
test-svn-fe.c
test-treap.c
thread-utils.c Fix sparse warnings 2011-03-22 10:16:54 -07:00
thread-utils.h
trace.c Fix sparse warnings 2011-03-22 10:16:54 -07:00
transport-helper.c Remove unused variables 2011-03-22 11:43:27 -07:00
transport.c Merge branch 'maint' 2011-05-30 00:09:55 -07:00
transport.h refactor refs_from_alternate_cb to allow passing extra data 2011-05-19 20:01:10 -07:00
tree-diff.c Merge branch 'jk/diff-not-so-quick' 2011-06-06 11:40:14 -07:00
tree-walk.c pathspec: rename per-item field has_wildcard to use_wildcard 2011-04-05 09:30:36 -07:00
tree-walk.h
tree.c Convert read_tree{,_recursive} to support struct pathspec 2011-03-25 09:20:33 -07:00
tree.h Convert read_tree{,_recursive} to support struct pathspec 2011-03-25 09:20:33 -07:00
unimplemented.sh
unpack-trees.c unpack-trees: add the dry_run flag to unpack_trees_options 2011-05-25 14:32:02 -07:00
unpack-trees.h unpack-trees: add the dry_run flag to unpack_trees_options 2011-05-25 14:32:02 -07:00
upload-pack.c Merge branch 'jk/maint-upload-pack-shallow' into maint 2011-05-04 14:58:13 -07:00
url.c Fix sparse warnings 2011-03-22 10:16:54 -07:00
url.h
usage.c Fix sparse warnings 2011-03-22 10:16:54 -07:00
userdiff.c userdiff/perl: tighten BEGIN/END block pattern to reject here-doc delimiters 2011-05-23 11:39:13 -07:00
userdiff.h
utf8.c
utf8.h
walker.c
walker.h
wrap-for-bin.sh
wrapper.c read_in_full: always report errors 2011-05-26 13:54:18 -07:00
write_or_die.c
ws.c
wt-status.c Merge branch 'ab/i18n-st' 2011-04-01 17:55:55 -07:00
wt-status.h Merge branch 'jn/status-translatable' 2011-03-19 23:24:19 -07:00
xdiff-interface.c add, merge, diff: do not use strcasecmp to compare config variable names 2011-05-14 18:53:39 -07:00
xdiff-interface.h
zlib.c

README

////////////////////////////////////////////////////////////////

	GIT - the stupid content tracker

////////////////////////////////////////////////////////////////

"git" can mean anything, depending on your mood.

 - random three-letter combination that is pronounceable, and not
   actually used by any common UNIX command.  The fact that it is a
   mispronunciation of "get" may or may not be relevant.
 - stupid. contemptible and despicable. simple. Take your pick from the
   dictionary of slang.
 - "global information tracker": you're in a good mood, and it actually
   works for you. Angels sing, and a light suddenly fills the room.
 - "goddamn idiotic truckload of sh*t": when it breaks

Git is a fast, scalable, distributed revision control system with an
unusually rich command set that provides both high-level operations
and full access to internals.

Git is an Open Source project covered by the GNU General Public License.
It was originally written by Linus Torvalds with the help of a group of
hackers around the net. It is currently maintained by Junio C Hamano.

Please read the file INSTALL for installation instructions.

See Documentation/gittutorial.txt to get started, then see
Documentation/everyday.txt for a useful minimum set of commands, and
Documentation/git-commandname.txt for documentation of each command.
If git has been correctly installed, then the tutorial can also be
read with "man gittutorial" or "git help tutorial", and the
documentation of each command with "man git-commandname" or "git help
commandname".

CVS users may also want to read Documentation/gitcvs-migration.txt
("man gitcvs-migration" or "git help cvs-migration" if git is
installed).

Many Git online resources are accessible from http://git-scm.com/
including full documentation and Git related tools.

The user discussion and development of Git take place on the Git
mailing list -- everyone is welcome to post bug reports, feature
requests, comments and patches to git@vger.kernel.org. To subscribe
to the list, send an email with just "subscribe git" in the body to
majordomo@vger.kernel.org. The mailing list archives are available at
http://marc.theaimsgroup.com/?l=git and other archival sites.

The messages titled "A note from the maintainer", "What's in
git.git (stable)" and "What's cooking in git.git (topics)" and
the discussion following them on the mailing list give a good
reference for project status, development direction and
remaining tasks.