When we want to get the list of modified files, we first
expand any user-provided pathspecs with "ls-files", and then
feed the resulting list of paths as arguments to
"diff-index" and "diff-files". If your pathspec expands into
a large number of paths, you may run into one of two
problems:
1. The OS may complain about the size of the argument
list, and refuse to run. For example:
$ (ulimit -s 128 && git add -p drivers)
Can't exec "git": Argument list too long at .../git-add--interactive line 177.
Died at .../git-add--interactive line 177.
That's on the linux.git repository, which has about 20K
files in the "drivers" directory (none of them modified
in this case). The "ulimit -s" trick is necessary to
show the problem on Linux even for such a gigantic set
of paths. Other operating systems have much smaller
limits (e.g., a real-world case was seen with only 5K
files on OS X).
2. Even when it does work, it's really slow. The pathspec
code is not optimized for huge numbers of paths. Here's
the same case without the ulimit:
$ time git add -p drivers
No changes.
real 0m16.559s
user 0m53.140s
sys 0m0.220s
We can improve this by skipping "ls-files" completely, and
just feeding the original pathspecs to the diff commands.
This solution was discussed in 2010:
http://public-inbox.org/git/20100105041438.GB12574@coredump.intra.peff.net/
but at the time the diff code's pathspecs were more
primitive than those used by ls-files (e.g., they did not
support globs). Making the change would have caused a
user-visible regression, so we didn't.
Since then, the pathspec code has been unified, and the diff
commands natively understand pathspecs like '*.c'.
This patch implements that solution. That skips the
argument-list limits, and the result runs much faster:
$ time git add -p drivers
No changes.
real 0m0.149s
user 0m0.116s
sys 0m0.080s
There are two new tests. The first just exercises the
globbing behavior to confirm that we are not causing a
regression there. The second checks the actual argument
behavior using GIT_TRACE. We _could_ do it with the "ulimit
-s" trick, as above. But that would mean the test could only
run where "ulimit -s" works. And tests of that sort are
expensive, because we have to come up with enough files to
actually bust the limit (we can't just shrink the "128" down
infinitely, since it is also the in-program stack size).
Finally, two caveats and possibilities for future work:
a. This fixes one argument-list expansion, but there may
be others. In fact, it's very likely that if you run
"git add -i" and select a large number of modified
files that the script would try to feed them all to a
single git command.
In practice this is probably fine. The real issue here
is that the argument list was growing with the _total_
number of files, not the number of modified or selected
files.
b. If the repository contains filenames with literal wildcard
characters (e.g., "foo*"), the original code expanded
them via "ls-files" and then fed those wildcard names
to "diff-index", which would have treated them as
wildcards. This was a bug, which is now fixed (though
unless you really go through some contortions with
":(literal)", it's likely that your original pathspec
would match whatever the accidentally-expanded wildcard
would anyway).
So this takes us one step closer to working correctly
with files whose names contain wildcard characters, but
it's likely that others remain (e.g., if "git add -i"
feeds the selected paths to "git add").
Reported-by: Wincent Colaiuta <win@wincent.com>
Reported-by: Mislav Marohnić <mislav.marohnic@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
||
|---|---|---|
| Documentation | ||
| block-sha1 | ||
| builtin | ||
| ci | ||
| compat | ||
| contrib | ||
| ewah | ||
| git-gui | ||
| gitk-git | ||
| gitweb | ||
| mergetools | ||
| perl | ||
| po | ||
| ppc | ||
| refs | ||
| t | ||
| templates | ||
| vcs-svn | ||
| xdiff | ||
| .gitattributes | ||
| .gitignore | ||
| .mailmap | ||
| .travis.yml | ||
| COPYING | ||
| GIT-VERSION-GEN | ||
| INSTALL | ||
| LGPL-2.1 | ||
| Makefile | ||
| README.md | ||
| RelNotes | ||
| abspath.c | ||
| aclocal.m4 | ||
| advice.c | ||
| advice.h | ||
| alias.c | ||
| alloc.c | ||
| apply.c | ||
| apply.h | ||
| archive-tar.c | ||
| archive-zip.c | ||
| archive.c | ||
| archive.h | ||
| argv-array.c | ||
| argv-array.h | ||
| attr.c | ||
| attr.h | ||
| base85.c | ||
| bisect.c | ||
| bisect.h | ||
| blob.c | ||
| blob.h | ||
| branch.c | ||
| branch.h | ||
| builtin.h | ||
| bulk-checkin.c | ||
| bulk-checkin.h | ||
| bundle.c | ||
| bundle.h | ||
| cache-tree.c | ||
| cache-tree.h | ||
| cache.h | ||
| check-builtins.sh | ||
| check-racy.c | ||
| check_bindir | ||
| color.c | ||
| color.h | ||
| column.c | ||
| column.h | ||
| combine-diff.c | ||
| command-list.txt | ||
| commit-slab.h | ||
| commit.c | ||
| commit.h | ||
| common-main.c | ||
| config.c | ||
| config.mak.in | ||
| config.mak.uname | ||
| configure.ac | ||
| connect.c | ||
| connect.h | ||
| connected.c | ||
| connected.h | ||
| convert.c | ||
| convert.h | ||
| copy.c | ||
| credential-cache--daemon.c | ||
| credential-cache.c | ||
| credential-store.c | ||
| credential.c | ||
| credential.h | ||
| csum-file.c | ||
| csum-file.h | ||
| ctype.c | ||
| daemon.c | ||
| date.c | ||
| decorate.c | ||
| decorate.h | ||
| delta.h | ||
| diff-delta.c | ||
| diff-lib.c | ||
| diff-no-index.c | ||
| diff.c | ||
| diff.h | ||
| diffcore-break.c | ||
| diffcore-delta.c | ||
| diffcore-order.c | ||
| diffcore-pickaxe.c | ||
| diffcore-rename.c | ||
| diffcore.h | ||
| dir-iterator.c | ||
| dir-iterator.h | ||
| dir.c | ||
| dir.h | ||
| editor.c | ||
| entry.c | ||
| environment.c | ||
| exec_cmd.c | ||
| exec_cmd.h | ||
| fast-import.c | ||
| fetch-pack.c | ||
| fetch-pack.h | ||
| fmt-merge-msg.h | ||
| fsck.c | ||
| fsck.h | ||
| generate-cmdlist.sh | ||
| gettext.c | ||
| gettext.h | ||
| git-add--interactive.perl | ||
| git-archimport.perl | ||
| git-bisect.sh | ||
| git-compat-util.h | ||
| git-cvsexportcommit.perl | ||
| git-cvsimport.perl | ||
| git-cvsserver.perl | ||
| git-difftool--helper.sh | ||
| git-filter-branch.sh | ||
| git-instaweb.sh | ||
| git-merge-octopus.sh | ||
| git-merge-one-file.sh | ||
| git-merge-resolve.sh | ||
| git-mergetool--lib.sh | ||
| git-mergetool.sh | ||
| git-p4.py | ||
| git-parse-remote.sh | ||
| git-quiltimport.sh | ||
| git-rebase--am.sh | ||
| git-rebase--interactive.sh | ||
| git-rebase--merge.sh | ||
| git-rebase.sh | ||
| git-remote-testgit.sh | ||
| git-request-pull.sh | ||
| git-send-email.perl | ||
| git-sh-i18n.sh | ||
| git-sh-setup.sh | ||
| git-stash.sh | ||
| git-submodule.sh | ||
| git-svn.perl | ||
| git-web--browse.sh | ||
| git.c | ||
| git.rc | ||
| gpg-interface.c | ||
| gpg-interface.h | ||
| graph.c | ||
| graph.h | ||
| grep.c | ||
| grep.h | ||
| hashmap.c | ||
| hashmap.h | ||
| help.c | ||
| help.h | ||
| hex.c | ||
| http-backend.c | ||
| http-fetch.c | ||
| http-push.c | ||
| http-walker.c | ||
| http.c | ||
| http.h | ||
| ident.c | ||
| imap-send.c | ||
| iterator.h | ||
| khash.h | ||
| kwset.c | ||
| kwset.h | ||
| levenshtein.c | ||
| levenshtein.h | ||
| line-log.c | ||
| line-log.h | ||
| line-range.c | ||
| line-range.h | ||
| list-objects.c | ||
| list-objects.h | ||
| list.h | ||
| ll-merge.c | ||
| ll-merge.h | ||
| lockfile.c | ||
| lockfile.h | ||
| log-tree.c | ||
| log-tree.h | ||
| mailinfo.c | ||
| mailinfo.h | ||
| mailmap.c | ||
| mailmap.h | ||
| match-trees.c | ||
| merge-blobs.c | ||
| merge-blobs.h | ||
| merge-recursive.c | ||
| merge-recursive.h | ||
| merge.c | ||
| mergesort.c | ||
| mergesort.h | ||
| mru.c | ||
| mru.h | ||
| name-hash.c | ||
| notes-cache.c | ||
| notes-cache.h | ||
| notes-merge.c | ||
| notes-merge.h | ||
| notes-utils.c | ||
| notes-utils.h | ||
| notes.c | ||
| notes.h | ||
| object.c | ||
| object.h | ||
| pack-bitmap-write.c | ||
| pack-bitmap.c | ||
| pack-bitmap.h | ||
| pack-check.c | ||
| pack-objects.c | ||
| pack-objects.h | ||
| pack-revindex.c | ||
| pack-revindex.h | ||
| pack-write.c | ||
| pack.h | ||
| pager.c | ||
| parse-options-cb.c | ||
| parse-options.c | ||
| parse-options.h | ||
| patch-delta.c | ||
| patch-ids.c | ||
| patch-ids.h | ||
| path.c | ||
| pathspec.c | ||
| pathspec.h | ||
| pkt-line.c | ||
| pkt-line.h | ||
| preload-index.c | ||
| pretty.c | ||
| prio-queue.c | ||
| prio-queue.h | ||
| progress.c | ||
| progress.h | ||
| prompt.c | ||
| prompt.h | ||
| quote.c | ||
| quote.h | ||
| reachable.c | ||
| reachable.h | ||
| read-cache.c | ||
| ref-filter.c | ||
| ref-filter.h | ||
| reflog-walk.c | ||
| reflog-walk.h | ||
| refs.c | ||
| refs.h | ||
| remote-curl.c | ||
| remote-testsvn.c | ||
| remote.c | ||
| remote.h | ||
| replace_object.c | ||
| rerere.c | ||
| rerere.h | ||
| resolve-undo.c | ||
| resolve-undo.h | ||
| revision.c | ||
| revision.h | ||
| run-command.c | ||
| run-command.h | ||
| send-pack.c | ||
| send-pack.h | ||
| sequencer.c | ||
| sequencer.h | ||
| server-info.c | ||
| setup.c | ||
| sh-i18n--envsubst.c | ||
| sha1-array.c | ||
| sha1-array.h | ||
| sha1-lookup.c | ||
| sha1-lookup.h | ||
| sha1_file.c | ||
| sha1_name.c | ||
| shallow.c | ||
| shell.c | ||
| shortlog.h | ||
| show-index.c | ||
| sideband.c | ||
| sideband.h | ||
| sigchain.c | ||
| sigchain.h | ||
| split-index.c | ||
| split-index.h | ||
| strbuf.c | ||
| strbuf.h | ||
| streaming.c | ||
| streaming.h | ||
| string-list.c | ||
| string-list.h | ||
| submodule-config.c | ||
| submodule-config.h | ||
| submodule.c | ||
| submodule.h | ||
| symlinks.c | ||
| tag.c | ||
| tag.h | ||
| tar.h | ||
| tempfile.c | ||
| tempfile.h | ||
| thread-utils.c | ||
| thread-utils.h | ||
| tmp-objdir.c | ||
| tmp-objdir.h | ||
| trace.c | ||
| trace.h | ||
| trailer.c | ||
| trailer.h | ||
| transport-helper.c | ||
| transport.c | ||
| transport.h | ||
| tree-diff.c | ||
| tree-walk.c | ||
| tree-walk.h | ||
| tree.c | ||
| tree.h | ||
| unicode_width.h | ||
| unimplemented.sh | ||
| unix-socket.c | ||
| unix-socket.h | ||
| unpack-trees.c | ||
| unpack-trees.h | ||
| upload-pack.c | ||
| url.c | ||
| url.h | ||
| urlmatch.c | ||
| urlmatch.h | ||
| usage.c | ||
| userdiff.c | ||
| userdiff.h | ||
| utf8.c | ||
| utf8.h | ||
| varint.c | ||
| varint.h | ||
| version.c | ||
| version.h | ||
| versioncmp.c | ||
| walker.c | ||
| walker.h | ||
| wildmatch.c | ||
| wildmatch.h | ||
| worktree.c | ||
| worktree.h | ||
| wrap-for-bin.sh | ||
| wrapper.c | ||
| write_or_die.c | ||
| ws.c | ||
| wt-status.c | ||
| wt-status.h | ||
| xdiff-interface.c | ||
| xdiff-interface.h | ||
| zlib.c | ||
README.md
Git - fast, scalable, distributed revision control system
Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.
Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with help of a group of hackers around the net.
Please read the file INSTALL for installation instructions.
Many Git online resources are accessible from http://git-scm.com/ including full documentation and Git related tools.
See Documentation/gittutorial.txt to get started, then see
Documentation/giteveryday.txt for a useful minimum set of commands, and
Documentation/git-.txt for documentation of each command.
If git has been correctly installed, then the tutorial can also be
read with man gittutorial or git help tutorial, and the
documentation of each command with man git-<commandname> or git help <commandname>.
CVS users may also want to read Documentation/gitcvs-migration.txt
(man gitcvs-migration or git help cvs-migration if git is
installed).
The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org. The mailing list archives are available at https://public-inbox.org/git, http://marc.info/?l=git and other archival sites.
The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them give a good reference for project status, development direction and remaining tasks.
The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):
- random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
- stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
- "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
- "goddamn idiotic truckload of sh*t": when it breaks