Ævar Arnfjörð Bjarmason 3878c7a540 perf: add a comparison test of grep regex engines
Add a very basic performance comparison test comparing the POSIX
basic, extended and perl engines.

In theory the "basic" and "extended" engines should be implemented
using the same underlying code with a slightly different pattern
parser, but some implementations may not do this. Jump through some
slight hoops to test both, which is worthwhile since "basic" is the
default.
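
As a rough illustration of how close the two syntaxes are for the
patterns used here, the sketch below turns a "basic" pattern into its
"extended" spelling by dropping every backslash. The to_extended helper
is hypothetical and is not claimed to be what the test script does. It
is only valid when each backslash in the pattern escapes a grouping or
alternation operator. Note also that "\|" in a basic pattern is a GNU
extension rather than something POSIX guarantees.

```shell
#!/bin/sh
# Hypothetical helper: naive basic -> extended pattern conversion.
# Assumption: every backslash in the input escapes '(', ')' or '|',
# so removing all backslashes yields the equivalent extended pattern.
to_extended() {
	# printf is used instead of echo so backslashes are passed through
	# unchanged regardless of which shell runs this.
	printf '%s\n' "$1" | sed 's/\\//g'
}

to_extended '\(e.t[^ ]*\|v.ry\) rare'
to_extended 'how.to'
```

Patterns without backslashes, such as 'how.to', pass through unchanged,
which is why the same string can be fed to all three engines.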

Running this on an i7 3.4GHz Linux 4.9.0-2 Debian testing against a
checkout of linux.git & latest upstream PCRE, both PCRE and git
compiled with -O3 using gcc 7.1.1:

    $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p7820-grep-engines.sh
    [...]
    Test                                            this tree
    ---------------------------------------------------------------
    7820.1: basic grep 'how.to'                     0.34(1.24+0.53)
    7820.2: extended grep 'how.to'                  0.33(1.23+0.45)
    7820.3: perl grep 'how.to'                      0.31(1.05+0.56)
    7820.5: basic grep '^how to'                    0.32(1.24+0.42)
    7820.6: extended grep '^how to'                 0.33(1.20+0.44)
    7820.7: perl grep '^how to'                     0.57(2.67+0.42)
    7820.9: basic grep '[how] to'                   0.51(2.16+0.45)
    7820.10: extended grep '[how] to'               0.49(2.20+0.43)
    7820.11: perl grep '[how] to'                   0.56(2.60+0.43)
    7820.13: basic grep '\(e.t[^ ]*\|v.ry\) rare'   0.66(3.25+0.40)
    7820.14: extended grep '(e.t[^ ]*|v.ry) rare'   0.65(3.19+0.46)
    7820.15: perl grep '(e.t[^ ]*|v.ry) rare'       1.05(5.74+0.34)
    7820.17: basic grep 'm\(ú\|u\)lt.b\(æ\|y\)te'   0.34(1.28+0.47)
    7820.18: extended grep 'm(ú|u)lt.b(æ|y)te'      0.34(1.38+0.38)
    7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'          0.39(1.56+0.44)

Options can also be passed to git-grep via the GIT_PERF_7820_GREP_OPTS
environment variable. There are various modes such as "-v" that have
very different performance profiles, but handling the combinatorial
explosion of testing all those options would make this script much
more complex and harder to maintain. Instead just add the ability to
do one-shot runs with arbitrary options, e.g.:

    $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_7820_GREP_OPTS=" -i" ./run p7820-grep-engines.sh
    [...]
    Test                                               this tree
    ------------------------------------------------------------------
    7820.1: basic grep -i 'how.to'                     0.49(1.72+0.38)
    7820.2: extended grep -i 'how.to'                  0.46(1.64+0.42)
    7820.3: perl grep -i 'how.to'                      0.44(1.45+0.45)
    7820.5: basic grep -i '^how to'                    0.47(1.76+0.38)
    7820.6: extended grep -i '^how to'                 0.47(1.70+0.42)
    7820.7: perl grep -i '^how to'                     0.65(2.72+0.37)
    7820.9: basic grep -i '[how] to'                   0.86(3.64+0.42)
    7820.10: extended grep -i '[how] to'               0.84(3.62+0.46)
    7820.11: perl grep -i '[how] to'                   0.73(3.06+0.39)
    7820.13: basic grep -i '\(e.t[^ ]*\|v.ry\) rare'   1.63(8.13+0.36)
    7820.14: extended grep -i '(e.t[^ ]*|v.ry) rare'   1.64(8.01+0.44)
    7820.15: perl grep -i '(e.t[^ ]*|v.ry) rare'       1.44(6.88+0.44)
    7820.17: basic grep -i 'm\(ú\|u\)lt.b\(æ\|y\)te'   0.66(2.67+0.44)
    7820.18: extended grep -i 'm(ú|u)lt.b(æ|y)te'      0.66(2.67+0.43)
    7820.19: perl grep -i 'm(ú|u)lt.b(æ|y)te'          0.59(2.31+0.37)
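
The leading space in " -i" above suggests the value is appended
directly after "grep". A minimal sketch of that idea follows, assuming
the engine is selected through the grep.patternType configuration
variable. The grep_cmd helper is hypothetical and only prints the
command line so it can be inspected. It is not claimed to be the test
script's actual code.

```shell
#!/bin/sh
# Hypothetical helper: print the git-grep command line for one engine.
# Assumptions: the engine is chosen with "-c grep.patternType=<engine>",
# and GIT_PERF_7820_GREP_OPTS (if set) starts with a space so it can be
# appended directly after "grep".
grep_cmd() {
	engine=$1
	pattern=$2
	printf "git -c grep.patternType=%s grep%s -- '%s'\n" \
		"$engine" "${GIT_PERF_7820_GREP_OPTS:-}" "$pattern"
}

GIT_PERF_7820_GREP_OPTS=" -i"
for engine in basic extended perl
do
	grep_cmd "$engine" 'how.to'
done
```

With the variable unset or empty the same helper prints a plain
"git ... grep -- 'how.to'" command line, so the default run and the
one-shot run share one code path.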

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26 12:52:36 +09:00
Documentation grep & rev-list doc: stop promising libpcre for --perl-regexp 2017-05-21 08:25:37 +09:00
block-sha1
builtin Merge branch 'ja/i18n-cleanup' 2017-05-04 16:26:44 +09:00
ci Merge branch 'rg/a-the-typo' 2017-05-04 16:26:47 +09:00
compat
contrib Merge branch 'jk/complete-checkout-sans-dwim-remote' 2017-05-01 14:14:41 +09:00
ewah
git-gui
gitk-git
gitweb
mergetools
perl
po Merge branch 'master' of git://github.com/nafmo/git-l10n-sv 2017-05-09 22:12:34 +08:00
ppc
refs Merge branch 'mh/separate-ref-cache' 2017-04-26 15:39:13 +09:00
sha1dc
t perf: add a comparison test of grep regex engines 2017-05-26 12:52:36 +09:00
templates
vcs-svn
xdiff
.gitattributes
.gitignore
.mailmap
.travis.yml Merge branch 'ls/travis-stricter-linux32-builds' 2017-05-01 14:14:44 +09:00
COPYING
GIT-VERSION-GEN Git 2.13 2017-05-09 23:26:02 +09:00
INSTALL
LGPL-2.1
Makefile perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do 2017-05-21 08:25:38 +09:00
README.md
RelNotes Git 2.11.2 2017-05-05 13:29:43 +09:00
abspath.c
aclocal.m4
advice.c
advice.h
alias.c
alloc.c
apply.c
apply.h
archive-tar.c
archive-zip.c
archive.c
archive.h
argv-array.c
argv-array.h
attr.c
attr.h
base85.c
bisect.c Merge branch 'jk/war-on-git-path' 2017-04-26 15:39:08 +09:00
bisect.h
blob.c
blob.h
branch.c create_branch: use xstrfmt for reflog message 2017-03-30 14:59:50 -07:00
branch.h
builtin.h
bulk-checkin.c
bulk-checkin.h
bundle.c
bundle.h
cache-tree.c
cache-tree.h
cache.h Merge branch 'jh/add-index-entry-optim' 2017-04-26 15:39:07 +09:00
check-builtins.sh
check-racy.c
check_bindir
color.c
color.h
column.c
column.h
combine-diff.c Merge branch 'bc/object-id' 2017-04-19 21:37:13 -07:00
command-list.txt
commit-slab.h
commit.c
commit.h Rename sha1_array to oid_array 2017-03-31 08:33:56 -07:00
common-main.c
config.c Merge branch 'nd/conditional-config-in-early-config' 2017-04-26 15:39:05 +09:00
config.mak.in
config.mak.uname
configure.ac Makefile & configure: reword inaccurate comment about PCRE 2017-05-21 08:25:37 +09:00
connect.c Merge branch 'sf/putty-w-args' 2017-04-26 15:39:10 +09:00
connect.h
connected.c
connected.h
convert.c
convert.h
copy.c
credential-cache--daemon.c
credential-cache.c Merge branch 'nd/conditional-config-include' 2017-04-23 22:07:46 -07:00
credential-store.c path.c: and an option to call real_path() in expand_user_path() 2017-04-14 23:51:38 -07:00
credential.c
credential.h
csum-file.c
csum-file.h
ctype.c
daemon.c Merge branch 'dt/xgethostname-nul-termination' 2017-04-23 22:07:57 -07:00
date.c
decorate.c
decorate.h
delta.h
diff-delta.c
diff-lib.c
diff-no-index.c
diff.c fix minor typos 2017-05-01 11:01:52 +09:00
diff.h Rename sha1_array to oid_array 2017-03-31 08:33:56 -07:00
diffcore-break.c
diffcore-delta.c
diffcore-order.c
diffcore-pickaxe.c Merge branch 'js/regexec-buf' into maint 2017-03-28 13:52:24 -07:00
diffcore-rename.c
diffcore.h
dir-iterator.c
dir-iterator.h
dir.c Merge branch 'sb/checkout-recurse-submodules' 2017-03-28 14:05:58 -07:00
dir.h
editor.c
entry.c
environment.c Merge branch 'jk/snprintf-cleanups' 2017-04-16 23:29:26 -07:00
exec_cmd.c
exec_cmd.h
fast-import.c Merge branch 'jk/war-on-git-path' 2017-04-26 15:39:08 +09:00
fetch-pack.c Merge branch 'dt/xgethostname-nul-termination' 2017-04-23 22:07:57 -07:00
fetch-pack.h Rename sha1_array to oid_array 2017-03-31 08:33:56 -07:00
fmt-merge-msg.h
fsck.c Rename sha1_array to oid_array 2017-03-31 08:33:56 -07:00
fsck.h Rename sha1_array to oid_array 2017-03-31 08:33:56 -07:00
generate-cmdlist.sh
gettext.c
gettext.h
git-add--interactive.perl Merge branch 'va/i18n-perl-scripts' 2017-04-19 21:37:17 -07:00
git-archimport.perl
git-bisect.sh
git-compat-util.h Merge branch 'dt/xgethostname-nul-termination' 2017-04-23 22:07:57 -07:00
git-cvsexportcommit.perl
git-cvsimport.perl
git-cvsserver.perl
git-difftool--helper.sh
git-filter-branch.sh
git-instaweb.sh
git-merge-octopus.sh
git-merge-one-file.sh
git-merge-resolve.sh
git-mergetool--lib.sh
git-mergetool.sh
git-p4.py git-p4: don't use name-rev to get current branch 2017-04-16 21:13:26 -07:00
git-parse-remote.sh
git-quiltimport.sh
git-rebase--am.sh
git-rebase--interactive.sh
git-rebase--merge.sh
git-rebase.sh Merge branch 'gb/rebase-signoff' 2017-04-26 15:39:02 +09:00
git-remote-testgit.sh
git-request-pull.sh
git-send-email.perl
git-sh-i18n.sh
git-sh-setup.sh
git-stash.sh
git-submodule.sh submodule: prevent backslash expantion in submodule names 2017-04-16 20:09:36 -07:00
git-svn.perl
git-web--browse.sh
git.c Merge branch 'bw/recurse-submodules-relative-fix' 2017-03-30 14:07:15 -07:00
git.rc
gpg-interface.c
gpg-interface.h
graph.c
graph.h
grep.c convert unchecked snprintf into xsnprintf 2017-03-30 14:59:50 -07:00
grep.h
hash.h
hashmap.c
hashmap.h
help.c
help.h
hex.c
http-backend.c
http-fetch.c
http-push.c
http-walker.c Merge branch 'ew/http-alternates-as-redirects-warning' into maint 2017-03-28 13:52:23 -07:00
http.c Merge branch 'dt/http-postbuffer-can-be-large' 2017-04-23 22:07:45 -07:00
http.h http.postbuffer: allow full range of ssize_t values 2017-04-13 18:24:32 -07:00
ident.c Merge branch 'dt/xgethostname-nul-termination' 2017-04-23 22:07:57 -07:00
imap-send.c convert unchecked snprintf into xsnprintf 2017-03-30 14:59:50 -07:00
iterator.h
khash.h
kwset.c
kwset.h
levenshtein.c
levenshtein.h
line-log.c
line-log.h
line-range.c
line-range.h
list-objects.c
list-objects.h
list.h
ll-merge.c
ll-merge.h
lockfile.c
lockfile.h
log-tree.c
log-tree.h
mailinfo.c Merge branch 'lt/mailinfo-in-body-header-continuation' 2017-04-19 21:37:15 -07:00
mailinfo.h
mailmap.c
mailmap.h
match-trees.c
merge-blobs.c
merge-blobs.h
merge-recursive.c
merge-recursive.h
merge.c
mergesort.c
mergesort.h
mru.c
mru.h
name-hash.c name-hash: fix buffer overrun 2017-03-31 20:57:18 -07:00
notes-cache.c
notes-cache.h
notes-merge.c replace strbuf_addstr(git_path()) with git_path_buf() 2017-04-20 21:04:20 -07:00
notes-merge.h
notes-utils.c
notes-utils.h
notes.c
notes.h
object.c
object.h
oidset.c
oidset.h
pack-bitmap-write.c odb_mkstemp: write filename into strbuf 2017-03-28 15:28:04 -07:00
pack-bitmap.c
pack-bitmap.h
pack-check.c
pack-objects.c
pack-objects.h
pack-revindex.c
pack-revindex.h
pack-write.c odb_mkstemp: write filename into strbuf 2017-03-28 15:28:04 -07:00
pack.h
pager.c Merge branch 'jk/pager-in-use' 2017-03-28 14:05:59 -07:00
parse-options-cb.c Rename sha1_array to oid_array 2017-03-31 08:33:56 -07:00
parse-options.c
parse-options.h
patch-delta.c
patch-ids.c
patch-ids.h
path.c Merge branch 'nd/conditional-config-include' 2017-04-23 22:07:46 -07:00
pathspec.c Merge branch 'ps/pathspec-empty-prefix-origin' 2017-04-26 15:39:03 +09:00
pathspec.h
pkt-line.c
pkt-line.h
preload-index.c
pretty.c
prio-queue.c Merge branch 'jk/prio-queue-avoid-swap-with-self' 2017-05-01 14:14:43 +09:00
prio-queue.h
progress.c
progress.h
prompt.c
prompt.h
quote.c
quote.h
reachable.c
reachable.h
read-cache.c i18n: read-cache: typofix 2017-05-01 11:08:02 +09:00
ref-filter.c Merge branch 'bc/object-id' 2017-04-19 21:37:13 -07:00
ref-filter.h Merge branch 'bc/object-id' 2017-04-19 21:37:13 -07:00
reflog-walk.c
reflog-walk.h
refs.c Merge branch 'mh/separate-ref-cache' 2017-04-26 15:39:13 +09:00
refs.h refs_verify_refname_available(): implement once for all backends 2017-04-16 21:32:45 -07:00
remote-curl.c Merge branch 'dt/http-postbuffer-can-be-large' 2017-04-23 22:07:45 -07:00
remote-testsvn.c
remote.c Merge branch 'bw/push-options-recursively-to-submodules' 2017-04-19 21:37:14 -07:00
remote.h Merge branch 'bw/push-options-recursively-to-submodules' 2017-04-19 21:37:14 -07:00
replace_object.c
rerere.c
rerere.h
resolve-undo.c
resolve-undo.h
revision.c log: make --regexp-ignore-case work with --perl-regexp 2017-05-21 08:25:37 +09:00
revision.h Merge branch 'rs/path-name-safety-cleanup' into maint 2017-03-28 13:52:27 -07:00
run-command.c Merge branch 'jk/execv-dashed-external' into maint 2017-03-28 13:52:23 -07:00
run-command.h
send-pack.c Merge branch 'bc/object-id' 2017-04-19 21:37:13 -07:00
send-pack.h Rename sha1_array to oid_array 2017-03-31 08:33:56 -07:00
sequencer.c Merge branch 'sh/rebase-i-reread-todo-after-exec' 2017-05-01 14:14:44 +09:00
sequencer.h
server-info.c server-info: avoid calling fclose(3) twice in update_info_file() 2017-04-17 17:37:28 -07:00
setup.c Merge branch 'bw/recurse-submodules-relative-fix' 2017-03-30 14:07:15 -07:00
sh-i18n--envsubst.c
sha1-array.c Rename sha1_array to oid_array 2017-03-31 08:33:56 -07:00
sha1-array.h Rename sha1_array to oid_array 2017-03-31 08:33:56 -07:00
sha1-lookup.c
sha1-lookup.h
sha1_file.c Merge branch 'jk/loose-object-fsck' 2017-04-23 22:07:50 -07:00
sha1_name.c Merge branch 'bc/object-id' 2017-04-19 21:37:13 -07:00
shallow.c Rename sha1_array to oid_array 2017-03-31 08:33:56 -07:00
shell.c Merge branch 'maint-2.8' into maint-2.9 2017-05-05 13:13:48 +09:00
shortlog.h
show-index.c
sideband.c
sideband.h
sigchain.c
sigchain.h
split-index.c
split-index.h
strbuf.c Merge branch 'rs/freebsd-getcwd-workaround' 2017-03-30 14:07:15 -07:00
strbuf.h Merge branch 'jk/interpret-branch-name' into maint 2017-03-28 13:52:22 -07:00
streaming.c
streaming.h
string-list.c Merge branch 'jh/string-list-micro-optim' 2017-04-23 22:07:47 -07:00
string-list.h
submodule-config.c Merge branch 'sb/checkout-recurse-submodules' 2017-03-28 14:05:58 -07:00
submodule-config.h
submodule.c Merge branch 'sb/checkout-recurse-submodules' 2017-04-23 22:07:54 -07:00
submodule.h Merge branch 'nd/files-backend-git-dir' 2017-04-19 21:37:19 -07:00
symlinks.c
tag.c
tag.h
tar.h
tempfile.c
tempfile.h
thread-utils.c
thread-utils.h
tmp-objdir.c
tmp-objdir.h
trace.c
trace.h
trailer.c
trailer.h
transport-helper.c transport-helper: replace checked snprintf with xsnprintf 2017-03-30 14:59:50 -07:00
transport.c Merge branch 'bw/push-options-recursively-to-submodules' 2017-04-19 21:37:14 -07:00
transport.h
tree-diff.c
tree-walk.c
tree-walk.h
tree.c
tree.h
unicode_width.h
unimplemented.sh
unix-socket.c
unix-socket.h
unpack-trees.c Merge branch 'jh/unpack-trees-micro-optim' 2017-04-23 22:07:48 -07:00
unpack-trees.h
upload-pack.c
url.c
url.h
urlmatch.c
urlmatch.h
usage.c
userdiff.c
userdiff.h
utf8.c
utf8.h
varint.c
varint.h
version.c
version.h
versioncmp.c
walker.c
walker.h
wildmatch.c
wildmatch.h
worktree.c Merge branch 'rs/strbuf-add-real-path' into maint 2017-03-28 13:52:19 -07:00
worktree.h
wrap-for-bin.sh
wrapper.c Merge branch 'dt/xgethostname-nul-termination' 2017-04-23 22:07:57 -07:00
write_or_die.c
ws.c
wt-status.c short status: improve reporting for submodule changes 2017-03-29 15:27:54 -07:00
wt-status.h
xdiff-interface.c
xdiff-interface.h
zlib.c

README.md

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org. The mailing list archives are available at https://public-inbox.org/git/, http://marc.info/?l=git and other archival sites.

The maintainer frequently sends "What's cooking" reports to the mailing list; these list the current status of various development topics. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks