René Scharfe be39144954 userdiff: support regexec(3) with multi-byte support
Since 1819ad327b (grep: fix multibyte regex handling under macOS,
2022-08-26) we use the system library for all regular expression
matching on macOS, not just for git grep.  It supports multi-byte
strings and rejects invalid multi-byte characters.

This broke all built-in userdiff word regexes in UTF-8 locales because
they all include such invalid bytes in expressions that are intended to
match multi-byte characters without explicit support for that from the
regex engine.

"|[^[:space:]]|[\xc0-\xff][\x80-\xbf]+" is added to all built-in word
regexes to match a single non-space or multi-byte character.  The \xNN
characters are invalid if interpreted as UTF-8 because they have their
high bit set, which indicates they are part of a multi-byte character,
but they are surrounded by single-byte characters.

Replace that expression with "|[^[:space:]]" if the regex engine
supports multi-byte matching, as there is no need to have an explicit
range for multi-byte characters then.  Check for that capability at
runtime, because it depends on the locale and thus on environment
variables.  Construct the full replacement expression at build time
and just switch it in if necessary to avoid string manipulation and
allocations at runtime.
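
The runtime check and the switch between two prebuilt expressions described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this commit: the helper names `regex_matches_multibyte_chars()` and `word_regex_tail()`, the two macro names, and the probe character are all hypothetical.

```c
#include <assert.h>
#include <regex.h>

/*
 * Hypothetical helper (name and probe string are assumptions):
 * returns 1 if regexec(3) treats a UTF-8 multi-byte character as a
 * single character in the current locale, 0 otherwise.
 */
static int regex_matches_multibyte_chars(void)
{
	static const char not_space[] = "[^[:space:]]";
	/* U+00E9 (e with acute accent) encoded as two UTF-8 bytes */
	static const char probe[] = "\xc3\xa9";
	static int cached = -1;
	regex_t re;
	regmatch_t m;
	int rc;

	if (cached != -1)
		return cached;
	rc = regcomp(&re, not_space, REG_EXTENDED);
	assert(rc == 0); /* the pattern is a valid constant */
	(void)rc;
	/* A multi-byte-aware engine consumes both bytes in one match. */
	cached = !regexec(&re, probe, 1, &m, 0) && m.rm_eo - m.rm_so > 1;
	regfree(&re);
	return cached;
}

/* Both alternatives exist as string literals at build time. */
#define WORD_TAIL_BYTES "|[^[:space:]]|[\xc0-\xff][\x80-\xbf]+"
#define WORD_TAIL_MB    "|[^[:space:]]"

/* Pick the suffix to append to a built-in word regex; no allocation. */
static const char *word_regex_tail(void)
{
	return regex_matches_multibyte_chars() ? WORD_TAIL_MB : WORD_TAIL_BYTES;
}
```

The result depends on the active locale, so a caller would be expected to have run `setlocale(LC_ALL, "")` (or equivalent) before the first call. Caching the answer in a static variable keeps the probe to a single `regcomp()`/`regexec()` pair per process.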

Additionally the word regex for tex contains the expression
"[a-zA-Z0-9\x80-\xff]+" with a similarly invalid range.  The best
replacement with only valid characters that I can come up with is
"([a-zA-Z0-9]|[^\x01-\x7f])+".  Unlike the original it matches NUL
characters, though.  Assuming that tex files usually don't contain NUL
this should be acceptable.

Reported-by: D. Ben Knoble <ben.knoble@gmail.com>
Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-07 07:38:09 -07:00
.github Merge branch 'tb/ci-concurrency' into maint-2.39 2023-02-14 14:15:46 -08:00
Documentation Prepare for 2.39.3 just in case 2023-02-14 14:15:57 -08:00
block-sha1
builtin Merge branch 'rs/am-parse-options-cleanup' into maint-2.39 2023-02-14 14:15:56 -08:00
ci Merge branch 'jx/ci-ubuntu-fix' into maint-2.38 2022-12-10 16:17:47 +09:00
compat use enhanced basic regular expressions on macOS 2023-01-08 10:06:34 +09:00
contrib Merge branch 'ab/fewer-the-index-macros' 2022-12-01 18:38:07 +09:00
ewah
git-gui Makefiles: change search through $(MAKEFLAGS) for GNU make 4.4 2022-12-01 07:24:12 +09:00
gitk-git
gitweb
mergetools
negotiator
oss-fuzz
perl
po l10n: zh_TW.po: Git 2.39-rc2 2022-12-11 01:27:25 +08:00
refs Merge branch 'ps/fsync-refs-fix' into maint-2.39 2023-02-14 14:15:50 -08:00
reftable
sha1collisiondetection@855827c583
sha1dc
sha256
t userdiff: support regexec(3) with multi-byte support 2023-04-07 07:38:09 -07:00
templates
trace2
xdiff xdiff: mark unused parameter in xdl_call_hunk_func() 2022-12-13 22:16:23 +09:00
.cirrus.yml
.clang-format
.editorconfig
.gitattributes
.gitignore Merge branch 'ab/coccicheck-incremental' 2022-11-23 11:22:23 +09:00
.gitmodules
.mailmap mailmap: update email address of Matheus Tavares 2022-12-10 09:17:36 +09:00
.tsan-suppressions
CODE_OF_CONDUCT.md
COPYING
GIT-VERSION-GEN Prepare for 2.39.3 just in case 2023-02-14 14:15:57 -08:00
INSTALL Sync with 2.38.4 2023-02-06 09:43:39 +01:00
LGPL-2.1
Makefile use enhanced basic regular expressions on macOS 2023-01-08 10:06:34 +09:00
README.md
RelNotes Prepare for 2.39.3 just in case 2023-02-14 14:15:57 -08:00
SECURITY.md
abspath.c
aclocal.m4
add-interactive.c diff: mark unused parameters in callbacks 2022-12-13 22:16:23 +09:00
add-interactive.h
add-patch.c read-cache API & users: make discard_index() return void 2022-11-21 12:06:15 +09:00
advice.c
advice.h
alias.c
alias.h
alloc.c
alloc.h
apply.c Merge branch 'jk/unused-post-2.39' into maint-2.39 2023-02-14 14:15:55 -08:00
apply.h
archive-tar.c
archive-zip.c
archive.c
archive.h
attr.c Sync with maint-2.37 2023-01-19 13:48:26 -08:00
attr.h Merge branch 'maint-2.35' into maint-2.36 2022-12-13 21:19:11 +09:00
banned.h
base85.c
bisect.c
bisect.h
blame.c
blame.h
blob.c blob: drop unused parts of parse_blob_buffer() 2022-12-13 22:16:22 +09:00
blob.h blob: drop unused parts of parse_blob_buffer() 2022-12-13 22:16:22 +09:00
bloom.c
bloom.h
branch.c
branch.h
builtin.h
bulk-checkin.c
bulk-checkin.h
bundle-uri.c
bundle-uri.h
bundle.c
bundle.h
cache-tree.c
cache-tree.h
cache.h ws: drop unused parameter from ws_blank_line() 2022-12-13 22:16:23 +09:00
cbtree.c
cbtree.h
chdir-notify.c
chdir-notify.h
check-builtins.sh
checkout.c
checkout.h
chunk-format.c
chunk-format.h
color.c
color.h
column.c utf8: fix truncated string lengths in `utf8_strnwidth()` 2022-12-09 14:26:21 +09:00
column.h
combine-diff.c diff: mark unused parameters in callbacks 2022-12-13 22:16:23 +09:00
command-list.txt
commit-graph.c
commit-graph.h
commit-reach.c
commit-reach.h
commit-slab-decl.h
commit-slab-impl.h
commit-slab.h
commit.c Merge branch 'rs/clear-commit-marks-cleanup' into maint-2.39 2023-02-14 14:15:56 -08:00
commit.h
common-main.c
config.c Merge branch 'pw/config-int-parse-fixes' 2022-11-28 12:13:43 +09:00
config.h
config.mak.dev
config.mak.in
config.mak.uname use enhanced basic regular expressions on macOS 2023-01-08 10:06:34 +09:00
configure.ac
connect.c server_supports_v2(): use a separate function for die_on_error 2022-12-13 22:08:52 +09:00
connect.h server_supports_v2(): use a separate function for die_on_error 2022-12-13 22:08:52 +09:00
connected.c receive-pack: only use visible refs for connectivity check 2022-11-17 16:22:52 -05:00
connected.h receive-pack: only use visible refs for connectivity check 2022-11-17 16:22:52 -05:00
convert.c
convert.h
copy.c
credential.c
credential.h
csum-file.c
csum-file.h
ctype.c
daemon.c
date.c
date.h
decorate.c
decorate.h
delta-islands.c delta-islands: free island-related data after use 2022-11-18 18:30:49 -05:00
delta-islands.h
delta.h
detect-compiler
diagnose.c
diagnose.h
diff-delta.c
diff-lib.c diff: mark unused parameters in callbacks 2022-12-13 22:16:23 +09:00
diff-merges.c diff-merges: cleanup set_diff_merges() 2022-09-16 09:21:43 -07:00
diff-merges.h
diff-no-index.c
diff.c Merge branch 'jk/unused-post-2.39' into maint-2.39 2023-02-14 14:15:55 -08:00
diff.h
diffcore-break.c
diffcore-delta.c
diffcore-order.c
diffcore-pickaxe.c
diffcore-rename.c
diffcore-rotate.c
diffcore.h
dir-iterator.c dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS 2023-01-24 16:52:16 -08:00
dir-iterator.h dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS 2023-01-24 16:52:16 -08:00
dir.c dir: check for single file cone patterns 2023-01-05 11:14:28 +09:00
dir.h
editor.c
entry.c
entry.h
environment.c
environment.h
exec-cmd.c
exec-cmd.h
fetch-negotiator.c
fetch-negotiator.h
fetch-pack.c server_supports_v2(): use a separate function for die_on_error 2022-12-13 22:08:52 +09:00
fetch-pack.h
fmt-merge-msg.c
fmt-merge-msg.h
fsck.c Merge branch 'maint-2.36' into maint-2.37 2022-12-13 21:20:35 +09:00
fsck.h Sync with 2.38.3 2022-12-13 21:25:15 +09:00
fsmonitor--daemon.h
fsmonitor-ipc.c
fsmonitor-ipc.h
fsmonitor-path-utils.h
fsmonitor-settings.c
fsmonitor-settings.h
fsmonitor.c
fsmonitor.h
generate-cmdlist.sh
generate-configlist.sh
generate-hooklist.sh
gettext.c
gettext.h
git-add--interactive.perl
git-archimport.perl
git-bisect.sh bisect--helper: parse subcommand with OPT_SUBCOMMAND 2022-11-11 17:04:57 -05:00
git-compat-util.h Merge branch 'rs/use-enhanced-bre-on-macos' into maint-2.39 2023-02-14 14:15:49 -08:00
git-curl-compat.h http: support CURLOPT_PROTOCOLS_STR 2023-02-06 09:27:09 +01:00
git-cvsexportcommit.perl
git-cvsimport.perl
git-cvsserver.perl
git-difftool--helper.sh
git-filter-branch.sh
git-instaweb.sh
git-merge-octopus.sh
git-merge-one-file.sh
git-merge-resolve.sh
git-mergetool--lib.sh
git-mergetool.sh
git-p4.py
git-quiltimport.sh
git-request-pull.sh
git-send-email.perl
git-sh-i18n.sh
git-sh-setup.sh
git-submodule.sh
git-svn.perl
git-web--browse.sh
git.c Merge branch 'ab/submodule-helper-prep-only' 2022-11-23 11:22:22 +09:00
git.rc
gpg-interface.c
gpg-interface.h
graph.c
graph.h
grep.c
grep.h
hash-lookup.c
hash-lookup.h
hash.h
hashmap.c
hashmap.h
help.c
help.h
hex.c
hook.c
hook.h
http-backend.c
http-fetch.c
http-push.c Sync with 2.36.5 2023-02-06 09:38:31 +01:00
http-walker.c
http.c Sync with 2.38.4 2023-02-06 09:43:39 +01:00
http.h Sync with 2.37.6 2023-02-06 09:43:28 +01:00
ident.c
imap-send.c
iterator.h
json-writer.c
json-writer.h
khash.h
kwset.c
kwset.h
levenshtein.c
levenshtein.h
line-log.c
line-log.h
line-range.c line-range: fix infinite loop bug with '$' regex 2022-12-20 10:00:43 +09:00
line-range.h
linear-assignment.c
linear-assignment.h
list-objects-filter-options.c
list-objects-filter-options.h
list-objects-filter.c Merge branch 'jk/unused-post-2.39' into maint-2.39 2023-02-14 14:15:55 -08:00
list-objects-filter.h
list-objects.c list-objects: drop process_gitlink() function 2022-12-13 22:16:22 +09:00
list-objects.h
list.h
ll-merge.c
ll-merge.h
lockfile.c
lockfile.h
log-tree.c
log-tree.h
ls-refs.c ls-refs: use repository parameter to iterate refs 2022-12-13 22:16:22 +09:00
ls-refs.h
mailinfo.c
mailinfo.h
mailmap.c
mailmap.h
match-trees.c
mem-pool.c
mem-pool.h
merge-blobs.c
merge-blobs.h
merge-ort-wrappers.c
merge-ort-wrappers.h
merge-ort.c
merge-ort.h
merge-recursive.c merge-recursive: fix variable typo in error message 2022-11-27 10:26:10 +09:00
merge-recursive.h
merge.c
mergesort.h
midx.c
midx.h
name-hash.c
notes-cache.c
notes-cache.h
notes-merge.c
notes-merge.h
notes-utils.c
notes-utils.h
notes.c
notes.h
object-file.c object-file: inline write_buffer() 2022-12-14 10:29:19 +09:00
object-name.c
object-store.h
object.c blob: drop unused parts of parse_blob_buffer() 2022-12-13 22:16:22 +09:00
object.h
oid-array.c
oid-array.h
oidmap.c
oidmap.h
oidset.c
oidset.h
oidtree.c
oidtree.h
pack-bitmap-write.c
pack-bitmap.c
pack-bitmap.h
pack-check.c
pack-mtimes.c
pack-mtimes.h
pack-objects.c
pack-objects.h
pack-revindex.c
pack-revindex.h
pack-write.c git: remove duplicate includes 2022-12-15 09:09:38 +09:00
pack.h
packfile.c
packfile.h
pager.c
parallel-checkout.c
parallel-checkout.h
parse-options-cb.c
parse-options.c
parse-options.h
patch-delta.c
patch-ids.c
patch-ids.h
path.c
path.h
pathspec.c
pathspec.h
pkt-line.c
pkt-line.h
preload-index.c
pretty.c Sync with Git 2.37.5 2022-12-13 21:23:36 +09:00
pretty.h
prio-queue.c
prio-queue.h
progress.c
progress.h
promisor-remote.c
promisor-remote.h
prompt.c
prompt.h
protocol-caps.c
protocol-caps.h
protocol.c
protocol.h
prune-packed.c
prune-packed.h
quote.c
quote.h
range-diff.c diff: mark unused parameters in callbacks 2022-12-13 22:16:23 +09:00
range-diff.h
reachable.c
reachable.h
read-cache.c read-cache API & users: make discard_index() return void 2022-11-21 12:06:15 +09:00
rebase-interactive.c
rebase-interactive.h
rebase.c
rebase.h
ref-filter.c ls-refs: use repository parameter to iterate refs 2022-12-13 22:16:22 +09:00
ref-filter.h
reflog-walk.c
reflog-walk.h
reflog.c Merge branch 'rs/reflog-expiry-cleanup' into maint-2.39 2023-02-14 14:15:56 -08:00
reflog.h
refs.c ls-refs: use repository parameter to iterate refs 2022-12-13 22:16:22 +09:00
refs.h ls-refs: use repository parameter to iterate refs 2022-12-13 22:16:22 +09:00
refspec.c
refspec.h
remote-curl.c Sync with 2.37.6 2023-02-06 09:43:28 +01:00
remote.c
remote.h
replace-object.c
replace-object.h
repo-settings.c
repository.c {builtin/*,repository}.c: add & use "USE_THE_INDEX_VARIABLE" 2022-11-21 12:06:15 +09:00
repository.h
rerere.c
rerere.h
reset.c rebase: use 'skip_cache_tree_update' option 2022-11-10 21:49:34 -05:00
reset.h
resolve-undo.c
resolve-undo.h
revision.c diff: mark unused parameters in callbacks 2022-12-13 22:16:23 +09:00
revision.h Merge branch 'ps/receive-use-only-advertised' 2022-11-23 11:22:25 +09:00
run-command.c
run-command.h
scalar.c Merge branch 'js/remove-stale-scalar-repos' 2022-11-23 11:22:23 +09:00
send-pack.c
send-pack.h
sequencer.c git: remove duplicate includes 2022-12-15 09:09:38 +09:00
sequencer.h sequencer: stop exporting GIT_REFLOG_ACTION 2022-11-09 18:15:43 -05:00
serve.c
serve.h
server-info.c
setup.c
sh-i18n--envsubst.c
sha1dc_git.c
sha1dc_git.h
shallow.c
shallow.h
shared.mak Merge branch 'ab/gnumake-4.4-fix' 2022-12-01 18:38:07 +09:00
shell.c
shortlog.h
sideband.c
sideband.h
sigchain.c
sigchain.h
simple-ipc.h
sparse-index.c
sparse-index.h
split-index.c
split-index.h
stable-qsort.c
strbuf.c
strbuf.h
streaming.c
streaming.h
string-list.c
string-list.h
strmap.c
strmap.h
strvec.c
strvec.h
sub-process.c
sub-process.h
submodule-config.c
submodule-config.h
submodule.c diff: mark unused parameters in callbacks 2022-12-13 22:16:23 +09:00
submodule.h
symlinks.c
tag.c
tag.h
tar.h
tempfile.c
tempfile.h
thread-utils.c
thread-utils.h
tmp-objdir.c
tmp-objdir.h
trace.c
trace.h
trace2.c
trace2.h
trailer.c
trailer.h
transport-helper.c
transport-internal.h
transport.c
transport.h
tree-diff.c
tree-walk.c
tree-walk.h
tree.c
tree.h
unicode-width.h
unimplemented.sh
unix-socket.c
unix-socket.h
unix-stream-server.c
unix-stream-server.h
unpack-trees.c git: remove duplicate includes 2022-12-15 09:09:38 +09:00
unpack-trees.h unpack-trees: add 'skip_cache_tree_update' option 2022-11-10 21:49:34 -05:00
upload-pack.c refs: get rid of global list of hidden refs 2022-11-17 16:22:51 -05:00
upload-pack.h
url.c
url.h
urlmatch.c
urlmatch.h
usage.c
userdiff.c userdiff: support regexec(3) with multi-byte support 2023-04-07 07:38:09 -07:00
userdiff.h userdiff: support regexec(3) with multi-byte support 2023-04-07 07:38:09 -07:00
utf8.c Sync with Git 2.31.6 2022-12-13 21:09:40 +09:00
utf8.h Sync with Git 2.31.6 2022-12-13 21:09:40 +09:00
varint.c
varint.h
version.c
version.h
versioncmp.c
walker.c
walker.h
wildmatch.c
wildmatch.h
worktree.c
worktree.h
wrap-for-bin.sh
wrapper.c
write-or-die.c
ws.c ws: drop unused parameter from ws_blank_line() 2022-12-13 22:16:23 +09:00
wt-status.c diff: mark unused parameters in callbacks 2022-12-13 22:16:23 +09:00
wt-status.h
xdiff-interface.c
xdiff-interface.h
zlib.c

README.md

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission and Documentation/CodingGuidelines).

Those wishing to help with error message, usage and informational message string translations (localization, l10n) should see po/README.md (a po file is a Portable Object file that holds the translations).

To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org (not the Git list). The mailing list archives are available at https://lore.kernel.org/git/, http://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks