Derrick Stolee fd67d149bd commit-reach: implement ahead_behind() logic
Fully implement the commit-counting logic required to determine
ahead/behind counts for a batch of commit pairs. This is a new library
method within commit-reach.h. This method will be linked to the
for-each-ref builtin in the next change.

The interface for ahead_behind() uses two arrays. The first array of
commits contains the list of all starting points for the walk. This
includes all tip commits _and_ base commits. The second array specifies
base/tip pairs by pointing to commits within the first array, by index.
The second array also stores the resulting ahead/behind counts for each
of these pairs.
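
As an illustration only, the two-array layout described above could be modelled as follows. This is a Python sketch, not the C declarations in commit-reach.h. The field names base_index, tip_index, ahead and behind come from this message; the build_inputs() helper, the use of plain strings for commits, and the de-duplication of repeated commits are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AheadBehindCount:
    # Indices into the first array (the list of starting commits).
    base_index: int
    tip_index: int
    # Results, filled in by the walk.
    ahead: int = 0
    behind: int = 0


def build_inputs(pairs):
    """Turn (base, tip) name pairs into the two arrays.

    Hypothetical helper: each distinct commit is stored once in
    `commits`, and every pair refers to it by index.
    """
    commits = []
    index = {}
    counts = []
    for base, tip in pairs:
        for name in (base, tip):
            if name not in index:
                index[name] = len(commits)
                commits.append(name)
        counts.append(AheadBehindCount(index[base], index[tip]))
    return commits, counts


commits, counts = build_inputs([("main", "topic-a"), ("main", "topic-b")])
```

Here "main" appears once in the first array even though it is the base of both pairs, which is what lets a single walk serve every pair.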

This implementation of ahead_behind() allows multiple bases, if desired.
Even with multiple bases, there is only one commit walk used for
counting the ahead/behind values, saving time when the base/tip ranges
overlap significantly.

This interface for ahead_behind() also makes it very easy to call
ensure_generations_valid() on the entire array of bases and tips. This
call is necessary because it is critical that the walk that counts
ahead/behind values never walks a commit more than once. Without
generation numbers on every commit, there is a possibility that a
commit date skew could cause the walk to revisit a commit and then
double-count it. For this reason, it is strongly recommended that
ahead_behind() is only used in a repository with a commit-graph file that
covers most of the reachable commits, storing precomputed generation
numbers. If no commit-graph exists, this walk will be much slower as it
must walk all reachable commits in ensure_generations_valid() before
performing the counting logic.
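
To make the date-skew hazard concrete, here is a small Python sketch (not Git code). It assumes the simplest form of generation number, the topological level: one more than the largest level among a commit's parents. A commit-graph file may store a different variant. The commit names, the dates, and the helpers topo_levels() and walk_order() are invented for the example.

```python
import heapq


def topo_levels(parents):
    """Level of a commit = 1 + max level of its parents (roots get 1)."""
    level = {}
    stack = list(parents)
    while stack:
        c = stack[-1]
        if c in level:
            stack.pop()
            continue
        missing = [p for p in parents[c] if p not in level]
        if missing:
            stack.extend(missing)  # resolve parents first
        else:
            level[c] = 1 + max((level[p] for p in parents[c]), default=0)
            stack.pop()
    return level


def walk_order(parents, starts, priority):
    """Pop commits by highest priority first; return the visiting order."""
    queue = [(-priority[c], c) for c in starts]
    heapq.heapify(queue)
    seen = set(starts)
    order = []
    while queue:
        _, c = heapq.heappop(queue)
        order.append(c)
        for p in parents[c]:
            if p not in seen:
                seen.add(p)
                heapq.heappush(queue, (-priority[p], p))
    return order


# P is the root; S and X are children of P; Y is a child of S.
parents = {"P": [], "S": ["P"], "X": ["P"], "Y": ["S"]}
# S carries a skewed (too old) commit date.
dates = {"P": 300, "S": 50, "X": 400, "Y": 390}

by_date = walk_order(parents, ["X", "Y"], dates)
by_level = walk_order(parents, ["X", "Y"], topo_levels(parents))
```

Ordered by date, P is visited before its child S, so whatever S would pass down arrives too late. Ordered by level, every child is visited before its parent.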

It is possible to detect if generation numbers are available at run time
and redirect the implementation to another algorithm that does not
require this property. However, that implementation requires a commit
walk per base/tip pair _and_ can be slower due to the commit date
heuristics required. Such an implementation could be considered in the
future if there is a reason to include it, but most Git hosts should
already be generating a commit-graph file as part of repository
maintenance. Most Git clients should also be generating commit-graph
files as part of background maintenance or automatic GCs.

Now, let's discuss the ahead/behind counting algorithm.

The commits in the first array are considered the starting commits. The
index of each commit within that array will play a critical role.

We create a new commit slab that maps commits to a bitmap. For a given
commit (anywhere in the history), its bitmap stores information relative
to which of the input commits can reach that commit. The ith bit will be
on if the ith commit from the starting list can reach that commit. It is
important to notice that these bitmaps are not the typical "reachability
bitmaps" that are stored in .bitmap files. Instead of signalling which
objects are reachable from the current commit, they instead signal
"which starting commits can reach me?" It is also important to know that
the bitmap is not necessarily "complete" until we walk that commit. We
will perform a commit walk by generation number in such a way that we
can guarantee the bitmap is correct when we visit that commit.

At the beginning of the ahead_behind() method, we initialize the bitmaps
for each of the starting commits. By enabling the ith bit for the ith
starting commit, we signal "the ith commit can reach itself."

We walk commits by popping the commit with maximum generation number out
of the queue, guaranteeing that we will never walk a child of that
commit in any future steps.

As we walk, we load the bitmap for the current commit and perform two
main steps. The _second_ step examines each parent of the current commit
and adds the current commit's bitmap bits to each parent's bitmap. (We
create a new bitmap for the parent if this is our first time seeing that
parent.) After adding the bits to the parent's bitmap, the parent is
added to the walk queue. Due to this passing of bits to parents, the
current commit has a guarantee that the ith bit is enabled on its bitmap
if and only if the ith commit can reach the current commit.

The first step of the walk is to examine the bitmask on the current
commit and decide which ranges the commit is in or not. Due to the "bit
pushing" in the second step, we have a guarantee that the ith bit of the
current commit's bitmap is on if and only if the ith starting commit can
reach it. For each ahead_behind_count struct, check the base_index and
tip_index to see if those bits are enabled on the current bitmap. If
exactly one bit is enabled, then increment the corresponding 'ahead' or
'behind' count. This increment is the reason we _absolutely need_ to
walk commits at most once.
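
The per-pair check can be sketched in a few lines of Python. This is an illustration, not the C code: the bitmap is modelled as a plain integer, classify() is a hypothetical helper, and the mapping of bits to counts assumes the usual meaning, where "ahead" counts commits reachable from the tip but not the base and "behind" counts the reverse.

```python
def classify(bits, base_index, tip_index):
    """Decide how one walked commit affects one base/tip pair.

    `bits` is the commit's bitmap as an int: bit i is set when the
    ith starting commit can reach this commit.
    """
    from_base = (bits >> base_index) & 1
    from_tip = (bits >> tip_index) & 1
    if from_base == from_tip:
        # Reachable from both or from neither: outside both ranges.
        return None
    return "behind" if from_base else "ahead"
```

A commit whose bitmap has only the base bit set yields "behind", one with only the tip bit set yields "ahead", and one with both or neither contributes nothing. Because each call stands for one increment, visiting a commit twice would count it twice.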

The only subtle thing to do with this walk is to check to see if a
parent has all bits on in its bitmap, in which case it becomes "stale"
and is marked with the STALE bit. This allows queue_has_nonstale() to be
the terminating condition of the walk, which greatly reduces the number
of commits walked if all of the commits are nearby in history. It avoids
walking a large number of common commits when there is a deep history.
We also use the helper method insert_no_dup() to add commits to the
priority queue without adding them multiple times. This uses the PARENT2
flag. Thus, we must clear both the STALE and PARENT2 bits of all
commits, in case ahead_behind() is called multiple times in the same
process.
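
Putting the pieces together, the whole walk might look like the following Python sketch. It is not the implementation in commit-reach.c and makes several assumptions: history is a dict from commit name to parent names, bitmaps are plain integers, generation numbers are topological levels, and the `stale` and `queued` sets stand in for the STALE and PARENT2 flags.

```python
import heapq


def compute_generations(parents):
    """Topological level: 1 + max level of the parents (roots get 1)."""
    gen = {}
    stack = list(parents)
    while stack:
        c = stack[-1]
        if c in gen:
            stack.pop()
            continue
        missing = [p for p in parents[c] if p not in gen]
        if missing:
            stack.extend(missing)
        else:
            gen[c] = 1 + max((gen[p] for p in parents[c]), default=0)
            stack.pop()
    return gen


def ahead_behind(parents, starts, pairs):
    """Return one (ahead, behind) tuple per (base_index, tip_index) pair."""
    gen = compute_generations(parents)
    full = (1 << len(starts)) - 1  # bitmap with every start bit set
    bitmap, queue = {}, []
    queued, stale = set(), set()

    def insert_no_dup(c):
        if c not in queued:
            queued.add(c)
            heapq.heappush(queue, (-gen[c], c))

    # The ith starting commit can reach itself.
    for i, c in enumerate(starts):
        bitmap[c] = bitmap.get(c, 0) | (1 << i)
        insert_no_dup(c)

    counts = [[0, 0] for _ in pairs]
    # Keep walking while some queued commit is not stale.
    while any(c not in stale for _, c in queue):
        _, c = heapq.heappop(queue)  # maximum generation first
        bits = bitmap[c]
        # First step: decide which ranges this commit belongs to.
        for j, (base, tip) in enumerate(pairs):
            from_base = (bits >> base) & 1
            from_tip = (bits >> tip) & 1
            if from_base != from_tip:
                counts[j][1 if from_base else 0] += 1
        # Second step: push this commit's bits down to its parents.
        for p in parents[c]:
            bitmap[p] = bitmap.get(p, 0) | bits
            if bitmap[p] == full:
                stale.add(p)
            insert_no_dup(p)
    return [tuple(c) for c in counts]


# A - B - C - D   (D is the base)
#      \
#       E - F     (F is the tip)
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"],
           "E": ["B"], "F": ["E"]}
result = ahead_behind(history, ["D", "F"], [(0, 1)])
```

In this example the walk stops once B is the only queued commit, because B is reachable from both starting points. B and A are never popped, which is the saving the stale check is meant to provide.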

Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
.github Merge branch 'tb/ci-concurrency' into maint-2.39 2023-02-14 14:15:46 -08:00
Documentation for-each-ref: add --stdin option 2023-03-20 12:17:32 -07:00
block-sha1
builtin for-each-ref: add --stdin option 2023-03-20 12:17:32 -07:00
ci add: remove "add.interactive.useBuiltin" & Perl "git add--interactive" 2023-02-06 15:03:34 -08:00
compat Merge branch 'sk/winansi-createthread-fix' 2023-02-09 14:40:47 -08:00
contrib cocci & cache.h: remove "USE_THE_INDEX_COMPATIBILITY_MACROS" 2023-02-10 11:38:40 -08:00
ewah
git-gui
gitk-git
gitweb
mergetools
negotiator
oss-fuzz
perl
po
refs Merge branch 'ps/fsync-refs-fix' into maint-2.39 2023-02-14 14:15:50 -08:00
reftable
sha1collisiondetection@855827c583
sha1dc
sha256
t commit-graph: return generation from memory 2023-03-20 12:17:33 -07:00
templates
trace2
xdiff
.cirrus.yml
.clang-format
.editorconfig
.gitattributes .gitattributes: include `text` attribute for eol attributes 2023-02-06 13:57:08 -08:00
.gitignore add: remove "add.interactive.useBuiltin" & Perl "git add--interactive" 2023-02-06 15:03:34 -08:00
.gitmodules
.mailmap
.tsan-suppressions
CODE_OF_CONDUCT.md
COPYING
GIT-VERSION-GEN Git 2.40-rc1 2023-03-01 08:13:35 -08:00
INSTALL add: remove "add.interactive.useBuiltin" & Perl "git add--interactive" 2023-02-06 15:03:34 -08:00
LGPL-2.1
Makefile add: remove "add.interactive.useBuiltin" & Perl "git add--interactive" 2023-02-06 15:03:34 -08:00
README.md
RelNotes Prepare for 2.39.3 just in case 2023-02-14 14:15:57 -08:00
SECURITY.md
abspath.c
aclocal.m4
add-interactive.c
add-interactive.h
add-patch.c
advice.c
advice.h
alias.c
alias.h
alloc.c
alloc.h
apply.c Merge branch 'jk/unused-post-2.39' into maint-2.39 2023-02-14 14:15:55 -08:00
apply.h
archive-tar.c
archive-zip.c
archive.c Merge branch 'rs/archive-mtime' 2023-02-27 10:08:57 -08:00
archive.h archive: add --mtime 2023-02-18 09:29:13 -08:00
attr.c Merge branch 'kn/attr-from-tree' 2023-01-23 13:39:51 -08:00
attr.h attr: fix instructions on how to check attrs 2023-01-26 14:16:48 -08:00
banned.h
base85.c
bisect.c
bisect.h
blame.c
blame.h
blob.c
blob.h
bloom.c
bloom.h
branch.c branch: improve advice when --recurse-submodules fails 2023-01-18 15:13:21 -08:00
branch.h
builtin.h
bulk-checkin.c
bulk-checkin.h
bundle-uri.c Merge branch 'ds/bundle-uri-5' 2023-02-15 17:11:52 -08:00
bundle-uri.h clone: set fetch.bundleURI if appropriate 2023-01-31 08:57:48 -08:00
bundle.c Merge branch 'ab/various-leak-fixes' 2023-02-22 14:55:45 -08:00
bundle.h
cache-tree.c Merge branch 'rs/cache-tree-strbuf-growth-fix' 2023-02-22 14:55:44 -08:00
cache-tree.h cache-tree API: remove redundant update_main_cache_tree() 2023-02-10 11:38:14 -08:00
cache.h Merge branch 'ab/the-index-compatibility' 2023-02-22 14:55:44 -08:00
cbtree.c
cbtree.h
chdir-notify.c
chdir-notify.h
check-builtins.sh
checkout.c
checkout.h
chunk-format.c
chunk-format.h
color.c
color.h
column.c
column.h
combine-diff.c
command-list.txt
commit-graph.c commit-graph: introduce `ensure_generations_valid()` 2023-03-20 12:17:33 -07:00
commit-graph.h commit-graph: introduce `ensure_generations_valid()` 2023-03-20 12:17:33 -07:00
commit-reach.c commit-reach: implement ahead_behind() logic 2023-03-20 12:17:33 -07:00
commit-reach.h commit-reach: implement ahead_behind() logic 2023-03-20 12:17:33 -07:00
commit-slab-decl.h
commit-slab-impl.h
commit-slab.h
commit.c Merge branch 'rs/clear-commit-marks-cleanup' into maint-2.39 2023-02-14 14:15:56 -08:00
commit.h add API: remove run_add_interactive() wrapper function 2023-02-06 15:03:34 -08:00
common-main.c
config.c
config.h config.h: remove unused git_configset_add_parameters() 2023-02-07 10:50:27 -08:00
config.mak.dev
config.mak.in
config.mak.uname Merge branch 'hj/remove-msys-support' 2023-02-09 14:40:47 -08:00
configure.ac
connect.c
connect.h
connected.c
connected.h
convert.c
convert.h
copy.c
credential.c credential: new attribute password_expiry_utc 2023-02-22 15:18:58 -08:00
credential.h credential: new attribute password_expiry_utc 2023-02-22 15:18:58 -08:00
csum-file.c
csum-file.h
ctype.c
daemon.c
date.c
date.h
decorate.c
decorate.h
delta-islands.c delta-islands: fix segfault when freeing island marks 2023-02-21 09:15:04 -08:00
delta-islands.h delta-islands: free island_marks and bitmaps 2023-02-03 18:01:46 -08:00
delta.h
detect-compiler
diagnose.c
diagnose.h
diff-delta.c
diff-lib.c
diff-merges.c
diff-merges.h
diff-no-index.c
diff.c Merge branch 'jc/diff-algo-attribute' 2023-02-27 10:08:56 -08:00
diff.h Merge branch 'jc/diff-algo-attribute' 2023-02-27 10:08:56 -08:00
diffcore-break.c
diffcore-delta.c
diffcore-order.c
diffcore-pickaxe.c
diffcore-rename.c
diffcore-rotate.c
diffcore.h
dir-iterator.c dir-iterator: drop unused `DIR_ITERATOR_FOLLOW_SYMLINKS` 2023-02-16 16:21:56 -08:00
dir-iterator.h dir-iterator: drop unused `DIR_ITERATOR_FOLLOW_SYMLINKS` 2023-02-16 16:21:56 -08:00
dir.c Merge branch 'ws/single-file-cone' 2023-01-16 12:07:47 -08:00
dir.h
editor.c
entry.c
entry.h
environment.c
environment.h
exec-cmd.c
exec-cmd.h
fetch-negotiator.c
fetch-negotiator.h
fetch-pack.c
fetch-pack.h
fmt-merge-msg.c
fmt-merge-msg.h
fsck.c fsck: do not assume NUL-termination of buffers 2023-01-19 15:39:43 -08:00
fsck.h fsck: provide a function to fsck buffer without object struct 2023-01-18 12:59:44 -08:00
fsmonitor--daemon.h
fsmonitor-ipc.c
fsmonitor-ipc.h
fsmonitor-path-utils.h
fsmonitor-settings.c treewide: always have a valid "index_state.repo" member 2023-01-17 14:32:06 -08:00
fsmonitor-settings.h
fsmonitor.c treewide: always have a valid "index_state.repo" member 2023-01-17 14:32:06 -08:00
fsmonitor.h
generate-cmdlist.sh
generate-configlist.sh
generate-hooklist.sh
gettext.c
gettext.h
git-archimport.perl
git-compat-util.h Merge branch 'rs/use-enhanced-bre-on-macos' into maint-2.39 2023-02-14 14:15:49 -08:00
git-curl-compat.h http: support CURLOPT_PROTOCOLS_STR 2023-02-06 09:27:09 +01:00
git-cvsexportcommit.perl
git-cvsimport.perl
git-cvsserver.perl
git-difftool--helper.sh
git-filter-branch.sh
git-instaweb.sh
git-merge-octopus.sh
git-merge-one-file.sh
git-merge-resolve.sh
git-mergetool--lib.sh
git-mergetool.sh
git-p4.py
git-quiltimport.sh
git-request-pull.sh request-pull: filter out SSH/X.509 tag signatures 2023-01-25 15:54:41 -08:00
git-send-email.perl
git-sh-i18n.sh
git-sh-setup.sh
git-submodule.sh
git-svn.perl
git-web--browse.sh
git.c trace.c, git.c: remove unnecessary parameter to trace_repo_setup() 2023-02-21 12:06:32 -08:00
git.rc
gpg-interface.c Merge branch 'js/gpg-errors' 2023-02-24 11:32:29 -08:00
gpg-interface.h
graph.c
graph.h
grep.c Merge branch 'ab/various-leak-fixes' 2023-02-22 14:55:45 -08:00
grep.h
hash-lookup.c
hash-lookup.h
hash.h
hashmap.c
hashmap.h
help.c
help.h
hex.c
hook.c hook API: support passing stdin to hooks, convert am's 'post-rewrite' 2023-02-08 12:50:03 -08:00
hook.h hook API: support passing stdin to hooks, convert am's 'post-rewrite' 2023-02-08 12:50:03 -08:00
http-backend.c http-backend.c: fix cmd_main() memory leak, refactor reg{exec,free}() 2023-02-06 15:34:38 -08:00
http-fetch.c
http-push.c Sync with 2.36.5 2023-02-06 09:38:31 +01:00
http-walker.c
http.c Sync with 2.38.4 2023-02-06 09:43:39 +01:00
http.h Sync with 2.37.6 2023-02-06 09:43:28 +01:00
ident.c
imap-send.c
iterator.h
json-writer.c
json-writer.h
khash.h
kwset.c
kwset.h
levenshtein.c
levenshtein.h
line-log.c
line-log.h
line-range.c
line-range.h
linear-assignment.c
linear-assignment.h
list-objects-filter-options.c
list-objects-filter-options.h
list-objects-filter.c Merge branch 'jk/unused-post-2.39' into maint-2.39 2023-02-14 14:15:55 -08:00
list-objects-filter.h
list-objects.c
list-objects.h
list.h
ll-merge.c
ll-merge.h
lockfile.c
lockfile.h
log-tree.c
log-tree.h
ls-refs.c
ls-refs.h
mailinfo.c
mailinfo.h
mailmap.c
mailmap.h
match-trees.c
mem-pool.c
mem-pool.h
merge-blobs.c
merge-blobs.h
merge-ort-wrappers.c
merge-ort-wrappers.h
merge-ort.c
merge-ort.h
merge-recursive.c treewide: always have a valid "index_state.repo" member 2023-01-17 14:32:06 -08:00
merge-recursive.h
merge.c
mergesort.h
midx.c
midx.h
name-hash.c
notes-cache.c
notes-cache.h
notes-merge.c
notes-merge.h
notes-utils.c
notes-utils.h
notes.c
notes.h
object-file.c Merge branch 'jk/hash-object-fsck' 2023-01-30 14:24:22 -08:00
object-name.c
object-store.h
object.c
object.h
oid-array.c
oid-array.h
oidmap.c
oidmap.h
oidset.c
oidset.h
oidtree.c
oidtree.h
pack-bitmap-write.c
pack-bitmap.c
pack-bitmap.h
pack-check.c
pack-mtimes.c
pack-mtimes.h
pack-objects.c
pack-objects.h
pack-revindex.c
pack-revindex.h
pack-write.c
pack.h
packfile.c
packfile.h
pager.c
parallel-checkout.c
parallel-checkout.h
parse-options-cb.c
parse-options.c
parse-options.h
patch-delta.c
patch-ids.c
patch-ids.h
path.c
path.h
pathspec.c docs & comments: replace mentions of "git-add--interactive.perl" 2023-02-06 15:03:34 -08:00
pathspec.h
pkt-line.c
pkt-line.h
preload-index.c
pretty.c
pretty.h
prio-queue.c
prio-queue.h
progress.c
progress.h
promisor-remote.c
promisor-remote.h
prompt.c
prompt.h
protocol-caps.c
protocol-caps.h
protocol.c
protocol.h
prune-packed.c
prune-packed.h
quote.c
quote.h
range-diff.c range-diff: avoid compiler warning when char is unsigned 2023-02-28 14:43:05 -08:00
range-diff.h
reachable.c
reachable.h
read-cache.c Merge branch 'rs/size-t-fixes' 2023-02-15 17:11:53 -08:00
rebase-interactive.c
rebase-interactive.h
rebase.c
rebase.h
ref-filter.c
ref-filter.h
reflog-walk.c
reflog-walk.h
reflog.c Merge branch 'rs/reflog-expiry-cleanup' into maint-2.39 2023-02-14 14:15:56 -08:00
reflog.h
refs.c Merge branch 'jk/shorten-unambiguous-ref-wo-sscanf' 2023-02-27 10:08:57 -08:00
refs.h
refspec.c
refspec.h
remote-curl.c Sync with 2.37.6 2023-02-06 09:43:28 +01:00
remote.c
remote.h
replace-object.c
replace-object.h
repo-settings.c
repository.c treewide: always have a valid "index_state.repo" member 2023-01-17 14:32:06 -08:00
repository.h
rerere.c
rerere.h
reset.c
reset.h
resolve-undo.c
resolve-undo.h
revision.c treewide: always have a valid "index_state.repo" member 2023-01-17 14:32:06 -08:00
revision.h
run-command.c run-command: allow stdin for run_processes_parallel 2023-02-08 12:50:03 -08:00
run-command.h
scalar.c scalar: only warn when background maintenance fails 2023-01-27 12:38:26 -08:00
send-pack.c
send-pack.h
sequencer.c Merge branch 'pw/rebase-i-parse-fix' 2023-02-28 16:38:47 -08:00
sequencer.h sequencer API users: fix get_replay_opts() leaks 2023-02-06 16:03:52 -08:00
serve.c
serve.h
server-info.c
setup.c
sh-i18n--envsubst.c
sha1dc_git.c
sha1dc_git.h
shallow.c
shallow.h
shared.mak
shell.c
shortlog.h
sideband.c
sideband.h
sigchain.c
sigchain.h
simple-ipc.h
sparse-index.c treewide: always have a valid "index_state.repo" member 2023-01-17 14:32:06 -08:00
sparse-index.h
split-index.c treewide: always have a valid "index_state.repo" member 2023-01-17 14:32:06 -08:00
split-index.h
stable-qsort.c
strbuf.c
strbuf.h
streaming.c
streaming.h
string-list.c
string-list.h
strmap.c
strmap.h
strvec.c
strvec.h
sub-process.c
sub-process.h
submodule-config.c
submodule-config.h
submodule.c
submodule.h
symlinks.c
tag.c
tag.h
tar.h
tempfile.c
tempfile.h
thread-utils.c
thread-utils.h
tmp-objdir.c
tmp-objdir.h
trace.c trace.c, git.c: remove unnecessary parameter to trace_repo_setup() 2023-02-21 12:06:32 -08:00
trace.h trace.c, git.c: remove unnecessary parameter to trace_repo_setup() 2023-02-21 12:06:32 -08:00
trace2.c
trace2.h
trailer.c
trailer.h
transport-helper.c
transport-internal.h
transport.c
transport.h
tree-diff.c
tree-walk.c
tree-walk.h
tree.c
tree.h
unicode-width.h
unimplemented.sh
unix-socket.c
unix-socket.h
unix-stream-server.c
unix-stream-server.h
unpack-trees.c treewide: always have a valid "index_state.repo" member 2023-01-17 14:32:06 -08:00
unpack-trees.h
upload-pack.c
upload-pack.h
url.c
url.h
urlmatch.c
urlmatch.h
usage.c
userdiff.c Merge branch 'jc/diff-algo-attribute' 2023-02-27 10:08:56 -08:00
userdiff.h diff: teach diff to read algorithm from diff driver 2023-02-21 09:29:10 -08:00
utf8.c
utf8.h
varint.c
varint.h
version.c
version.h
versioncmp.c
walker.c
walker.h
wildmatch.c
wildmatch.h
worktree.c
worktree.h
wrap-for-bin.sh
wrapper.c
write-or-die.c
ws.c Merge branch 'kn/attr-from-tree' 2023-01-23 13:39:51 -08:00
wt-status.c
wt-status.h
xdiff-interface.c
xdiff-interface.h
zlib.c

README.md

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission and Documentation/CodingGuidelines).

Those wishing to help with error message, usage and informational message string translations (localization, l10n) should see po/README.md (a po file is a Portable Object file that holds the translations).

To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org (not the Git list). The mailing list archives are available at https://lore.kernel.org/git/, http://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks