Taylor Blau 0fabafd0b9 builtin/repack.c: add '--geometric' option
Often it is useful to both:

  - have relatively few packfiles in a repository, and

  - avoid having so few packfiles in a repository that we repack its
    entire contents regularly

This patch implements a '--geometric=<n>' option in 'git repack'. This
allows the caller to specify that they would like each pack to be at
least a factor times as large as the previous largest pack (by object
count).

Concretely, say that a repository has 'n' packfiles, labeled P1, P2,
..., up to Pn. Each packfile Pi has an object count equal to 'objects(Pi)'.
With a geometric factor of 'r', it should be that:

  objects(Pi) >= r*objects(P(i-1))

for all i in [2, n], where the packs are sorted by

  objects(P1) <= objects(P2) <= ... <= objects(Pn).
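
As an illustration only (this is not the code in builtin/repack.c), the
condition above can be checked with a small helper. The name
'is_geometric' and the plain array of object counts are assumptions made
for this sketch, and "at least a factor times as large" is read as '>='.
The input is assumed to be sorted in ascending order already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if every pack holds at least 'factor'
 * times as many objects as the pack before it, and 0 otherwise.
 * 'counts' must be sorted ascending; overflow of factor * count is
 * ignored to keep the sketch short.
 */
int is_geometric(const uint64_t *counts, size_t n, uint64_t factor)
{
	size_t i;

	assert(counts != NULL || n == 0);

	/* Compare each pack with its smaller neighbour. */
	for (i = 1; i < n; i++) {
		if (counts[i] < factor * counts[i - 1])
			return 0;
	}
	return 1;
}
```

With a factor of 2, the counts 1, 2, 4, 32 pass this check, while
1, 1, 1, 2, 4, 32 do not, because the second pack is not at least twice
as large as the first.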

Since finding a true optimal repacking is NP-hard, we approximate it
along two directions:

  1. We assume that there is a cutoff of packs _before starting the
     repack_ where everything to the right of that cut-off already forms
     a geometric progression (or no cutoff exists and everything must be
     repacked).

  2. We assume that everything smaller than the cutoff count must be
     repacked. This forms our base assumption, but it can also cause
     even the "heavy" packs to get repacked. For example, if we have 6
     packs containing the following number of objects:

       1, 1, 1, 2, 4, 32

     then we would place the cutoff between '1, 1' and '1, 2, 4, 32',
     rolling up the first two packs into a pack with 2 objects. That
     breaks our progression and leaves us:

       2, 1, 2, 4, 32
         ^

     (where the '^' indicates the position of our split). To restore a
     progression, we move the split forward (towards larger packs)
     joining each pack into our new pack until a geometric progression
     is restored. Here, that looks like:

       2, 1, 2, 4, 32  ~>  3, 2, 4, 32  ~>  5, 4, 32  ~> ... ~> 9, 32
         ^                   ^                ^                   ^
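
The two steps above can be sketched roughly as follows. This is a
simplified model that follows the description in this message, not the
actual implementation: the function name 'geometric_split', its
signature, and the use of a bare array of object counts are all
assumptions for illustration. Loose objects are left out, and '>=' is
used as the progression test, which is what the worked example implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: 'counts' holds object counts sorted ascending.
 * Returns how many of the smallest packs get rolled up into one new
 * pack, and stores that new pack's object count in '*rolled_up'.
 * Overflow of factor * count is ignored to keep the sketch short.
 */
size_t geometric_split(const uint64_t *counts, size_t n,
		       uint64_t factor, uint64_t *rolled_up)
{
	size_t split = 0;
	uint64_t total = 0;
	size_t i;

	assert(counts != NULL || n == 0);
	assert(rolled_up != NULL);

	/*
	 * Step 1: walk down from the largest pack and stop at the first
	 * pair that breaks the progression. Everything from 'split'
	 * upwards already forms a geometric progression.
	 */
	for (i = n; i > 1; i--) {
		if (counts[i - 1] < factor * counts[i - 2]) {
			split = i - 1;
			break;
		}
	}

	/* Everything below the cutoff goes into the new pack. */
	for (i = 0; i < split; i++)
		total += counts[i];

	/*
	 * Step 2: the new pack may itself break the progression, so keep
	 * absorbing the next pack until the progression is restored.
	 */
	while (split < n && counts[split] < factor * total) {
		total += counts[split];
		split++;
	}

	*rolled_up = total;
	return split;
}
```

For the counts 1, 1, 1, 2, 4, 32 and a factor of 2, this rolls up the
five smallest packs into one pack of 9 objects and leaves the pack of
32 objects alone, matching the sequence shown above.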

This has the advantage of not repacking the heavy side of packs too
often while also only creating one new pack at a time. Another wrinkle
is that we assume that loose, indexed, and reflog'd objects are
insignificant, and lump them into any new pack that we create. This can
lead to non-idempotent results.

Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
.github Merge branch 'tb/ci-run-cocci-with-18.04' into maint 2021-02-11 13:57:36 -08:00
Documentation builtin/repack.c: add '--geometric' option 2021-02-22 23:30:52 -08:00
block-sha1
builtin builtin/repack.c: add '--geometric' option 2021-02-22 23:30:52 -08:00
ci Merge branch 'tb/pack-revindex-on-disk' 2021-02-12 14:21:04 -08:00
compat MacOS: precompose_argv_prefix() 2021-02-03 14:09:37 -08:00
contrib Merge branch 'jk/complete-branch-force-delete' 2021-02-12 14:21:04 -08:00
ewah
git-gui
gitk-git
gitweb
mergetools
negotiator
perl
po
ppc
refs
sha1collisiondetection@855827c583
sha1dc
sha256
t builtin/repack.c: add '--geometric' option 2021-02-22 23:30:52 -08:00
templates
trace2
vcs-svn
xdiff
.cirrus.yml
.clang-format
.editorconfig
.gitattributes
.gitignore
.gitmodules
.mailmap
.travis.yml
.tsan-suppressions
CODE_OF_CONDUCT.md
COPYING
GIT-VERSION-GEN Git 2.30.1 2021-02-08 14:05:55 -08:00
INSTALL
LGPL-2.1
Makefile Merge branch 'ab/grep-pcre-invalid-utf8' 2021-02-10 14:48:33 -08:00
README.md
RelNotes Prepare for 2.30.1 2021-02-05 16:31:28 -08:00
abspath.c
aclocal.m4
add-interactive.c
add-interactive.h
add-patch.c
advice.c
advice.h
alias.c
alias.h
alloc.c
alloc.h
apply.c
apply.h
archive-tar.c
archive-zip.c
archive.c
archive.h
attr.c
attr.h
banned.h
base85.c
bisect.c
bisect.h
blame.c
blame.h
blob.c
blob.h
bloom.c
bloom.h
branch.c
branch.h
builtin.h
bulk-checkin.c
bulk-checkin.h
bundle.c
bundle.h
cache-tree.c
cache-tree.h
cache.h Merge branch 'ds/more-index-cleanups' 2021-02-10 14:48:33 -08:00
chdir-notify.c
chdir-notify.h
check-builtins.sh
check_bindir
checkout.c
checkout.h
color.c
color.h
column.c
column.h
combine-diff.c
command-list.txt
commit-graph.c Merge branch 'js/commit-graph-warning' 2021-02-17 17:21:42 -08:00
commit-graph.h
commit-reach.c
commit-reach.h
commit-slab-decl.h
commit-slab-impl.h
commit-slab.h
commit.c Merge branch 'ak/corrected-commit-date' 2021-02-17 17:21:40 -08:00
commit.h Merge branch 'ds/commit-graph-genno-fix' 2021-02-17 17:21:40 -08:00
common-main.c
config.c Merge branch 'ak/config-bad-bool-error' 2021-02-17 17:21:43 -08:00
config.h
config.mak.dev
config.mak.in
config.mak.uname
configure.ac
connect.c Merge branch 'jt/clone-unborn-head' 2021-02-17 17:21:40 -08:00
connect.h
connected.c
connected.h
convert.c
convert.h
copy.c
credential.c
credential.h
csum-file.c
csum-file.h
ctype.c
daemon.c
date.c
decorate.c
decorate.h
delta-islands.c
delta-islands.h
delta.h
detect-compiler
diff-delta.c
diff-lib.c
diff-merges.c
diff-merges.h
diff-no-index.c
diff.c
diff.h
diffcore-break.c
diffcore-delta.c
diffcore-order.c
diffcore-pickaxe.c
diffcore-rename.c
diffcore.h
dir-iterator.c
dir-iterator.h
dir.c
dir.h
editor.c
entry.c
environment.c
environment.h
exec-cmd.c
exec-cmd.h
fetch-negotiator.c
fetch-negotiator.h
fetch-pack.c
fetch-pack.h
fmt-merge-msg.c Merge branch 'so/log-diff-merge' 2021-02-05 16:40:44 -08:00
fmt-merge-msg.h
fsck.c Merge branch 'js/fsck-name-objects-fix' 2021-02-17 17:21:42 -08:00
fsck.h
fsmonitor.c
fsmonitor.h
fuzz-commit-graph.c
fuzz-pack-headers.c
fuzz-pack-idx.c
generate-cmdlist.sh
generate-configlist.sh
gettext.c Merge branch 'ab/detox-gettext-tests' 2021-02-10 14:48:33 -08:00
gettext.h
git-add--interactive.perl
git-archimport.perl
git-bisect.sh bisect--helper: reimplement `bisect_skip` shell function in C 2021-02-03 14:52:09 -08:00
git-compat-util.h Merge branch 'tb/precompose-prefix-too' 2021-02-12 14:21:04 -08:00
git-cvsexportcommit.perl
git-cvsimport.perl
git-cvsserver.perl
git-difftool--helper.sh mergetool: break setup_tool out into separate initialization function 2021-02-09 14:09:16 -08:00
git-filter-branch.sh
git-instaweb.sh
git-merge-octopus.sh
git-merge-one-file.sh
git-merge-resolve.sh
git-mergetool--lib.sh Merge branch 'sh/mergetool-hideresolved' 2021-02-17 17:21:41 -08:00
git-mergetool.sh mergetool: add per-tool support and overrides for the hideResolved flag 2021-02-09 14:09:16 -08:00
git-p4.py Merge branch 'dl/p4-encode-after-kw-expansion' into maint 2021-02-08 14:05:54 -08:00
git-quiltimport.sh
git-rebase--preserve-merges.sh
git-request-pull.sh
git-send-email.perl
git-sh-i18n.sh
git-sh-setup.sh
git-submodule.sh
git-svn.perl
git-web--browse.sh
git.c Merge branch 'tb/precompose-prefix-too' 2021-02-12 14:21:04 -08:00
git.rc
gpg-interface.c
gpg-interface.h
graph.c
graph.h
grep.c Merge branch 'ab/grep-pcre-invalid-utf8' 2021-02-10 14:48:33 -08:00
grep.h Merge branch 'ab/grep-pcre-invalid-utf8' 2021-02-10 14:48:33 -08:00
hash-lookup.c
hash-lookup.h
hash.h
hashmap.c
hashmap.h
help.c
help.h
hex.c
http-backend.c
http-fetch.c
http-push.c
http-walker.c
http.c
http.h
ident.c
imap-send.c
iterator.h
json-writer.c
json-writer.h
khash.h
kwset.c
kwset.h
levenshtein.c
levenshtein.h
line-log.c
line-log.h
line-range.c
line-range.h
linear-assignment.c
linear-assignment.h
list-objects-filter-options.c
list-objects-filter-options.h
list-objects-filter.c
list-objects-filter.h
list-objects.c
list-objects.h
list.h
ll-merge.c
ll-merge.h
lockfile.c
lockfile.h
log-tree.c Merge branch 'js/range-diff-one-side-only' 2021-02-17 17:21:41 -08:00
log-tree.h
ls-refs.c Merge branch 'jt/clone-unborn-head' 2021-02-17 17:21:40 -08:00
ls-refs.h ls-refs: report unborn targets of symrefs 2021-02-05 13:49:53 -08:00
mailinfo.c
mailinfo.h
mailmap.c mailmap: only look for .mailmap in work tree 2021-02-10 13:34:51 -08:00
mailmap.h
match-trees.c
mem-pool.c
mem-pool.h
merge-blobs.c
merge-blobs.h
merge-ort-wrappers.c
merge-ort-wrappers.h
merge-ort.c Merge branch 'en/ort-directory-rename' 2021-02-11 13:58:43 -08:00
merge-ort.h
merge-recursive.c
merge-recursive.h
merge.c
mergesort.c
mergesort.h
midx.c
midx.h
name-hash.c
notes-cache.c
notes-cache.h
notes-merge.c
notes-merge.h
notes-utils.c
notes-utils.h
notes.c
notes.h
object-file.c
object-name.c
object-store.h packfile: add kept-pack cache for find_kept_pack_entry() 2021-02-22 23:30:52 -08:00
object.c
object.h
oid-array.c
oid-array.h
oidmap.c
oidmap.h
oidset.c
oidset.h
pack-bitmap-write.c
pack-bitmap.c
pack-bitmap.h
pack-check.c
pack-objects.c
pack-objects.h
pack-revindex.c
pack-revindex.h
pack-write.c
pack.h
packfile.c packfile: add kept-pack cache for find_kept_pack_entry() 2021-02-22 23:30:52 -08:00
packfile.h packfile: introduce 'find_kept_pack_entry()' 2021-02-22 23:30:52 -08:00
pager.c
parse-options-cb.c
parse-options.c MacOS: precompose_argv_prefix() 2021-02-03 14:09:37 -08:00
parse-options.h
patch-delta.c
patch-ids.c Merge branch 'jk/log-cherry-pick-duplicate-patches' into maint 2021-02-05 16:31:28 -08:00
patch-ids.h
path.c
path.h
pathspec.c
pathspec.h
pkt-line.c
pkt-line.h
preload-index.c
pretty.c
pretty.h
prio-queue.c
prio-queue.h
progress.c
progress.h
promisor-remote.c
promisor-remote.h
prompt.c
prompt.h
protocol.c
protocol.h
prune-packed.c
prune-packed.h
quote.c
quote.h
range-diff.c Merge branch 'js/range-diff-one-side-only' 2021-02-17 17:21:41 -08:00
range-diff.h Merge branch 'js/range-diff-one-side-only' 2021-02-17 17:21:41 -08:00
reachable.c
reachable.h
read-cache.c
rebase-interactive.c
rebase-interactive.h
rebase.c
rebase.h
ref-filter.c Merge branch 'tb/ls-refs-optim' 2021-02-05 16:40:45 -08:00
ref-filter.h
reflog-walk.c
reflog-walk.h
refs.c Merge branch 'tb/ls-refs-optim' 2021-02-05 16:40:45 -08:00
refs.h Merge branch 'tb/ls-refs-optim' 2021-02-05 16:40:45 -08:00
refspec.c
refspec.h
remote-curl.c
remote.c
remote.h Merge branch 'jt/clone-unborn-head' 2021-02-17 17:21:40 -08:00
replace-object.c
replace-object.h
repo-settings.c
repository.c
repository.h
rerere.c
rerere.h
reset.c
reset.h
resolve-undo.c
resolve-undo.h
revision.c revision: learn '--no-kept-objects' 2021-02-22 23:30:52 -08:00
revision.h revision: learn '--no-kept-objects' 2021-02-22 23:30:52 -08:00
run-command.c
run-command.h
send-pack.c
send-pack.h
sequencer.c Merge branch 'ds/more-index-cleanups' 2021-02-10 14:48:33 -08:00
sequencer.h
serve.c ls-refs: report unborn targets of symrefs 2021-02-05 13:49:53 -08:00
serve.h
server-info.c
setup.c
sh-i18n--envsubst.c
sha1dc_git.c
sha1dc_git.h
shallow.c
shallow.h
shell.c
shortlog.h
sideband.c
sideband.h
sigchain.c
sigchain.h
split-index.c
split-index.h
stable-qsort.c
strbuf.c
strbuf.h
streaming.c
streaming.h
string-list.c
string-list.h
strmap.c
strmap.h
strvec.c
strvec.h
sub-process.c
sub-process.h
submodule-config.c
submodule-config.h
submodule.c
submodule.h
symlinks.c
tag.c
tag.h
tar.h
tempfile.c
tempfile.h
thread-utils.c
thread-utils.h
tmp-objdir.c
tmp-objdir.h
trace.c
trace.h
trace2.c
trace2.h
trailer.c
trailer.h
transport-helper.c connect, transport: encapsulate arg in struct 2021-02-05 13:49:54 -08:00
transport-internal.h connect, transport: encapsulate arg in struct 2021-02-05 13:49:54 -08:00
transport.c connect, transport: encapsulate arg in struct 2021-02-05 13:49:54 -08:00
transport.h clone: respect remote unborn HEAD 2021-02-05 13:49:55 -08:00
tree-diff.c
tree-walk.c
tree-walk.h
tree.c
tree.h
unicode-width.h
unimplemented.sh
unix-socket.c
unix-socket.h
unpack-trees.c
unpack-trees.h
upload-pack.c Merge branch 'ak/corrected-commit-date' 2021-02-17 17:21:40 -08:00
upload-pack.h
url.c
url.h
urlmatch.c
urlmatch.h
usage.c usage: trace2 BUG() invocations 2021-02-09 14:14:34 -08:00
userdiff.c
userdiff.h
utf8.c
utf8.h
varint.c
varint.h
version.c
version.h
versioncmp.c
walker.c
walker.h
wildmatch.c
wildmatch.h
worktree.c
worktree.h
wrap-for-bin.sh
wrapper.c
write-or-die.c
ws.c
wt-status.c
wt-status.h
xdiff-interface.c
xdiff-interface.h
zlib.c

README.md

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org. The mailing list archives are available at https://lore.kernel.org/git/, http://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks