git-repack(1)
=============

NAME
----
git-repack - Pack unpacked objects in a repository


SYNOPSIS
--------
[verse]
'git repack' [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [--window=<n>] [--depth=<n>] [--threads=<n>] [--keep-pack=<pack-name>]

DESCRIPTION
-----------

This command is used to combine all objects that do not currently
reside in a "pack", into a pack. It can also be used to re-organize
existing packs into a single, more efficient pack.

A pack is a collection of objects, individually compressed, with
delta compression applied, stored in a single file, with an
associated index file.

Packs are used to reduce the load on mirror systems, backup
engines, disk storage, etc.

OPTIONS
-------

-a::
Instead of incrementally packing the unpacked objects,
pack everything referenced into a single pack.
Especially useful when packing a repository that is used
for private development. Use with `-d`. This will clean up
the objects that `git prune` leaves behind, but
`git fsck --full --dangling` shows as dangling.
+
Note that users fetching over dumb protocols will have to fetch the
whole new pack in order to get any contained object, no matter how many
other objects in that pack they already have locally.
+
Promisor packfiles are repacked separately: if there are packfiles that
have an associated ".promisor" file, these packfiles will be repacked
into another separate pack, and an empty ".promisor" file corresponding
to the new separate pack will be written.

-A::
Same as `-a`, unless `-d` is used. Then any unreachable
objects in a previous pack become loose, unpacked objects,
instead of being left in the old pack. Unreachable objects
are never intentionally added to a pack, even when repacking.
This option prevents unreachable objects from being immediately
deleted by way of being left in the old pack and then
removed. Instead, the loose unreachable objects
will be pruned according to normal expiry rules
with the next 'git gc' invocation. See linkgit:git-gc[1].
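+
As a minimal sketch of the loosening behavior described above, the
following script builds a throwaway repository, gets one blob into a
pack by temporarily pointing a tag at it, and then runs
'git repack -A -d' once the tag is gone. The temporary directory, the
tag name `scratch` and the file contents are assumptions made purely
for illustration.
+
```shell
set -e
repo=$(mktemp -d)                 # throwaway repository for the demo
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example User"

echo "tracked content" > file.txt
git add file.txt
git commit -q -m "initial commit"

# Write a blob and make it reachable through a temporary tag.
blob=$(echo "soon unreachable" | git hash-object -w --stdin)
git tag scratch "$blob"

# Full repack: the blob now lives only inside the pack.
git repack -a -d -q

# Drop the tag; nothing references the blob any more.
git tag -d scratch >/dev/null

# -A with -d: the old pack is removed, and the unreachable blob is
# written back out as a loose object instead of being dropped.
git repack -A -d -q

loose_path=".git/objects/$(echo "$blob" | cut -c1-2)/$(echo "$blob" | cut -c3-)"
if [ -f "$loose_path" ]; then
    echo "unreachable blob survived as a loose object"
fi
```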

-d::
After packing, if the newly created packs make some
existing packs redundant, remove the redundant packs.
Also run 'git prune-packed' to remove redundant
loose object files.
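+
The combination of `-a` and `-d` can be tried out in a scratch
repository. The sketch below assumes a temporary directory and three
made-up commits; it reads the number of loose objects and packs from
'git count-objects -v' before and after the repack.
+
```shell
set -e
repo=$(mktemp -d)                 # throwaway repository for the demo
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example User"

# Each commit leaves a new blob, tree and commit as loose objects.
for i in 1 2 3; do
    echo "revision $i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

loose_before=$(git count-objects -v | awk '/^count:/ { print $2 }')

# Pack everything reachable into one pack, then remove the loose
# copies and any packs that the new pack made redundant.
git repack -a -d -q

loose_after=$(git count-objects -v | awk '/^count:/ { print $2 }')
packs=$(git count-objects -v | awk '/^packs:/ { print $2 }')
echo "loose objects: $loose_before -> $loose_after, packs: $packs"
```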

-l::
Pass the `--local` option to 'git pack-objects'. See
linkgit:git-pack-objects[1].

-f::
Pass the `--no-reuse-delta` option to 'git pack-objects'. See
linkgit:git-pack-objects[1].

-F::
Pass the `--no-reuse-object` option to 'git pack-objects'. See
linkgit:git-pack-objects[1].

-q::
Pass the `-q` option to 'git pack-objects'. See
linkgit:git-pack-objects[1].

-n::
Do not update the server information with
'git update-server-info'. This option skips
updating local catalog files needed to publish
this repository (or a direct copy of it)
over HTTP or FTP. See linkgit:git-update-server-info[1].

--window=<n>::
--depth=<n>::
These two options affect how the objects contained in the pack are
stored using delta compression. The objects are first internally
sorted by type, size and optionally names and compared against the
other objects within `--window` to see if using delta compression saves
space. `--depth` limits the maximum delta depth; making it too deep
affects the performance on the unpacker side, because delta data needs
to be applied that many times to get to the necessary object.
+
The default value for `--window` is 10 and for `--depth` is 50. The maximum
depth is 4095.
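+
As an illustration only, the sketch below repacks a scratch repository
with a wider window and a shallower depth than the defaults. The values
50 and 20 are arbitrary assumptions, not recommendations. `-f` is added
so that existing deltas are recomputed rather than reused, and
'git verify-pack -v' is used to print the resulting delta chain
lengths, if any.
+
```shell
set -e
repo=$(mktemp -d)                 # throwaway repository for the demo
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example User"

# Several revisions of a growing file give the delta search
# something to work with.
for i in 1 2 3 4 5; do
    seq 1 200 > data.txt
    echo "revision $i" >> data.txt
    git add data.txt
    git commit -q -m "commit $i"
done

# Search more candidates per object (window) but keep delta chains
# shorter (depth) than the defaults of 10 and 50.
git repack -a -d -f -q --window=50 --depth=20

packs=$(git count-objects -v | awk '/^packs:/ { print $2 }')

# Show the delta chain histogram; there may be no chains at all.
git verify-pack -v .git/objects/pack/pack-*.idx | grep 'chain length' || true
```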

--threads=<n>::
This option is passed through to `git pack-objects`.

--window-memory=<n>::
This option provides an additional limit on top of `--window`;
the window size will dynamically scale down so as to not take
up more than '<n>' bytes in memory. This is useful in
repositories with a mix of large and small objects to not run
out of memory with a large window, but still be able to take
advantage of the large window for the smaller objects. The
size can be suffixed with "k", "m", or "g".
`--window-memory=0` makes memory usage unlimited. The default
is taken from the `pack.windowMemory` configuration variable.
Note that the actual memory usage will be the limit multiplied
by the number of threads used by linkgit:git-pack-objects[1].

--max-pack-size=<n>::
Maximum size of each output pack file. The size can be suffixed with
"k", "m", or "g". The minimum size allowed is limited to 1 MiB.
If specified, multiple packfiles may be created, which also
prevents the creation of a bitmap index.
The default is unlimited, unless the config variable
`pack.packSizeLimit` is set.

-b::
--write-bitmap-index::
Write a reachability bitmap index as part of the repack. This
only makes sense when used with `-a` or `-A`, as the bitmaps
must be able to refer to all reachable objects. This option
overrides the setting of `repack.writeBitmaps`. This option
has no effect if multiple packfiles are created.
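+
A minimal sketch of writing a bitmap during a full repack follows. The
scratch repository and its commits are assumptions for the demo; the
script only checks that a `.bitmap` file appears next to the single
resulting pack.
+
```shell
set -e
repo=$(mktemp -d)                 # throwaway repository for the demo
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example User"

for i in 1 2 3; do
    echo "revision $i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# -a puts every reachable object into one pack, which is what the
# bitmap needs; -b asks for the bitmap index to be written with it.
git repack -a -d -b -q

packs=$(git count-objects -v | awk '/^packs:/ { print $2 }')
bitmaps=$(find .git/objects/pack -name '*.bitmap' | wc -l | tr -d ' ')
echo "packs: $packs, bitmap files: $bitmaps"
```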

--pack-kept-objects::
Include objects in `.keep` files when repacking. Note that we
still do not delete `.keep` packs after `pack-objects` finishes.
This means that we may duplicate objects, but this makes the
option safe to use when there are concurrent pushes or fetches.
This option is generally only useful if you are writing bitmaps
with `-b` or `repack.writeBitmaps`, as it ensures that the
bitmapped packfile has the necessary objects.

--keep-pack=<pack-name>::
Exclude the given pack from repacking. This is the equivalent
of having a `.keep` file on the pack. `<pack-name>` is the
pack file name without leading directory (e.g. `pack-123.pack`).
The option can be specified multiple times to keep multiple
packs.
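+
The following sketch, which assumes a scratch repository with two
made-up commits, shows the effect: the pack created by the first
repack is named with `--keep-pack` on the second repack, so it is left
in place and only the newer objects go into a second pack.
+
```shell
set -e
repo=$(mktemp -d)                 # throwaway repository for the demo
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example User"

echo "one" > a.txt
git add a.txt
git commit -q -m "first commit"
git repack -a -d -q

# File name of the only pack so far, without the leading directory.
kept=$(basename .git/objects/pack/pack-*.pack)

echo "two" > b.txt
git add b.txt
git commit -q -m "second commit"

# Repack everything except the pack named above.
git repack -a -d -q --keep-pack="$kept"

packs=$(git count-objects -v | awk '/^packs:/ { print $2 }')
echo "kept pack: $kept, packs now: $packs"
```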

--unpack-unreachable=<when>::
When loosening unreachable objects, do not bother loosening any
objects older than `<when>`. This can be used to optimize out
the write of any objects that would be immediately pruned by
a follow-up `git prune`.

-k::
--keep-unreachable::
When used with `-ad`, any unreachable objects from existing
packs will be appended to the end of the packfile instead of
being removed. In addition, any unreachable loose objects will
be packed (and their loose counterparts removed).

-i::
--delta-islands::
Pass the `--delta-islands` option to 'git pack-objects'. See
linkgit:git-pack-objects[1].

CONFIGURATION
-------------

Various configuration variables affect packing, see
linkgit:git-config[1] (search for "pack" and "delta").

By default, the command passes the `--delta-base-offset` option to
'git pack-objects'; this typically results in slightly smaller packs,
but the generated packs are incompatible with versions of Git older than
version 1.4.4. If you need to share your repository with such ancient Git
versions, either directly or via the dumb HTTP protocol, then you
need to set the configuration variable `repack.useDeltaBaseOffset` to
"false" and repack. Access from old Git versions over the native protocol
is unaffected by this option as the conversion is performed on the fly
as needed in that case.
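
A minimal sketch of that procedure, assuming a scratch repository
created only for the demonstration, is to set the variable and then
run a full repack so that the existing pack is rewritten:

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository for the demo
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example User"

echo "content" > file.txt
git add file.txt
git commit -q -m "initial commit"

# Stop passing --delta-base-offset to pack-objects for this repository.
git config repack.useDeltaBaseOffset false

# Rewrite all objects into a fresh pack under the new setting.
git repack -a -d -q

setting=$(git config --get repack.useDeltaBaseOffset)
packs=$(git count-objects -v | awk '/^packs:/ { print $2 }')
echo "repack.useDeltaBaseOffset=$setting, packs: $packs"
```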

Delta compression is not used on objects larger than the value of the
`core.bigFileThreshold` configuration variable, or on files with the
attribute `delta` set to false.
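
Both knobs can be set per repository. In the sketch below, the `*.bin`
pattern, the path `firmware.bin` and the 100m threshold are
hypothetical values chosen for illustration; 'git check-attr' is used
to confirm how the attribute resolves for a path.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository for the demo
cd "$repo"
git init -q .

# Mark files matching *.bin so that no delta compression is attempted.
echo '*.bin -delta' > .gitattributes

# Ask Git how the attribute resolves; the file itself need not exist.
attr=$(git check-attr delta -- firmware.bin)
echo "$attr"                      # prints "firmware.bin: delta: unset"

# Skip delta compression for any object larger than 100 MiB.
git config core.bigFileThreshold 100m
threshold=$(git config --get core.bigFileThreshold)
echo "core.bigFileThreshold=$threshold"
```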

SEE ALSO
--------
linkgit:git-pack-objects[1]
linkgit:git-prune-packed[1]

GIT
---
Part of the linkgit:git[1] suite