git-maintenance(1)
==================

NAME
----
git-maintenance - Run tasks to optimize Git repository data


SYNOPSIS
--------
[verse]
'git maintenance' run [<options>]


DESCRIPTION
-----------
Run tasks to optimize Git repository data, speeding up other Git commands
and reducing storage requirements for the repository.

Git commands that add repository data, such as `git add` or `git fetch`,
are optimized for a responsive user experience. These commands do not take
time to optimize the Git data, since such optimizations scale with the full
size of the repository while these user commands each perform a relatively
small action.

The `git maintenance` command provides flexibility for how to optimize the
Git repository.

SUBCOMMANDS
-----------

register::
Initialize Git config values so any scheduled maintenance will
start running on this repository. This adds the repository to the
`maintenance.repo` config variable in the current user's global
config and enables some recommended configuration values for
`maintenance.<task>.schedule`. The tasks that are enabled are safe
for running in the background without disrupting foreground
processes.
+
The `register` subcommand will also set the `maintenance.strategy` config
value to `incremental`, if this value was not previously set. The
`incremental` strategy uses the following schedule for each maintenance
task:
+
--
* `gc`: disabled.
* `commit-graph`: hourly.
* `prefetch`: hourly.
* `loose-objects`: daily.
* `incremental-repack`: daily.
--
+
`git maintenance register` will also disable foreground maintenance by
setting `maintenance.auto = false` in the current repository. This config
setting will remain after a `git maintenance unregister` command.
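+
The config changes described above can be checked by reading the values
back. The following is a minimal sketch, assuming a Git version that
includes `git maintenance`. The throwaway `HOME` and the temporary
repository are illustrative choices, made so that the example does not
touch a real global config.
+
```shell
set -e

# Assumption for illustration: use a throwaway HOME so the user's real
# global config is left alone.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

git maintenance register

# The repository should now be listed in the global config.
registered="$(git config --global --get-all maintenance.repo)"
echo "maintenance.repo: $registered"

# Foreground maintenance should now be disabled for this repository.
auto="$(git config --get maintenance.auto)"
echo "maintenance.auto: $auto"

# Show every maintenance.* value that is visible from this repository.
git config --get-regexp '^maintenance\.'
```
+
Running `git maintenance unregister` afterwards should drop the
`maintenance.repo` entry while leaving `maintenance.auto` in place.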

run::
Run one or more maintenance tasks. If one or more `--task` options
are specified, then those tasks are run in that order. Otherwise,
the tasks are determined by which `maintenance.<task>.enabled`
config options are true. By default, only `maintenance.gc.enabled`
is true.
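+
Both ways of selecting tasks can be tried in a scratch repository. This
is a sketch under stated assumptions: the temporary repository, the
file name, and the user identity are placeholders, and the path of the
`commit-graph-chain` file is the location current Git versions are
expected to use.
+
```shell
set -e

repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "hello" > file.txt
git add file.txt
git commit --quiet -m "initial commit"

# Explicit selection: the tasks run in the order given on the command line.
git maintenance run --quiet --task=commit-graph --task=loose-objects

# Config-driven selection: with no --task option, the tasks whose
# maintenance.<task>.enabled value is true are run.
git config maintenance.gc.enabled false
git config maintenance.commit-graph.enabled true
git maintenance run --quiet

# The commit-graph task is expected to leave a chain file behind.
chain="$repo/.git/objects/info/commit-graphs/commit-graph-chain"
ls -l "$chain"
```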

start::
Start running maintenance on the current repository. This performs
the same config updates as the `register` subcommand, then updates
the background scheduler to run `git maintenance run --schedule`
on an hourly basis.

stop::
Halt the background maintenance schedule. The current repository
is not removed from the list of maintained repositories, in case
the background maintenance is restarted later.

unregister::
Remove the current repository from background maintenance. This
only removes the repository from the configured list. It does not
stop the background maintenance processes from running.

TASKS
-----

commit-graph::
The `commit-graph` job updates the `commit-graph` files incrementally,
then verifies that the written data is correct. The incremental
write is safe to run alongside concurrent Git processes since it
will not expire `.graph` files that were in the previous
`commit-graph-chain` file. They will be deleted by a later run based
on the expiration delay.

prefetch::
The `prefetch` task updates the object directory with the latest
objects from all registered remotes. For each remote, a `git fetch`
command is run. The refmap is custom to avoid updating local or remote
branches (those in `refs/heads` or `refs/remotes`). Instead, the
remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
not updated.
+
This is done to avoid disrupting the remote-tracking branches. End users
expect these refs to stay unmoved unless they initiate a fetch. With the
`prefetch` task, however, the objects necessary to complete a later real
fetch would already be obtained, so the real fetch would go faster. In the
ideal case, it will just become an update to a bunch of remote-tracking
branches without any object transfer.
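+
The custom refmap can be approximated by hand with an explicit refspec.
The sketch below is not the exact command the task runs. It assumes a
remote named `origin` and a branch named `main` (both illustrative), and
uses two temporary repositories to show that the remote-tracking branch
stays put while the new objects arrive under `refs/prefetch/`.
+
```shell
set -e

upstream="$(mktemp -d)"
work="$(mktemp -d)/clone"

git init --quiet -b main "$upstream"
git -C "$upstream" config user.name "Example User"        # placeholder identity
git -C "$upstream" config user.email "user@example.com"   # placeholder identity
git -C "$upstream" commit --quiet --allow-empty -m "first"

git clone --quiet "$upstream" "$work"

# The upstream moves ahead after the clone was made.
git -C "$upstream" commit --quiet --allow-empty -m "second"

# Fetch into refs/prefetch/origin/ only. The empty --refmap= makes Git
# ignore the configured refspecs, so refs/remotes/origin/* is untouched.
git -C "$work" fetch --quiet --no-tags --prune --refmap= origin \
    "+refs/heads/*:refs/prefetch/origin/*"

tracking="$(git -C "$work" rev-parse refs/remotes/origin/main)"
prefetched="$(git -C "$work" rev-parse refs/prefetch/origin/main)"
upstream_tip="$(git -C "$upstream" rev-parse refs/heads/main)"

echo "remote-tracking: $tracking"
echo "prefetched:      $prefetched"
echo "upstream tip:    $upstream_tip"
```
+
A later `git fetch origin` in the clone would then only need to move
`refs/remotes/origin/main`, since the objects are already present.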

gc::
Clean up unnecessary files and optimize the local repository. "GC"
stands for "garbage collection," but this task performs many
smaller tasks. This task can be expensive for large repositories,
as it repacks all Git objects into a single pack-file. It can also
be disruptive in some situations, as it deletes stale data. See
linkgit:git-gc[1] for more details on garbage collection in Git.

loose-objects::
The `loose-objects` job cleans up loose objects and places them into
pack-files. In order to prevent race conditions with concurrent Git
commands, it follows a two-step process. First, it deletes any loose
objects that already exist in a pack-file; concurrent Git processes
will examine the pack-file for the object data instead of the loose
object. Second, it creates a new pack-file (starting with "loose-")
containing a batch of loose objects. The batch size is limited to 50
thousand objects to prevent the job from taking too long on a
repository with many loose objects. The `gc` task writes unreachable
objects as loose objects to be cleaned up by a later step only if
they are not re-added to a pack-file; for this reason it is not
advisable to enable both the `loose-objects` and `gc` tasks at the
same time.
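+
Because deletion happens before packing, a loose object is only removed
on the run after the one that packed it. The sketch below illustrates
this by running the task twice in a scratch repository. The repository,
file name, and user identity are placeholders.
+
```shell
set -e

repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "hello" > file.txt
git add file.txt
git commit --quiet -m "initial commit"

# First run: nothing is packed yet, so step one has nothing to delete.
# Step two writes the loose objects into a "loose-" pack-file.
git maintenance run --quiet --task=loose-objects
ls "$repo"/.git/objects/pack/loose-*.pack
git count-objects -v

# Second run: step one now finds those objects in the pack-file and
# deletes the loose copies.
git maintenance run --quiet --task=loose-objects
loose_after="$(git count-objects -v | sed -n 's/^count: //p')"
echo "loose objects after second run: $loose_after"
```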

incremental-repack::
The `incremental-repack` job repacks the object directory
using the `multi-pack-index` feature. In order to prevent race
conditions with concurrent Git commands, it follows a two-step
process. First, it calls `git multi-pack-index expire` to delete
pack-files unreferenced by the `multi-pack-index` file. Second, it
calls `git multi-pack-index repack` to select several small
pack-files and repack them into a bigger one, and then update the
`multi-pack-index` entries that refer to the small pack-files to
refer to the new pack-file. This prepares those small pack-files
for deletion upon the next run of `git multi-pack-index expire`.
The selection of the small pack-files is such that the expected
size of the big pack-file is at least the batch size; see the
`--batch-size` option for the `repack` subcommand in
linkgit:git-multi-pack-index[1]. The default batch-size is zero,
which is a special case that attempts to repack all pack-files
into a single pack-file.
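+
The two steps can also be run by hand. This is a rough sketch rather
than the exact sequence the task performs. It assumes a scratch
repository with two small pack-files and adds an initial
`git multi-pack-index write` so that a `multi-pack-index` file exists
to work on.
+
```shell
set -e

repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

# Create two small pack-files, one per commit.
for n in 1 2; do
    echo "version $n" > file.txt
    git add file.txt
    git commit --quiet -m "commit $n"
    git repack -q -d
done

# Setup (an assumption of this sketch): index the existing pack-files.
git multi-pack-index write

# Step one: delete pack-files that the multi-pack-index does not reference.
git multi-pack-index expire

# Step two: repack into a bigger pack-file. A batch size of zero asks
# for all pack-files to be combined.
git multi-pack-index repack --batch-size=0
git multi-pack-index verify

# A later expire can delete any pack-files that are no longer referenced.
git multi-pack-index expire

midx="$repo/.git/objects/pack/multi-pack-index"
ls -l "$midx"
```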

OPTIONS
-------
--auto::
When combined with the `run` subcommand, run maintenance tasks
only if certain thresholds are met. For example, the `gc` task
runs when the number of loose objects exceeds the number stored
in the `gc.auto` config setting, or when the number of pack-files
exceeds the `gc.autoPackLimit` config setting. Not compatible with
the `--schedule` option.

--schedule::
When combined with the `run` subcommand, run maintenance tasks
only if certain time conditions are met, as specified by the
`maintenance.<task>.schedule` config value for each `<task>`.
This config value specifies a number of seconds since the last
time that task ran, according to the `maintenance.<task>.lastRun`
config value. The tasks that are tested are those provided by
the `--task=<task>` option(s) or those with
`maintenance.<task>.enabled` set to true.

--quiet::
Do not report progress or other information over `stderr`.

--task=<task>::
If this option is specified one or more times, then only run the
specified tasks in the specified order. If no `--task=<task>`
arguments are specified, then only the tasks with
`maintenance.<task>.enabled` configured as `true` are considered.
See the 'TASKS' section for the list of accepted `<task>` values.


TROUBLESHOOTING
---------------
The `git maintenance` command is designed to simplify the repository
maintenance patterns while minimizing user wait time during Git commands.
A variety of configuration options are available to allow customizing this
process. The default maintenance options focus on operations that complete
quickly, even on large repositories.

Users may find some cases where scheduled maintenance tasks do not run as
frequently as intended. Each `git maintenance run` command takes a lock on
the repository's object database, and this prevents other concurrent
`git maintenance run` commands from running on the same repository. Without
this safeguard, competing processes could leave the repository in an
unpredictable state.

The background maintenance schedule runs `git maintenance run` processes
on an hourly basis. Each run executes the "hourly" tasks. At midnight,
that process also executes the "daily" tasks. At midnight on the first day
of the week, that process also executes the "weekly" tasks. A single
process iterates over each registered repository, performing the scheduled
tasks for that frequency. Depending on the number of registered
repositories and their sizes, this process may take longer than an hour.
In this case, multiple `git maintenance run` commands may run on the same
repository at the same time, colliding on the object database lock. This
results in one of the two tasks not running.

If you find that some maintenance windows are taking longer than one hour
to complete, then consider reducing the complexity of your maintenance
tasks. For example, the `gc` task is much slower than the
`incremental-repack` task. However, this comes at a cost of a slightly
larger object database. Consider moving more expensive tasks to be run
less frequently.

Expert users may consider scheduling their own maintenance tasks using a
different schedule than is available through `git maintenance start` and
Git configuration options. These users should be aware of the object
database lock and how concurrent `git maintenance run` commands behave.
Further, the `git gc` command should not be combined with
`git maintenance run` commands. `git gc` modifies the object database
but does not take the lock in the same way as `git maintenance run`. If
possible, use `git maintenance run --task=gc` instead of `git gc`.


GIT
---
Part of the linkgit:git[1] suite