builtin/gc: provide hint when maintenance hits a stale schedule lock

When running scheduled maintenance via `git maintenance start`, we
acquire a lockfile to ensure that no other scheduled maintenance task is
running in the repository concurrently. If the lock is already held, we
print an error telling the user that another process seems to be running
in this repo.

There are two important cases in which such a lockfile may exist:

  - An actual git-maintenance(1) process is still running in this
    repository.

  - An earlier process may have crashed or been interrupted partway
    through, leaving a stale lockfile behind.

In c95547a394 (builtin/gc: fix crash when running `git maintenance
start`, 2024-10-10), we have fixed an issue where git-maintenance(1)
would crash with the "start" subcommand, and the underlying bug causes
the second scenario to trigger quite often now.

Most users don't know how to get out of that situation, though.
Ideally, we would remove the stale lock for our users automatically.
But in the context of repository maintenance this is rather risky, as
maintenance can easily run for hours or even days, so finding a clear
point where we know that the old process has exited is basically
impossible.

We have the same issue in other subsystems, e.g. when locking refs. Our
lockfile interfaces thus provide the `unable_to_lock_message()` function
for exactly this purpose: it provides a nice hint to the user that
explains what is going on and how to get out of that situation again by
manually removing the file.

Adapt git-maintenance(1) to print a similar hint. While we could use the
above function, we can provide a bit more context as we know exactly
what kind of process would create the lockfile.
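
As a rough illustration of the mechanism described above, the following
standalone sketch creates a "<path>.lock" file exclusively and prints a
recovery hint when the file already exists. It is not Git's lockfile
implementation: the helper name `acquire_schedule_lock()`, the buffer
size and the message wording are assumptions made for this example, and
it relies only on POSIX open(2) with O_CREAT | O_EXCL.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of Git: try to create "<path>.lock"
 * exclusively. Returns an open file descriptor on success. On failure,
 * prints a message to stderr, returns -1 and leaves errno set to the
 * reason the lock could not be taken.
 */
static int acquire_schedule_lock(const char *path)
{
	char lock_path[4096];
	int fd, len, saved_errno;

	assert(path != NULL);

	len = snprintf(lock_path, sizeof(lock_path), "%s.lock", path);
	if (len < 0 || (size_t)len >= sizeof(lock_path)) {
		errno = ENAMETOOLONG;
		return -1;
	}

	/* O_EXCL makes creation fail with EEXIST if the file already exists. */
	fd = open(lock_path, O_WRONLY | O_CREAT | O_EXCL, 0666);
	if (fd >= 0)
		return fd;

	/* fprintf() may clobber errno, so remember why open() failed. */
	saved_errno = errno;
	if (saved_errno == EEXIST)
		fprintf(stderr,
			"error: unable to create '%s': %s.\n\n"
			"Another process seems to hold this lock. If none is running,\n"
			"an earlier process may have crashed: remove the file manually.\n",
			lock_path, strerror(saved_errno));
	else
		fprintf(stderr, "error: cannot acquire lock: %s\n",
			strerror(saved_errno));
	errno = saved_errno;
	return -1;
}
```

A caller would keep the returned descriptor open while it works and
unlink the ".lock" file when done. A process that dies before the unlink
leaves exactly the kind of stale file the hint tells the user to remove.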

Reported-by: Miguel Rincon Barahona <mrincon@gitlab.com>
Reported-by: Kev Kloss <kkloss@gitlab.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Patrick Steinhardt 2024-11-19 11:48:43 +01:00 committed by Junio C Hamano
parent 777489f9e0
commit 656ca9204a
2 changed files with 18 additions and 1 deletions

@@ -2887,8 +2887,17 @@ static int update_background_schedule(const struct maintenance_start_opts *opts,
 	char *lock_path = xstrfmt("%s/schedule", the_repository->objects->odb->path);

 	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0) {
+		if (errno == EEXIST)
+			error(_("unable to create '%s.lock': %s.\n\n"
+			      "Another scheduled git-maintenance(1) process seems to be running in this\n"
+			      "repository. Please make sure no other maintenance processes are running and\n"
+			      "then try again. If it still fails, a git-maintenance(1) process may have\n"
+			      "crashed in this repository earlier: remove the file manually to continue."),
+			      absolute_path(lock_path), strerror(errno));
+		else
+			error_errno(_("cannot acquire lock for scheduled background maintenance"));
 		free(lock_path);
-		return error(_("another process is scheduling background maintenance"));
+		return -1;
 	}

 	for (i = 1; i < ARRAY_SIZE(scheduler_fn); i++) {

@@ -995,4 +995,12 @@ test_expect_success 'repacking loose objects is quiet' '
 	)
 '

+test_expect_success 'maintenance aborts with existing lock file' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	: >repo/.git/objects/schedule.lock &&
+	test_must_fail git -C repo maintenance start 2>err &&
+	test_grep "Another scheduled git-maintenance(1) process seems to be running" err
+'
+
 test_done