builtin/receive-pack: add option to skip connectivity check

During git-receive-pack(1), connectivity of the object graph is
validated to ensure that the received packfile does not leave the
repository in a broken state. This is done via git-rev-list(1) and
walking the objects, which can be expensive for large repositories.

Generally, this check is critical to prevent an incomplete received
packfile from corrupting a repository. Server operators, however, may
have additional knowledge about exactly how Git is used on the server
side, which can be used to compute the connectivity of incoming objects
more efficiently.

For example, if it can be ensured that all objects in a repository are
connected and do not depend on any missing objects, the connectivity of
newly written objects can be checked by walking the object graph
containing only the new objects from the updated tips and identifying
the missing objects which represent the boundary between the new objects
and the repository. These boundary objects can be checked in the
canonical repository to ensure the new objects connect as expected and
thus avoid walking the rest of the object graph.
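
The boundary-based check described above can be illustrated with a
minimal, hypothetical C sketch. It is not part of this patch and is not
how Git implements anything: `toy_object`, `count_missing_boundaries()`
and the small integer object ids are invented for illustration, and the
`in_repo` array stands in for an existence lookup in the canonical
repository, which is assumed to be fully connected already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJECTS 16
#define MAX_REFS 4

/* Hypothetical toy object, identified by its index in an array. */
struct toy_object {
	bool is_new;        /* arrived with the incoming pack */
	int refs[MAX_REFS]; /* ids of referenced objects, -1 terminated */
};

/*
 * Walk only the new objects reachable from `tip`. A referenced object
 * that is not new is a boundary object: it is not walked any further,
 * it only has to exist in the repository (modeled by `in_repo`).
 * Returns the number of boundary objects missing from the repository.
 */
static int count_missing_boundaries(const struct toy_object *objs,
				    const bool *in_repo, int tip)
{
	int stack[MAX_OBJECTS], top = 0, missing = 0;
	bool seen[MAX_OBJECTS] = { false };

	assert(tip >= 0 && tip < MAX_OBJECTS);
	stack[top++] = tip;
	seen[tip] = true;

	while (top > 0) {
		int id = stack[--top];

		if (!objs[id].is_new) {
			/* Boundary object: check existence and stop here. */
			if (!in_repo[id])
				missing++;
			continue;
		}

		for (size_t i = 0; i < MAX_REFS && objs[id].refs[i] >= 0; i++) {
			int ref = objs[id].refs[i];

			assert(ref < MAX_OBJECTS);
			/* Each object is pushed at most once. */
			if (!seen[ref]) {
				seen[ref] = true;
				stack[top++] = ref;
			}
		}
	}

	return missing;
}
```

For a chain 2 -> 1 -> 0 in which objects 1 and 2 are new and object 0
already exists in the repository, walking from tip 2 reports no missing
boundary objects. If object 0 is absent, the walk reports one. The rest
of the repository's object graph is never visited in either case.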

Git itself cannot make the guarantees required for such an optimization
as it is possible for a repository to contain an unreachable object that
references a missing object without the repository being considered
corrupt.

Introduce the --skip-connectivity-check option for git-receive-pack(1),
which bypasses this connectivity check and gives the server side more
control. Note that without proper validation of newly received objects
performed outside of Git, using this option risks corrupting a
repository.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Justin Tobler 2025-05-20 11:32:18 -05:00 committed by Junio C Hamano
parent 95262afe78
commit 68cb0b5253
3 changed files with 58 additions and 20 deletions

@@ -46,6 +46,18 @@ OPTIONS
 	`$GIT_URL/info/refs?service=git-receive-pack` requests. See
 	`--http-backend-info-refs` in linkgit:git-upload-pack[1].
 
+--skip-connectivity-check::
+	Bypasses the connectivity checks that validate the existence of all
+	objects in the transitive closure of reachable objects. This option is
+	intended for server operators that want to implement their own object
+	connectivity validation outside of Git. This is useful in such cases
+	where the server-side knows additional information about how Git is
+	being used and thus can rely on certain guarantees to more efficiently
+	compute object connectivity that Git itself cannot make. Usage of this
+	option without a reliable external mechanism to ensure full reachable
+	object connectivity risks corrupting the repository and should not be
+	used in the general case.
+
 PRE-RECEIVE HOOK
 ----------------
 Before any ref is updated, if $GIT_DIR/hooks/pre-receive file exists

@@ -81,6 +81,7 @@ static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
 static int auto_gc = 1;
 static int reject_thin;
+static int skip_connectivity_check;
 static int stateless_rpc;
 static const char *service_dir;
 static const char *head_name;
@@ -1938,28 +1939,30 @@ static void execute_commands(struct command *commands,
 		return;
 	}
 
-	if (use_sideband) {
-		memset(&muxer, 0, sizeof(muxer));
-		muxer.proc = copy_to_sideband;
-		muxer.in = -1;
-		if (!start_async(&muxer))
-			err_fd = muxer.in;
-		/* ...else, continue without relaying sideband */
-	}
-
-	data.cmds = commands;
-	data.si = si;
-	opt.err_fd = err_fd;
-	opt.progress = err_fd && !quiet;
-	opt.env = tmp_objdir_env(tmp_objdir);
-	opt.exclude_hidden_refs_section = "receive";
-
-	if (check_connected(iterate_receive_command_list, &data, &opt))
-		set_connectivity_errors(commands, si);
-
-	if (use_sideband)
-		finish_async(&muxer);
+	if (!skip_connectivity_check) {
+		if (use_sideband) {
+			memset(&muxer, 0, sizeof(muxer));
+			muxer.proc = copy_to_sideband;
+			muxer.in = -1;
+			if (!start_async(&muxer))
+				err_fd = muxer.in;
+			/* ...else, continue without relaying sideband */
+		}
+
+		data.cmds = commands;
+		data.si = si;
+		opt.err_fd = err_fd;
+		opt.progress = err_fd && !quiet;
+		opt.env = tmp_objdir_env(tmp_objdir);
+		opt.exclude_hidden_refs_section = "receive";
+
+		if (check_connected(iterate_receive_command_list, &data, &opt))
+			set_connectivity_errors(commands, si);
+
+		if (use_sideband)
+			finish_async(&muxer);
+	}
 
 	reject_updates_to_hidden(commands);
 
 	/*
@@ -2519,6 +2522,7 @@ int cmd_receive_pack(int argc,
 
 	struct option options[] = {
 		OPT__QUIET(&quiet, N_("quiet")),
+		OPT_HIDDEN_BOOL(0, "skip-connectivity-check", &skip_connectivity_check, NULL),
 		OPT_HIDDEN_BOOL(0, "stateless-rpc", &stateless_rpc, NULL),
 		OPT_HIDDEN_BOOL(0, "http-backend-info-refs", &advertise_refs, NULL),
 		OPT_ALIAS(0, "advertise-refs", "http-backend-info-refs"),

@@ -62,4 +62,26 @@ test_expect_success 'receive-pack missing objects fails connectivity check' '
 	test_must_fail git -C remote.git cat-file -e $(git -C repo rev-parse HEAD)
 '
 
+test_expect_success 'receive-pack missing objects bypasses connectivity check' '
+	test_when_finished rm -rf repo remote.git setup.git &&
+
+	git init repo &&
+	git -C repo commit --allow-empty -m 1 &&
+	git clone --bare repo setup.git &&
+	git -C repo commit --allow-empty -m 2 &&
+
+	# Capture git-send-pack(1) output sent to git-receive-pack(1).
+	git -C repo send-pack ../setup.git --all \
+		--receive-pack="tee ${SQ}$(pwd)/out${SQ} | git-receive-pack" &&
+
+	# Replay captured git-send-pack(1) output on new empty repository.
+	git init --bare remote.git &&
+	git receive-pack --skip-connectivity-check remote.git <out >actual 2>err &&
+
+	test_grep ! "missing necessary objects" actual &&
+	test_must_be_empty err &&
+	git -C remote.git cat-file -e $(git -C repo rev-parse HEAD) &&
+	test_must_fail git -C remote.git rev-list $(git -C repo rev-parse HEAD)
+'
+
 test_done