You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

2619 lines
66 KiB

#include "builtin.h"
#include "abspath.h"
#include "repository.h"
#include "config.h"
#include "environment.h"
#include "gettext.h"
#include "hex.h"
#include "lockfile.h"
#include "pack.h"
#include "refs.h"
#include "pkt-line.h"
#include "sideband.h"
#include "run-command.h"
#include "hook.h"
#include "exec-cmd.h"
#include "commit.h"
#include "object.h"
#include "remote.h"
#include "connect.h"
receive-pack: detect aliased updates which can occur with symrefs When pushing to a remote repo the sending side filters out aliased updates (e.g., foo:baz bar:baz). However, it is not possible for the sender to know if two refs are aliased on the receiving side via symrefs. Here is one such scenario: $ git init origin $ (cd origin && touch file && git add file && git commit -a -m intial) $ git clone --bare origin origin.git $ rm -rf origin $ git clone origin.git client $ git clone --mirror client backup.git && $ (cd backup.git && git remote set-head origin --auto) $ (cd client && git remote add --mirror backup ../backup.git && echo change1 > file && git commit -a -m change1 && git push origin && git push backup ) The push to backup fails with: Counting objects: 5, done. Writing objects: 100% (3/3), 244 bytes, done. Total 3 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (3/3), done. error: Ref refs/remotes/origin/master is at ef3... but expected 262... remote: error: failed to lock refs/remotes/origin/master To ../backup.git 262cd57..ef307ff master -> master 262cd57..ef307ff origin/HEAD -> origin/HEAD ! [remote rejected] origin/master -> origin/master (failed to lock) error: failed to push some refs to '../backup.git' The reason is that refs/remotes/origin/HEAD is a symref to refs/remotes/origin/master, but it is not possible for the sending side to unambiguously know this. This commit fixes the issue by having receive-pack ignore any update to a symref whose target is being identically updated. If a symref and its target are being updated inconsistently, then the update for both fails with an error message ("refusing inconsistent update...") to help diagnose the situation. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
15 years ago
#include "string-list.h"
#include "oid-array.h"
#include "connected.h"
#include "strvec.h"
#include "version.h"
#include "tag.h"
#include "gpg-interface.h"
receive-pack: allow hooks to ignore its standard input stream The pre-receive and post-receive hooks were designed to be an improvement over old style update and post-update hooks, which take the update information on their command line and are limited by the command line length limit. The same information is fed from the standard input to pre/post-receive hooks instead to lift this limitation. It has been mandatory for these new style hooks to consume the update information fully from the standard input stream. Otherwise, they would risk killing the receive-pack process via SIGPIPE. If a hook does not want to look at all the information, it is easy to send its standard input to /dev/null (perhaps a niche use of hook might need to know only the fact that a push was made, without having to know what objects have been pushed to update which refs), and this has already been done by existing hooks that are written carefully. However, because there is no good way to consistently fail hooks that do not consume the input fully (a small push may result in a short update record that may fit within the pipe buffer, to which the receive-pack process may manage to write before the hook has a chance to exit without reading anything, which will not result in a death-by-SIGPIPE of receive-pack), it can lead to a hard to diagnose "once in a blue moon" phantom failure. Lift this "hooks must consume their input fully" mandate. A mandate that is not enforced strictly is not helping us to catch mistakes in hooks. If a hook has a good reason to decide the outcome of its operation without reading the information we feed it, let it do so as it pleases. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
#include "sigchain.h"
#include "fsck.h"
#include "tmp-objdir.h"
#include "oidset.h"
#include "packfile.h"
#include "object-name.h"
#include "object-store.h"
#include "protocol.h"
#include "commit-reach.h"
#include "server-info.h"
#include "trace.h"
#include "trace2.h"
receive.denyCurrentBranch: respect all worktrees The receive.denyCurrentBranch config option controls what happens if you push to a branch that is checked out into a non-bare repository. By default, it rejects it. It can be disabled via `ignore` or `warn`. Another yet trickier option is `updateInstead`. However, this setting was forgotten when the git worktree command was introduced: only the main worktree's current branch is respected. With this change, all worktrees are respected. That change also leads to revealing another bug, i.e. `receive.denyCurrentBranch = true` was ignored when pushing into a non-bare repository's unborn current branch using ref namespaces. As `is_ref_checked_out()` returns 0 which means `receive-pack` does not get into conditional statement to switch `deny_current_branch` accordingly (ignore, warn, refuse, unconfigured, updateInstead). receive.denyCurrentBranch uses the function `refs_resolve_ref_unsafe()` (called via `resolve_refdup()`) to resolve the symbolic ref HEAD, but that function fails when HEAD does not point at a valid commit. As we replace the call to `refs_resolve_ref_unsafe()` with `find_shared_symref()`, which has no problem finding the worktree for a given branch even if it is unborn yet, this bug is fixed at the same time: receive.denyCurrentBranch now also handles worktrees with unborn branches as intended even while using ref namespaces. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
5 years ago
#include "worktree.h"
#include "shallow.h"
#include "wrapper.h"
static const char * const receive_pack_usage[] = {
N_("git receive-pack <git-dir>"),
NULL
};
enum deny_action {
DENY_UNCONFIGURED,
DENY_IGNORE,
DENY_WARN,
DENY_REFUSE,
DENY_UPDATE_INSTEAD
};
static int deny_deletes;
static int deny_non_fast_forwards;
static enum deny_action deny_current_branch = DENY_UNCONFIGURED;
static enum deny_action deny_delete_current = DENY_UNCONFIGURED;
static int receive_fsck_objects = -1;
static int transfer_fsck_objects = -1;
static struct strbuf fsck_msg_types = STRBUF_INIT;
static int receive_unpack_limit = -1;
static int transfer_unpack_limit = -1;
static int advertise_atomic_push = 1;
static int advertise_push_options;
static int advertise_sid;
static int unpack_limit = 100;
static off_t max_input_size;
static int report_status;
static int report_status_v2;
static int use_sideband;
static int use_atomic;
static int use_push_options;
static int quiet;
static int prefer_ofs_delta = 1;
static int auto_update_server_info;
static int auto_gc = 1;
static int reject_thin;
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
static int stateless_rpc;
static const char *service_dir;
static const char *head_name;
static void *head_name_to_free;
static int sent_capabilities;
static int shallow_update;
static const char *alt_shallow_file;
push: the beginning of "git push --signed" While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
static struct strbuf push_cert = STRBUF_INIT;
static struct object_id push_cert_oid;
static struct signature_check sigcheck;
static const char *push_cert_nonce;
static const char *cert_nonce_seed;
static struct string_list hidden_refs = STRING_LIST_INIT_DUP;
static const char *NONCE_UNSOLICITED = "UNSOLICITED";
static const char *NONCE_BAD = "BAD";
static const char *NONCE_MISSING = "MISSING";
static const char *NONCE_OK = "OK";
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
static const char *NONCE_SLOP = "SLOP";
static const char *nonce_status;
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
static long nonce_stamp_slop;
static timestamp_t nonce_stamp_slop_limit;
static struct ref_transaction *transaction;
receive-pack: send keepalives during quiet periods After a client has sent us the complete pack, we may spend some time processing the data and running hooks. If the client asked us to be quiet, receive-pack won't send any progress data during the index-pack or connectivity-check steps. And hooks may or may not produce their own progress output. In these cases, the network connection is totally silent from both ends. Git itself doesn't care about this (it will wait forever), but other parts of the system (e.g., firewalls, load-balancers, etc) might hang up the connection. So we'd like to send some sort of keepalive to let the network and the client side know that we're still alive and processing. We can use the same trick we did in 05e9515 (upload-pack: send keepalive packets during pack computation, 2013-09-08). Namely, we will send an empty sideband data packet every `N` seconds that we do not relay any stderr data over the sideband channel. As with 05e9515, this means that we won't bother sending keepalives when there's actual progress data, but will kick in when it has been disabled (or if there is a lull in the progress data). The concept is simple, but the details are subtle enough that they need discussing here. Before the client sends us the pack, we don't want to do any keepalives. We'll have sent our ref advertisement, and we're waiting for them to send us the pack (and tell us that they support sidebands at all). While we're receiving the pack from the client (or waiting for it to start), there's no need for keepalives; it's up to them to keep the connection active by sending data. Moreover, it would be wrong for us to do so. When we are the server in the smart-http protocol, we must treat our connection as half-duplex. So any keepalives we send while receiving the pack would potentially be buffered by the webserver. Not only does this make them useless (since they would not be delivered in a timely manner), but it could actually cause a deadlock if we fill up the buffer with keepalives. (It wouldn't be wrong to send keepalives in this phase for a full-duplex connection like ssh; it's simply pointless, as it is the client's responsibility to speak). As soon as we've gotten all of the pack data, then the client is waiting for us to speak, and we should start keepalives immediately. From here until the end of the connection, we send one any time we are not otherwise sending data. But there's a catch. Receive-pack doesn't know the moment we've gotten all the data. It passes the descriptor to index-pack, who reads all of the data, and then starts resolving the deltas. We have to communicate that back. To make this work, we instruct the sideband muxer to enable keepalives in three phases: 1. In the beginning, not at all. 2. While reading from index-pack, wait for a signal indicating end-of-input, and then start them. 3. Afterwards, always. The signal from index-pack in phase 2 has to come over the stderr channel which the muxer is reading. We can't use an extra pipe because the portable run-command interface only gives us stderr and stdout. Stdout is already used to pass the .keep filename back to receive-pack. We could also send a signal there, but then we would find out about it in the main thread. And the keepalive needs to be done by the async muxer thread (since it's the one writing sideband data back to the client). And we can't reliably signal the async thread from the main thread, because the async code sometimes uses threads and sometimes uses forked processes. Therefore the signal must come over the stderr channel, where it may be interspersed with other random human-readable messages from index-pack. This patch makes the signal a single NUL byte. This is easy to parse, should not appear in any normal stderr output, and we don't have to worry about any timing issues (like seeing half the signal bytes in one read(), and half in a subsequent one). This is a bit ugly, but it's simple to code and should work reliably. Another option would be to stop using an async thread for muxing entirely, and just poll() both stderr and stdout of index-pack from the main thread. This would work for index-pack (because we aren't doing anything useful in the main thread while it runs anyway). But it would make the connectivity check and the hook muxers much more complicated, as they need to simultaneously feed the sub-programs while reading their stderr. The index-pack phase is the only one that needs this signaling, so it could simply behave differently than the other two. That would mean having two separate implementations of copy_to_sideband (and the keepalive code), though. And it still doesn't get rid of the signaling; it just means we can write a nicer message like "END_OF_INPUT" or something on stdout, since we don't have to worry about separating it from the stderr cruft. One final note: this signaling trick is only done with index-pack, not with unpack-objects. There's no point in doing it for the latter, because by definition it only kicks in for a small number of objects, where keepalives are not as useful (and this conveniently lets us avoid duplicating the implementation). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
static enum {
KEEPALIVE_NEVER = 0,
KEEPALIVE_AFTER_NUL,
KEEPALIVE_ALWAYS
} use_keepalive;
static int keepalive_in_sec = 5;
static struct tmp_objdir *tmp_objdir;
static struct proc_receive_ref {
unsigned int want_add:1,
want_delete:1,
want_modify:1,
negative_ref:1;
char *ref_prefix;
struct proc_receive_ref *next;
} *proc_receive_ref;
static void proc_receive_ref_append(const char *prefix);
static enum deny_action parse_deny_action(const char *var, const char *value)
{
if (value) {
if (!strcasecmp(value, "ignore"))
return DENY_IGNORE;
if (!strcasecmp(value, "warn"))
return DENY_WARN;
if (!strcasecmp(value, "refuse"))
return DENY_REFUSE;
if (!strcasecmp(value, "updateinstead"))
return DENY_UPDATE_INSTEAD;
}
if (git_config_bool(var, value))
return DENY_REFUSE;
return DENY_IGNORE;
}
static int receive_pack_config(const char *var, const char *value, void *cb)
{
int status = parse_hide_refs_config(var, value, "receive", &hidden_refs);
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
if (status)
return status;
if (strcmp(var, "receive.denydeletes") == 0) {
deny_deletes = git_config_bool(var, value);
return 0;
}
if (strcmp(var, "receive.denynonfastforwards") == 0) {
deny_non_fast_forwards = git_config_bool(var, value);
return 0;
}
if (strcmp(var, "receive.unpacklimit") == 0) {
receive_unpack_limit = git_config_int(var, value);
return 0;
}
if (strcmp(var, "transfer.unpacklimit") == 0) {
transfer_unpack_limit = git_config_int(var, value);
return 0;
}
if (strcmp(var, "receive.fsck.skiplist") == 0) {
const char *path;
if (git_config_pathname(&path, var, value))
return 1;
strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
fsck_msg_types.len ? ',' : '=', path);
free((char *)path);
return 0;
}
if (skip_prefix(var, "receive.fsck.", &var)) {
if (is_valid_msg_type(var, value))
strbuf_addf(&fsck_msg_types, "%c%s=%s",
fsck_msg_types.len ? ',' : '=', var, value);
else
warning("skipping unknown msg id '%s'", var);
return 0;
}
if (strcmp(var, "receive.fsckobjects") == 0) {
receive_fsck_objects = git_config_bool(var, value);
return 0;
}
if (strcmp(var, "transfer.fsckobjects") == 0) {
transfer_fsck_objects = git_config_bool(var, value);
return 0;
}
if (!strcmp(var, "receive.denycurrentbranch")) {
deny_current_branch = parse_deny_action(var, value);
return 0;
}
if (strcmp(var, "receive.denydeletecurrent") == 0) {
deny_delete_current = parse_deny_action(var, value);
return 0;
}
if (strcmp(var, "repack.usedeltabaseoffset") == 0) {
prefer_ofs_delta = git_config_bool(var, value);
return 0;
}
if (strcmp(var, "receive.updateserverinfo") == 0) {
auto_update_server_info = git_config_bool(var, value);
return 0;
}
if (strcmp(var, "receive.autogc") == 0) {
auto_gc = git_config_bool(var, value);
return 0;
}
if (strcmp(var, "receive.shallowupdate") == 0) {
shallow_update = git_config_bool(var, value);
return 0;
}
if (strcmp(var, "receive.certnonceseed") == 0)
return git_config_string(&cert_nonce_seed, var, value);
push: the beginning of "git push --signed" While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
if (strcmp(var, "receive.certnonceslop") == 0) {
nonce_stamp_slop_limit = git_config_ulong(var, value);
return 0;
}
if (strcmp(var, "receive.advertiseatomic") == 0) {
advertise_atomic_push = git_config_bool(var, value);
return 0;
}
if (strcmp(var, "receive.advertisepushoptions") == 0) {
advertise_push_options = git_config_bool(var, value);
return 0;
}
receive-pack: send keepalives during quiet periods After a client has sent us the complete pack, we may spend some time processing the data and running hooks. If the client asked us to be quiet, receive-pack won't send any progress data during the index-pack or connectivity-check steps. And hooks may or may not produce their own progress output. In these cases, the network connection is totally silent from both ends. Git itself doesn't care about this (it will wait forever), but other parts of the system (e.g., firewalls, load-balancers, etc) might hang up the connection. So we'd like to send some sort of keepalive to let the network and the client side know that we're still alive and processing. We can use the same trick we did in 05e9515 (upload-pack: send keepalive packets during pack computation, 2013-09-08). Namely, we will send an empty sideband data packet every `N` seconds that we do not relay any stderr data over the sideband channel. As with 05e9515, this means that we won't bother sending keepalives when there's actual progress data, but will kick in when it has been disabled (or if there is a lull in the progress data). The concept is simple, but the details are subtle enough that they need discussing here. Before the client sends us the pack, we don't want to do any keepalives. We'll have sent our ref advertisement, and we're waiting for them to send us the pack (and tell us that they support sidebands at all). While we're receiving the pack from the client (or waiting for it to start), there's no need for keepalives; it's up to them to keep the connection active by sending data. Moreover, it would be wrong for us to do so. When we are the server in the smart-http protocol, we must treat our connection as half-duplex. So any keepalives we send while receiving the pack would potentially be buffered by the webserver. Not only does this make them useless (since they would not be delivered in a timely manner), but it could actually cause a deadlock if we fill up the buffer with keepalives. (It wouldn't be wrong to send keepalives in this phase for a full-duplex connection like ssh; it's simply pointless, as it is the client's responsibility to speak). As soon as we've gotten all of the pack data, then the client is waiting for us to speak, and we should start keepalives immediately. From here until the end of the connection, we send one any time we are not otherwise sending data. But there's a catch. Receive-pack doesn't know the moment we've gotten all the data. It passes the descriptor to index-pack, who reads all of the data, and then starts resolving the deltas. We have to communicate that back. To make this work, we instruct the sideband muxer to enable keepalives in three phases: 1. In the beginning, not at all. 2. While reading from index-pack, wait for a signal indicating end-of-input, and then start them. 3. Afterwards, always. The signal from index-pack in phase 2 has to come over the stderr channel which the muxer is reading. We can't use an extra pipe because the portable run-command interface only gives us stderr and stdout. Stdout is already used to pass the .keep filename back to receive-pack. We could also send a signal there, but then we would find out about it in the main thread. And the keepalive needs to be done by the async muxer thread (since it's the one writing sideband data back to the client). And we can't reliably signal the async thread from the main thread, because the async code sometimes uses threads and sometimes uses forked processes. Therefore the signal must come over the stderr channel, where it may be interspersed with other random human-readable messages from index-pack. This patch makes the signal a single NUL byte. This is easy to parse, should not appear in any normal stderr output, and we don't have to worry about any timing issues (like seeing half the signal bytes in one read(), and half in a subsequent one). This is a bit ugly, but it's simple to code and should work reliably. Another option would be to stop using an async thread for muxing entirely, and just poll() both stderr and stdout of index-pack from the main thread. This would work for index-pack (because we aren't doing anything useful in the main thread while it runs anyway). But it would make the connectivity check and the hook muxers much more complicated, as they need to simultaneously feed the sub-programs while reading their stderr. The index-pack phase is the only one that needs this signaling, so it could simply behave differently than the other two. That would mean having two separate implementations of copy_to_sideband (and the keepalive code), though. And it still doesn't get rid of the signaling; it just means we can write a nicer message like "END_OF_INPUT" or something on stdout, since we don't have to worry about separating it from the stderr cruft. One final note: this signaling trick is only done with index-pack, not with unpack-objects. There's no point in doing it for the latter, because by definition it only kicks in for a small number of objects, where keepalives are not as useful (and this conveniently lets us avoid duplicating the implementation). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
if (strcmp(var, "receive.keepalive") == 0) {
keepalive_in_sec = git_config_int(var, value);
return 0;
}
if (strcmp(var, "receive.maxinputsize") == 0) {
max_input_size = git_config_int64(var, value);
return 0;
}
if (strcmp(var, "receive.procreceiverefs") == 0) {
if (!value)
return config_error_nonbool(var);
proc_receive_ref_append(value);
return 0;
}
if (strcmp(var, "transfer.advertisesid") == 0) {
advertise_sid = git_config_bool(var, value);
return 0;
}
return git_default_config(var, value, cb);
}
static void show_ref(const char *path, const struct object_id *oid)
{
if (sent_capabilities) {
packet_write_fmt(1, "%s %s\n", oid_to_hex(oid), path);
} else {
struct strbuf cap = STRBUF_INIT;
strbuf_addstr(&cap,
"report-status report-status-v2 delete-refs side-band-64k quiet");
if (advertise_atomic_push)
strbuf_addstr(&cap, " atomic");
if (prefer_ofs_delta)
strbuf_addstr(&cap, " ofs-delta");
if (push_cert_nonce)
strbuf_addf(&cap, " push-cert=%s", push_cert_nonce);
if (advertise_push_options)
strbuf_addstr(&cap, " push-options");
if (advertise_sid)
strbuf_addf(&cap, " session-id=%s", trace2_session_id());
strbuf_addf(&cap, " object-format=%s", the_hash_algo->name);
strbuf_addf(&cap, " agent=%s", git_user_agent_sanitized());
packet_write_fmt(1, "%s %s%c%s\n",
oid_to_hex(oid), path, 0, cap.buf);
strbuf_release(&cap);
sent_capabilities = 1;
}
}
static int show_ref_cb(const char *path_full, const struct object_id *oid,
int flag UNUSED, void *data)
{
struct oidset *seen = data;
const char *path = strip_namespace(path_full);
if (ref_is_hidden(path, path_full, &hidden_refs))
return 0;
/*
* Advertise refs outside our current namespace as ".have"
* refs, so that the client can use them to minimize data
* transfer but will otherwise ignore them.
*/
if (!path) {
if (oidset_insert(seen, oid))
return 0;
path = ".have";
} else {
oidset_insert(seen, oid);
}
show_ref(path, oid);
return 0;
}
static void show_one_alternate_ref(const struct object_id *oid,
void *data)
{
struct oidset *seen = data;
if (oidset_insert(seen, oid))
return;
show_ref(".have", oid);
}
static void write_head_info(void)
{
static struct oidset seen = OIDSET_INIT;
for_each_ref(show_ref_cb, &seen);
for_each_alternate_ref(show_one_alternate_ref, &seen);
oidset_clear(&seen);
if (!sent_capabilities)
show_ref("capabilities^{}", null_oid());
make the sender advertise shallow commits to the receiver If either receive-pack or upload-pack is called on a shallow repository, shallow commits (*) will be sent after the ref advertisement (but before the packet flush), so that the receiver has the full "shape" of the sender's commit graph. This will be needed for the receiver to update its .git/shallow if necessary. This breaks the protocol for all clients trying to push to a shallow repo, or fetch from one. Which is basically the same end result as today's "is_repository_shallow() && die()" in receive-pack and upload-pack. New clients will be made aware of shallow upstream and can make use of this information. The sender must send all shallow commits that are sent in the following pack. It may send more shallow commits than necessary. upload-pack for example may choose to advertise no shallow commits if it knows in advance that the pack it's going to send contains no shallow commits. But upload-pack is the server, so we choose the cheaper way, send full .git/shallow and let the client deal with it. Smart HTTP is not affected by this patch. Shallow support on smart-http comes later separately. (*) A shallow commit is a commit that terminates the revision walker. It is usually put in .git/shallow in order to keep the revision walker from going out of bound because there is no guarantee that objects behind this commit is available. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
11 years ago
advertise_shallow_grafts(1);
/* EOF */
packet_flush(1);
}
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
#define RUN_PROC_RECEIVE_SCHEDULED 1
#define RUN_PROC_RECEIVE_RETURNED 2
struct command {
struct command *next;
const char *error_string;
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
struct ref_push_report *report;
unsigned int skip_update:1,
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
did_not_exist:1,
run_proc_receive:2;
int index;
struct object_id old_oid;
struct object_id new_oid;
char ref_name[FLEX_ARRAY]; /* more */
};
static void proc_receive_ref_append(const char *prefix)
{
struct proc_receive_ref *ref_pattern;
char *p;
int len;
CALLOC_ARRAY(ref_pattern, 1);
p = strchr(prefix, ':');
if (p) {
while (prefix < p) {
if (*prefix == 'a')
ref_pattern->want_add = 1;
else if (*prefix == 'd')
ref_pattern->want_delete = 1;
else if (*prefix == 'm')
ref_pattern->want_modify = 1;
else if (*prefix == '!')
ref_pattern->negative_ref = 1;
prefix++;
}
prefix++;
} else {
ref_pattern->want_add = 1;
ref_pattern->want_delete = 1;
ref_pattern->want_modify = 1;
}
len = strlen(prefix);
while (len && prefix[len - 1] == '/')
len--;
ref_pattern->ref_prefix = xmemdupz(prefix, len);
if (!proc_receive_ref) {
proc_receive_ref = ref_pattern;
} else {
struct proc_receive_ref *end;
end = proc_receive_ref;
while (end->next)
end = end->next;
end->next = ref_pattern;
}
}
static int proc_receive_ref_matches(struct command *cmd)
{
struct proc_receive_ref *p;
if (!proc_receive_ref)
return 0;
for (p = proc_receive_ref; p; p = p->next) {
const char *match = p->ref_prefix;
const char *remains;
if (!p->want_add && is_null_oid(&cmd->old_oid))
continue;
else if (!p->want_delete && is_null_oid(&cmd->new_oid))
continue;
else if (!p->want_modify &&
!is_null_oid(&cmd->old_oid) &&
!is_null_oid(&cmd->new_oid))
continue;
if (skip_prefix(cmd->ref_name, match, &remains) &&
(!*remains || *remains == '/')) {
if (!p->negative_ref)
return 1;
} else if (p->negative_ref) {
return 1;
}
}
return 0;
}
static void report_message(const char *prefix, const char *err, va_list params)
{
int sz;
char msg[4096];
sz = xsnprintf(msg, sizeof(msg), "%s", prefix);
sz += vsnprintf(msg + sz, sizeof(msg) - sz, err, params);
if (sz > (sizeof(msg) - 1))
sz = sizeof(msg) - 1;
msg[sz++] = '\n';
if (use_sideband)
send_sideband(1, 2, msg, sz, use_sideband);
else
xwrite(2, msg, sz);
}
__attribute__((format (printf, 1, 2)))
static void rp_warning(const char *err, ...)
{
va_list params;
va_start(params, err);
report_message("warning: ", err, params);
va_end(params);
}
__attribute__((format (printf, 1, 2)))
static void rp_error(const char *err, ...)
{
va_list params;
va_start(params, err);
report_message("error: ", err, params);
va_end(params);
}
static int copy_to_sideband(int in, int out UNUSED, void *arg UNUSED)
{
char data[128];
receive-pack: send keepalives during quiet periods After a client has sent us the complete pack, we may spend some time processing the data and running hooks. If the client asked us to be quiet, receive-pack won't send any progress data during the index-pack or connectivity-check steps. And hooks may or may not produce their own progress output. In these cases, the network connection is totally silent from both ends. Git itself doesn't care about this (it will wait forever), but other parts of the system (e.g., firewalls, load-balancers, etc) might hang up the connection. So we'd like to send some sort of keepalive to let the network and the client side know that we're still alive and processing. We can use the same trick we did in 05e9515 (upload-pack: send keepalive packets during pack computation, 2013-09-08). Namely, we will send an empty sideband data packet every `N` seconds that we do not relay any stderr data over the sideband channel. As with 05e9515, this means that we won't bother sending keepalives when there's actual progress data, but will kick in when it has been disabled (or if there is a lull in the progress data). The concept is simple, but the details are subtle enough that they need discussing here. Before the client sends us the pack, we don't want to do any keepalives. We'll have sent our ref advertisement, and we're waiting for them to send us the pack (and tell us that they support sidebands at all). While we're receiving the pack from the client (or waiting for it to start), there's no need for keepalives; it's up to them to keep the connection active by sending data. Moreover, it would be wrong for us to do so. When we are the server in the smart-http protocol, we must treat our connection as half-duplex. So any keepalives we send while receiving the pack would potentially be buffered by the webserver. Not only does this make them useless (since they would not be delivered in a timely manner), but it could actually cause a deadlock if we fill up the buffer with keepalives. (It wouldn't be wrong to send keepalives in this phase for a full-duplex connection like ssh; it's simply pointless, as it is the client's responsibility to speak). As soon as we've gotten all of the pack data, then the client is waiting for us to speak, and we should start keepalives immediately. From here until the end of the connection, we send one any time we are not otherwise sending data. But there's a catch. Receive-pack doesn't know the moment we've gotten all the data. It passes the descriptor to index-pack, who reads all of the data, and then starts resolving the deltas. We have to communicate that back. To make this work, we instruct the sideband muxer to enable keepalives in three phases: 1. In the beginning, not at all. 2. While reading from index-pack, wait for a signal indicating end-of-input, and then start them. 3. Afterwards, always. The signal from index-pack in phase 2 has to come over the stderr channel which the muxer is reading. We can't use an extra pipe because the portable run-command interface only gives us stderr and stdout. Stdout is already used to pass the .keep filename back to receive-pack. We could also send a signal there, but then we would find out about it in the main thread. And the keepalive needs to be done by the async muxer thread (since it's the one writing sideband data back to the client). And we can't reliably signal the async thread from the main thread, because the async code sometimes uses threads and sometimes uses forked processes. Therefore the signal must come over the stderr channel, where it may be interspersed with other random human-readable messages from index-pack. This patch makes the signal a single NUL byte. This is easy to parse, should not appear in any normal stderr output, and we don't have to worry about any timing issues (like seeing half the signal bytes in one read(), and half in a subsequent one). This is a bit ugly, but it's simple to code and should work reliably. Another option would be to stop using an async thread for muxing entirely, and just poll() both stderr and stdout of index-pack from the main thread. This would work for index-pack (because we aren't doing anything useful in the main thread while it runs anyway). But it would make the connectivity check and the hook muxers much more complicated, as they need to simultaneously feed the sub-programs while reading their stderr. The index-pack phase is the only one that needs this signaling, so it could simply behave differently than the other two. That would mean having two separate implementations of copy_to_sideband (and the keepalive code), though. And it still doesn't get rid of the signaling; it just means we can write a nicer message like "END_OF_INPUT" or something on stdout, since we don't have to worry about separating it from the stderr cruft. One final note: this signaling trick is only done with index-pack, not with unpack-objects. There's no point in doing it for the latter, because by definition it only kicks in for a small number of objects, where keepalives are not as useful (and this conveniently lets us avoid duplicating the implementation). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
int keepalive_active = 0;
if (keepalive_in_sec <= 0)
use_keepalive = KEEPALIVE_NEVER;
if (use_keepalive == KEEPALIVE_ALWAYS)
keepalive_active = 1;
while (1) {
receive-pack: send keepalives during quiet periods After a client has sent us the complete pack, we may spend some time processing the data and running hooks. If the client asked us to be quiet, receive-pack won't send any progress data during the index-pack or connectivity-check steps. And hooks may or may not produce their own progress output. In these cases, the network connection is totally silent from both ends. Git itself doesn't care about this (it will wait forever), but other parts of the system (e.g., firewalls, load-balancers, etc) might hang up the connection. So we'd like to send some sort of keepalive to let the network and the client side know that we're still alive and processing. We can use the same trick we did in 05e9515 (upload-pack: send keepalive packets during pack computation, 2013-09-08). Namely, we will send an empty sideband data packet every `N` seconds that we do not relay any stderr data over the sideband channel. As with 05e9515, this means that we won't bother sending keepalives when there's actual progress data, but will kick in when it has been disabled (or if there is a lull in the progress data). The concept is simple, but the details are subtle enough that they need discussing here. Before the client sends us the pack, we don't want to do any keepalives. We'll have sent our ref advertisement, and we're waiting for them to send us the pack (and tell us that they support sidebands at all). While we're receiving the pack from the client (or waiting for it to start), there's no need for keepalives; it's up to them to keep the connection active by sending data. Moreover, it would be wrong for us to do so. When we are the server in the smart-http protocol, we must treat our connection as half-duplex. So any keepalives we send while receiving the pack would potentially be buffered by the webserver. Not only does this make them useless (since they would not be delivered in a timely manner), but it could actually cause a deadlock if we fill up the buffer with keepalives. (It wouldn't be wrong to send keepalives in this phase for a full-duplex connection like ssh; it's simply pointless, as it is the client's responsibility to speak). As soon as we've gotten all of the pack data, then the client is waiting for us to speak, and we should start keepalives immediately. From here until the end of the connection, we send one any time we are not otherwise sending data. But there's a catch. Receive-pack doesn't know the moment we've gotten all the data. It passes the descriptor to index-pack, who reads all of the data, and then starts resolving the deltas. We have to communicate that back. To make this work, we instruct the sideband muxer to enable keepalives in three phases: 1. In the beginning, not at all. 2. While reading from index-pack, wait for a signal indicating end-of-input, and then start them. 3. Afterwards, always. The signal from index-pack in phase 2 has to come over the stderr channel which the muxer is reading. We can't use an extra pipe because the portable run-command interface only gives us stderr and stdout. Stdout is already used to pass the .keep filename back to receive-pack. We could also send a signal there, but then we would find out about it in the main thread. And the keepalive needs to be done by the async muxer thread (since it's the one writing sideband data back to the client). And we can't reliably signal the async thread from the main thread, because the async code sometimes uses threads and sometimes uses forked processes. Therefore the signal must come over the stderr channel, where it may be interspersed with other random human-readable messages from index-pack. This patch makes the signal a single NUL byte. This is easy to parse, should not appear in any normal stderr output, and we don't have to worry about any timing issues (like seeing half the signal bytes in one read(), and half in a subsequent one). This is a bit ugly, but it's simple to code and should work reliably. Another option would be to stop using an async thread for muxing entirely, and just poll() both stderr and stdout of index-pack from the main thread. This would work for index-pack (because we aren't doing anything useful in the main thread while it runs anyway). But it would make the connectivity check and the hook muxers much more complicated, as they need to simultaneously feed the sub-programs while reading their stderr. The index-pack phase is the only one that needs this signaling, so it could simply behave differently than the other two. That would mean having two separate implementations of copy_to_sideband (and the keepalive code), though. And it still doesn't get rid of the signaling; it just means we can write a nicer message like "END_OF_INPUT" or something on stdout, since we don't have to worry about separating it from the stderr cruft. One final note: this signaling trick is only done with index-pack, not with unpack-objects. There's no point in doing it for the latter, because by definition it only kicks in for a small number of objects, where keepalives are not as useful (and this conveniently lets us avoid duplicating the implementation). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
ssize_t sz;
if (keepalive_active) {
struct pollfd pfd;
int ret;
pfd.fd = in;
pfd.events = POLLIN;
ret = poll(&pfd, 1, 1000 * keepalive_in_sec);
if (ret < 0) {
if (errno == EINTR)
continue;
else
break;
} else if (ret == 0) {
/* no data; send a keepalive packet */
static const char buf[] = "0005\1";
write_or_die(1, buf, sizeof(buf) - 1);
continue;
} /* else there is actual data to read */
}
sz = xread(in, data, sizeof(data));
if (sz <= 0)
break;
receive-pack: send keepalives during quiet periods After a client has sent us the complete pack, we may spend some time processing the data and running hooks. If the client asked us to be quiet, receive-pack won't send any progress data during the index-pack or connectivity-check steps. And hooks may or may not produce their own progress output. In these cases, the network connection is totally silent from both ends. Git itself doesn't care about this (it will wait forever), but other parts of the system (e.g., firewalls, load-balancers, etc) might hang up the connection. So we'd like to send some sort of keepalive to let the network and the client side know that we're still alive and processing. We can use the same trick we did in 05e9515 (upload-pack: send keepalive packets during pack computation, 2013-09-08). Namely, we will send an empty sideband data packet every `N` seconds that we do not relay any stderr data over the sideband channel. As with 05e9515, this means that we won't bother sending keepalives when there's actual progress data, but will kick in when it has been disabled (or if there is a lull in the progress data). The concept is simple, but the details are subtle enough that they need discussing here. Before the client sends us the pack, we don't want to do any keepalives. We'll have sent our ref advertisement, and we're waiting for them to send us the pack (and tell us that they support sidebands at all). While we're receiving the pack from the client (or waiting for it to start), there's no need for keepalives; it's up to them to keep the connection active by sending data. Moreover, it would be wrong for us to do so. When we are the server in the smart-http protocol, we must treat our connection as half-duplex. So any keepalives we send while receiving the pack would potentially be buffered by the webserver. Not only does this make them useless (since they would not be delivered in a timely manner), but it could actually cause a deadlock if we fill up the buffer with keepalives. (It wouldn't be wrong to send keepalives in this phase for a full-duplex connection like ssh; it's simply pointless, as it is the client's responsibility to speak). As soon as we've gotten all of the pack data, then the client is waiting for us to speak, and we should start keepalives immediately. From here until the end of the connection, we send one any time we are not otherwise sending data. But there's a catch. Receive-pack doesn't know the moment we've gotten all the data. It passes the descriptor to index-pack, who reads all of the data, and then starts resolving the deltas. We have to communicate that back. To make this work, we instruct the sideband muxer to enable keepalives in three phases: 1. In the beginning, not at all. 2. While reading from index-pack, wait for a signal indicating end-of-input, and then start them. 3. Afterwards, always. The signal from index-pack in phase 2 has to come over the stderr channel which the muxer is reading. We can't use an extra pipe because the portable run-command interface only gives us stderr and stdout. Stdout is already used to pass the .keep filename back to receive-pack. We could also send a signal there, but then we would find out about it in the main thread. And the keepalive needs to be done by the async muxer thread (since it's the one writing sideband data back to the client). And we can't reliably signal the async thread from the main thread, because the async code sometimes uses threads and sometimes uses forked processes. Therefore the signal must come over the stderr channel, where it may be interspersed with other random human-readable messages from index-pack. This patch makes the signal a single NUL byte. This is easy to parse, should not appear in any normal stderr output, and we don't have to worry about any timing issues (like seeing half the signal bytes in one read(), and half in a subsequent one). This is a bit ugly, but it's simple to code and should work reliably. Another option would be to stop using an async thread for muxing entirely, and just poll() both stderr and stdout of index-pack from the main thread. This would work for index-pack (because we aren't doing anything useful in the main thread while it runs anyway). But it would make the connectivity check and the hook muxers much more complicated, as they need to simultaneously feed the sub-programs while reading their stderr. The index-pack phase is the only one that needs this signaling, so it could simply behave differently than the other two. That would mean having two separate implementations of copy_to_sideband (and the keepalive code), though. And it still doesn't get rid of the signaling; it just means we can write a nicer message like "END_OF_INPUT" or something on stdout, since we don't have to worry about separating it from the stderr cruft. One final note: this signaling trick is only done with index-pack, not with unpack-objects. There's no point in doing it for the latter, because by definition it only kicks in for a small number of objects, where keepalives are not as useful (and this conveniently lets us avoid duplicating the implementation). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
if (use_keepalive == KEEPALIVE_AFTER_NUL && !keepalive_active) {
const char *p = memchr(data, '\0', sz);
if (p) {
/*
* The NUL tells us to start sending keepalives. Make
* sure we send any other data we read along
* with it.
*/
keepalive_active = 1;
send_sideband(1, 2, data, p - data, use_sideband);
send_sideband(1, 2, p + 1, sz - (p - data + 1), use_sideband);
continue;
}
}
/*
* Either we're not looking for a NUL signal, or we didn't see
* it yet; just pass along the data.
*/
send_sideband(1, 2, data, sz, use_sideband);
}
close(in);
return 0;
}
static void hmac_hash(unsigned char *out,
const char *key_in, size_t key_len,
const char *text, size_t text_len)
{
unsigned char key[GIT_MAX_BLKSZ];
unsigned char k_ipad[GIT_MAX_BLKSZ];
unsigned char k_opad[GIT_MAX_BLKSZ];
int i;
git_hash_ctx ctx;
/* RFC 2104 2. (1) */
memset(key, '\0', GIT_MAX_BLKSZ);
if (the_hash_algo->blksz < key_len) {
the_hash_algo->init_fn(&ctx);
the_hash_algo->update_fn(&ctx, key_in, key_len);
the_hash_algo->final_fn(key, &ctx);
} else {
memcpy(key, key_in, key_len);
}
/* RFC 2104 2. (2) & (5) */
for (i = 0; i < sizeof(key); i++) {
k_ipad[i] = key[i] ^ 0x36;
k_opad[i] = key[i] ^ 0x5c;
}
/* RFC 2104 2. (3) & (4) */
the_hash_algo->init_fn(&ctx);
the_hash_algo->update_fn(&ctx, k_ipad, sizeof(k_ipad));
the_hash_algo->update_fn(&ctx, text, text_len);
the_hash_algo->final_fn(out, &ctx);
/* RFC 2104 2. (6) & (7) */
the_hash_algo->init_fn(&ctx);
the_hash_algo->update_fn(&ctx, k_opad, sizeof(k_opad));
the_hash_algo->update_fn(&ctx, out, the_hash_algo->rawsz);
the_hash_algo->final_fn(out, &ctx);
}
static char *prepare_push_cert_nonce(const char *path, timestamp_t stamp)
{
struct strbuf buf = STRBUF_INIT;
unsigned char hash[GIT_MAX_RAWSZ];
strbuf_addf(&buf, "%s:%"PRItime, path, stamp);
hmac_hash(hash, buf.buf, buf.len, cert_nonce_seed, strlen(cert_nonce_seed));
strbuf_release(&buf);
/* RFC 2104 5. HMAC-SHA1 or HMAC-SHA256 */
strbuf_addf(&buf, "%"PRItime"-%.*s", stamp, (int)the_hash_algo->hexsz, hash_to_hex(hash));
return strbuf_detach(&buf, NULL);
}
static char *find_header(const char *msg, size_t len, const char *key,
const char **next_line)
{
size_t out_len;
const char *val = find_header_mem(msg, len, key, &out_len);
if (!val)
return NULL;
if (next_line)
*next_line = val + out_len + 1;
return xmemdupz(val, out_len);
}
builtin/receive-pack: use constant-time comparison for HMAC value When we're comparing a push cert nonce, we currently do so using strcmp. Most implementations of strcmp short-circuit and exit as soon as they know whether two values are equal. This, however, is a problem when we're comparing the output of HMAC, as it leaks information in the time taken about how much of the two values match if they do indeed differ. In our case, the nonce is used to prevent replay attacks against our server via the embedded timestamp and replay attacks using requests from a different server via the HMAC. Push certs, which contain the nonces, are signed, so an attacker cannot tamper with the nonces without breaking validation of the signature. They can, of course, create their own signatures with invalid nonces, but they can also create their own signatures with valid nonces, so there's nothing to be gained. Thus, there is no security problem. Even though it doesn't appear that there are any negative consequences from the current technique, for safety and to encourage good practices, let's use a constant time comparison function for nonce verification. POSIX does not provide one, but they are easy to write. The technique we use here is also used in NaCl and the Go standard library and relies on the fact that bitwise or and xor are constant time on all known architectures. We need not be concerned about exiting early if the actual and expected lengths differ, since the standard cryptographic assumption is that everyone, including an attacker, knows the format of and algorithm used in our nonces (and in any event, they have the source code and can determine it easily). As a result, we assume everyone knows how long our nonces should be. This philosophy is also taken by the Go standard library and other cryptographic libraries when performing constant time comparisons on HMAC values. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
5 years ago
/*
* Return zero if a and b are equal up to n bytes and nonzero if they are not.
* This operation is guaranteed to run in constant time to avoid leaking data.
*/
static int constant_memequal(const char *a, const char *b, size_t n)
{
int res = 0;
size_t i;
for (i = 0; i < n; i++)
builtin/receive-pack: use constant-time comparison for HMAC value When we're comparing a push cert nonce, we currently do so using strcmp. Most implementations of strcmp short-circuit and exit as soon as they know whether two values are equal. This, however, is a problem when we're comparing the output of HMAC, as it leaks information in the time taken about how much of the two values match if they do indeed differ. In our case, the nonce is used to prevent replay attacks against our server via the embedded timestamp and replay attacks using requests from a different server via the HMAC. Push certs, which contain the nonces, are signed, so an attacker cannot tamper with the nonces without breaking validation of the signature. They can, of course, create their own signatures with invalid nonces, but they can also create their own signatures with valid nonces, so there's nothing to be gained. Thus, there is no security problem. Even though it doesn't appear that there are any negative consequences from the current technique, for safety and to encourage good practices, let's use a constant time comparison function for nonce verification. POSIX does not provide one, but they are easy to write. The technique we use here is also used in NaCl and the Go standard library and relies on the fact that bitwise or and xor are constant time on all known architectures. We need not be concerned about exiting early if the actual and expected lengths differ, since the standard cryptographic assumption is that everyone, including an attacker, knows the format of and algorithm used in our nonces (and in any event, they have the source code and can determine it easily). As a result, we assume everyone knows how long our nonces should be. This philosophy is also taken by the Go standard library and other cryptographic libraries when performing constant time comparisons on HMAC values. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
5 years ago
res |= a[i] ^ b[i];
return res;
}
static const char *check_nonce(const char *buf, size_t len)
{
char *nonce = find_header(buf, len, "nonce", NULL);
timestamp_t stamp, ostamp;
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
char *bohmac, *expect = NULL;
const char *retval = NONCE_BAD;
builtin/receive-pack: use constant-time comparison for HMAC value When we're comparing a push cert nonce, we currently do so using strcmp. Most implementations of strcmp short-circuit and exit as soon as they know whether two values are equal. This, however, is a problem when we're comparing the output of HMAC, as it leaks information in the time taken about how much of the two values match if they do indeed differ. In our case, the nonce is used to prevent replay attacks against our server via the embedded timestamp and replay attacks using requests from a different server via the HMAC. Push certs, which contain the nonces, are signed, so an attacker cannot tamper with the nonces without breaking validation of the signature. They can, of course, create their own signatures with invalid nonces, but they can also create their own signatures with valid nonces, so there's nothing to be gained. Thus, there is no security problem. Even though it doesn't appear that there are any negative consequences from the current technique, for safety and to encourage good practices, let's use a constant time comparison function for nonce verification. POSIX does not provide one, but they are easy to write. The technique we use here is also used in NaCl and the Go standard library and relies on the fact that bitwise or and xor are constant time on all known architectures. We need not be concerned about exiting early if the actual and expected lengths differ, since the standard cryptographic assumption is that everyone, including an attacker, knows the format of and algorithm used in our nonces (and in any event, they have the source code and can determine it easily). As a result, we assume everyone knows how long our nonces should be. This philosophy is also taken by the Go standard library and other cryptographic libraries when performing constant time comparisons on HMAC values. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
5 years ago
size_t noncelen;
if (!nonce) {
retval = NONCE_MISSING;
goto leave;
} else if (!push_cert_nonce) {
retval = NONCE_UNSOLICITED;
goto leave;
} else if (!strcmp(push_cert_nonce, nonce)) {
retval = NONCE_OK;
goto leave;
}
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
if (!stateless_rpc) {
/* returned nonce MUST match what we gave out earlier */
retval = NONCE_BAD;
goto leave;
}
/*
* In stateless mode, we may be receiving a nonce issued by
* another instance of the server that serving the same
* repository, and the timestamps may not match, but the
* nonce-seed and dir should match, so we can recompute and
* report the time slop.
*
* In addition, when a nonce issued by another instance has
* timestamp within receive.certnonceslop seconds, we pretend
* as if we issued that nonce when reporting to the hook.
*/
/* nonce is concat(<seconds-since-epoch>, "-", <hmac>) */
if (*nonce <= '0' || '9' < *nonce) {
retval = NONCE_BAD;
goto leave;
}
stamp = parse_timestamp(nonce, &bohmac, 10);
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
if (bohmac == nonce || bohmac[0] != '-') {
retval = NONCE_BAD;
goto leave;
}
builtin/receive-pack: use constant-time comparison for HMAC value When we're comparing a push cert nonce, we currently do so using strcmp. Most implementations of strcmp short-circuit and exit as soon as they know whether two values are equal. This, however, is a problem when we're comparing the output of HMAC, as it leaks information in the time taken about how much of the two values match if they do indeed differ. In our case, the nonce is used to prevent replay attacks against our server via the embedded timestamp and replay attacks using requests from a different server via the HMAC. Push certs, which contain the nonces, are signed, so an attacker cannot tamper with the nonces without breaking validation of the signature. They can, of course, create their own signatures with invalid nonces, but they can also create their own signatures with valid nonces, so there's nothing to be gained. Thus, there is no security problem. Even though it doesn't appear that there are any negative consequences from the current technique, for safety and to encourage good practices, let's use a constant time comparison function for nonce verification. POSIX does not provide one, but they are easy to write. The technique we use here is also used in NaCl and the Go standard library and relies on the fact that bitwise or and xor are constant time on all known architectures. We need not be concerned about exiting early if the actual and expected lengths differ, since the standard cryptographic assumption is that everyone, including an attacker, knows the format of and algorithm used in our nonces (and in any event, they have the source code and can determine it easily). As a result, we assume everyone knows how long our nonces should be. This philosophy is also taken by the Go standard library and other cryptographic libraries when performing constant time comparisons on HMAC values. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
5 years ago
noncelen = strlen(nonce);
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
expect = prepare_push_cert_nonce(service_dir, stamp);
builtin/receive-pack: use constant-time comparison for HMAC value When we're comparing a push cert nonce, we currently do so using strcmp. Most implementations of strcmp short-circuit and exit as soon as they know whether two values are equal. This, however, is a problem when we're comparing the output of HMAC, as it leaks information in the time taken about how much of the two values match if they do indeed differ. In our case, the nonce is used to prevent replay attacks against our server via the embedded timestamp and replay attacks using requests from a different server via the HMAC. Push certs, which contain the nonces, are signed, so an attacker cannot tamper with the nonces without breaking validation of the signature. They can, of course, create their own signatures with invalid nonces, but they can also create their own signatures with valid nonces, so there's nothing to be gained. Thus, there is no security problem. Even though it doesn't appear that there are any negative consequences from the current technique, for safety and to encourage good practices, let's use a constant time comparison function for nonce verification. POSIX does not provide one, but they are easy to write. The technique we use here is also used in NaCl and the Go standard library and relies on the fact that bitwise or and xor are constant time on all known architectures. We need not be concerned about exiting early if the actual and expected lengths differ, since the standard cryptographic assumption is that everyone, including an attacker, knows the format of and algorithm used in our nonces (and in any event, they have the source code and can determine it easily). As a result, we assume everyone knows how long our nonces should be. This philosophy is also taken by the Go standard library and other cryptographic libraries when performing constant time comparisons on HMAC values. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
5 years ago
if (noncelen != strlen(expect)) {
/* This is not even the right size. */
retval = NONCE_BAD;
goto leave;
}
if (constant_memequal(expect, nonce, noncelen)) {
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
/* Not what we would have signed earlier */
retval = NONCE_BAD;
goto leave;
}
/*
* By how many seconds is this nonce stale? Negative value
* would mean it was issued by another server with its clock
* skewed in the future.
*/
ostamp = parse_timestamp(push_cert_nonce, NULL, 10);
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
nonce_stamp_slop = (long)ostamp - (long)stamp;
if (nonce_stamp_slop_limit &&
labs(nonce_stamp_slop) <= nonce_stamp_slop_limit) {
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
/*
* Pretend as if the received nonce (which passes the
* HMAC check, so it is not a forged by third-party)
* is what we issued.
*/
free((void *)push_cert_nonce);
push_cert_nonce = xstrdup(nonce);
retval = NONCE_OK;
} else {
retval = NONCE_SLOP;
}
leave:
free(nonce);
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
free(expect);
return retval;
}
/*
* Return 1 if there is no push_cert or if the push options in push_cert are
* the same as those in the argument; 0 otherwise.
*/
static int check_cert_push_options(const struct string_list *push_options)
{
const char *buf = push_cert.buf;
int len = push_cert.len;
char *option;
const char *next_line;
int options_seen = 0;
int retval = 1;
if (!len)
return 1;
while ((option = find_header(buf, len, "push-option", &next_line))) {
len -= (next_line - buf);
buf = next_line;
options_seen++;
if (options_seen > push_options->nr
|| strcmp(option,
push_options->items[options_seen - 1].string)) {
retval = 0;
goto leave;
}
free(option);
}
if (options_seen != push_options->nr)
retval = 0;
leave:
free(option);
return retval;
}
push: the beginning of "git push --signed" While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
static void prepare_push_cert_sha1(struct child_process *proc)
{
static int already_done;
if (!push_cert.len)
return;
if (!already_done) {
int bogs /* beginning_of_gpg_sig */;
push: the beginning of "git push --signed" While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
already_done = 1;
if (write_object_file(push_cert.buf, push_cert.len, OBJ_BLOB,
&push_cert_oid))
oidclr(&push_cert_oid);
memset(&sigcheck, '\0', sizeof(sigcheck));
bogs = parse_signed_buffer(push_cert.buf, push_cert.len);
sigcheck.payload = xmemdupz(push_cert.buf, bogs);
sigcheck.payload_len = bogs;
check_signature(&sigcheck, push_cert.buf + bogs,
push_cert.len - bogs);
nonce_status = check_nonce(push_cert.buf, bogs);
push: the beginning of "git push --signed" While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
}
if (!is_null_oid(&push_cert_oid)) {
strvec_pushf(&proc->env, "GIT_PUSH_CERT=%s",
oid_to_hex(&push_cert_oid));
strvec_pushf(&proc->env, "GIT_PUSH_CERT_SIGNER=%s",
sigcheck.signer ? sigcheck.signer : "");
strvec_pushf(&proc->env, "GIT_PUSH_CERT_KEY=%s",
sigcheck.key ? sigcheck.key : "");
strvec_pushf(&proc->env, "GIT_PUSH_CERT_STATUS=%c",
sigcheck.result);
if (push_cert_nonce) {
strvec_pushf(&proc->env,
"GIT_PUSH_CERT_NONCE=%s",
push_cert_nonce);
strvec_pushf(&proc->env,
"GIT_PUSH_CERT_NONCE_STATUS=%s",
nonce_status);
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
if (nonce_status == NONCE_SLOP)
strvec_pushf(&proc->env,
"GIT_PUSH_CERT_NONCE_SLOP=%ld",
nonce_stamp_slop);
}
push: the beginning of "git push --signed" While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
}
}
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
struct receive_hook_feed_state {
struct command *cmd;
struct ref_push_report *report;
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
int skip_broken;
struct strbuf buf;
const struct string_list *push_options;
};
typedef int (*feed_fn)(void *, const char **, size_t *);
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
static int run_and_feed_hook(const char *hook_name, feed_fn feed,
struct receive_hook_feed_state *feed_state)
{
struct child_process proc = CHILD_PROCESS_INIT;
struct async muxer;
int code;
const char *hook_path = find_hook(hook_name);
if (!hook_path)
return 0;
strvec_push(&proc.args, hook_path);
proc.in = -1;
proc.stdout_to_stderr = 1;
proc.trace2_hook_name = hook_name;
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
if (feed_state->push_options) {
size_t i;
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
for (i = 0; i < feed_state->push_options->nr; i++)
strvec_pushf(&proc.env,
"GIT_PUSH_OPTION_%"PRIuMAX"=%s",
(uintmax_t)i,
feed_state->push_options->items[i].string);
strvec_pushf(&proc.env, "GIT_PUSH_OPTION_COUNT=%"PRIuMAX"",
(uintmax_t)feed_state->push_options->nr);
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
} else
strvec_pushf(&proc.env, "GIT_PUSH_OPTION_COUNT");
if (tmp_objdir)
strvec_pushv(&proc.env, tmp_objdir_env(tmp_objdir));
if (use_sideband) {
memset(&muxer, 0, sizeof(muxer));
muxer.proc = copy_to_sideband;
muxer.in = -1;
code = start_async(&muxer);
if (code)
return code;
proc.err = muxer.in;
}
prepare_push_cert_sha1(&proc);
code = start_command(&proc);
if (code) {
if (use_sideband)
finish_async(&muxer);
return code;
}
receive-pack: allow hooks to ignore its standard input stream The pre-receive and post-receive hooks were designed to be an improvement over old style update and post-update hooks, which take the update information on their command line and are limited by the command line length limit. The same information is fed from the standard input to pre/post-receive hooks instead to lift this limitation. It has been mandatory for these new style hooks to consume the update information fully from the standard input stream. Otherwise, they would risk killing the receive-pack process via SIGPIPE. If a hook does not want to look at all the information, it is easy to send its standard input to /dev/null (perhaps a niche use of hook might need to know only the fact that a push was made, without having to know what objects have been pushed to update which refs), and this has already been done by existing hooks that are written carefully. However, because there is no good way to consistently fail hooks that do not consume the input fully (a small push may result in a short update record that may fit within the pipe buffer, to which the receive-pack process may manage to write before the hook has a chance to exit without reading anything, which will not result in a death-by-SIGPIPE of receive-pack), it can lead to a hard to diagnose "once in a blue moon" phantom failure. Lift this "hooks must consume their input fully" mandate. A mandate that is not enforced strictly is not helping us to catch mistakes in hooks. If a hook has a good reason to decide the outcome of its operation without reading the information we feed it, let it do so as it pleases. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
sigchain_push(SIGPIPE, SIG_IGN);
while (1) {
const char *buf;
size_t n;
if (feed(feed_state, &buf, &n))
break;
avoid "write_in_full(fd, buf, len) != len" pattern The return value of write_in_full() is either "-1", or the requested number of bytes[1]. If we make a partial write before seeing an error, we still return -1, not a partial value. This goes back to f6aa66cb95 (write_in_full: really write in full or return error on disk full., 2007-01-11). So checking anything except "was the return value negative" is pointless. And there are a couple of reasons not to do so: 1. It can do a funny signed/unsigned comparison. If your "len" is signed (e.g., a size_t) then the compiler will promote the "-1" to its unsigned variant. This works out for "!= len" (unless you really were trying to write the maximum size_t bytes), but is a bug if you check "< len" (an example of which was fixed recently in config.c). We should avoid promoting the mental model that you need to check the length at all, so that new sites are not tempted to copy us. 2. Checking for a negative value is shorter to type, especially when the length is an expression. 3. Linus says so. In d34cf19b89 (Clean up write_in_full() users, 2007-01-11), right after the write_in_full() semantics were changed, he wrote: I really wish every "write_in_full()" user would just check against "<0" now, but this fixes the nasty and stupid ones. Appeals to authority aside, this makes it clear that writing it this way does not have an intentional benefit. It's a historical curiosity that we never bothered to clean up (and which was undoubtedly cargo-culted into new sites). So let's convert these obviously-correct cases (this includes write_str_in_full(), which is just a wrapper for write_in_full()). [1] A careful reader may notice there is one way that write_in_full() can return a different value. If we ask write() to write N bytes and get a return value that is _larger_ than N, we could return a larger total. But besides the fact that this would imply a totally broken version of write(), it would already invoke undefined behavior. Our internal remaining counter is an unsigned size_t, which means that subtracting too many byte will wrap it around to a very large number. So we'll instantly begin reading off the end of the buffer, trying to write gigabytes (or petabytes) of data. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
7 years ago
if (write_in_full(proc.in, buf, n) < 0)
break;
}
close(proc.in);
if (use_sideband)
finish_async(&muxer);
receive-pack: allow hooks to ignore its standard input stream The pre-receive and post-receive hooks were designed to be an improvement over old style update and post-update hooks, which take the update information on their command line and are limited by the command line length limit. The same information is fed from the standard input to pre/post-receive hooks instead to lift this limitation. It has been mandatory for these new style hooks to consume the update information fully from the standard input stream. Otherwise, they would risk killing the receive-pack process via SIGPIPE. If a hook does not want to look at all the information, it is easy to send its standard input to /dev/null (perhaps a niche use of hook might need to know only the fact that a push was made, without having to know what objects have been pushed to update which refs), and this has already been done by existing hooks that are written carefully. However, because there is no good way to consistently fail hooks that do not consume the input fully (a small push may result in a short update record that may fit within the pipe buffer, to which the receive-pack process may manage to write before the hook has a chance to exit without reading anything, which will not result in a death-by-SIGPIPE of receive-pack), it can lead to a hard to diagnose "once in a blue moon" phantom failure. Lift this "hooks must consume their input fully" mandate. A mandate that is not enforced strictly is not helping us to catch mistakes in hooks. If a hook has a good reason to decide the outcome of its operation without reading the information we feed it, let it do so as it pleases. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
sigchain_pop(SIGPIPE);
return finish_command(&proc);
}
static int feed_receive_hook(void *state_, const char **bufp, size_t *sizep)
{
struct receive_hook_feed_state *state = state_;
struct command *cmd = state->cmd;
while (cmd &&
state->skip_broken && (cmd->error_string || cmd->did_not_exist))
cmd = cmd->next;
if (!cmd)
return -1; /* EOF */
if (!bufp)
return 0; /* OK, can feed something. */
strbuf_reset(&state->buf);
if (!state->report)
state->report = cmd->report;
if (state->report) {
struct object_id *old_oid;
struct object_id *new_oid;
const char *ref_name;
old_oid = state->report->old_oid ? state->report->old_oid : &cmd->old_oid;
new_oid = state->report->new_oid ? state->report->new_oid : &cmd->new_oid;
ref_name = state->report->ref_name ? state->report->ref_name : cmd->ref_name;
strbuf_addf(&state->buf, "%s %s %s\n",
oid_to_hex(old_oid), oid_to_hex(new_oid),
ref_name);
state->report = state->report->next;
if (!state->report)
state->cmd = cmd->next;
} else {
strbuf_addf(&state->buf, "%s %s %s\n",
oid_to_hex(&cmd->old_oid), oid_to_hex(&cmd->new_oid),
cmd->ref_name);
state->cmd = cmd->next;
}
if (bufp) {
*bufp = state->buf.buf;
*sizep = state->buf.len;
}
return 0;
}
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
static int run_receive_hook(struct command *commands,
const char *hook_name,
int skip_broken,
const struct string_list *push_options)
{
struct receive_hook_feed_state state;
int status;
strbuf_init(&state.buf, 0);
state.cmd = commands;
state.skip_broken = skip_broken;
state.report = NULL;
if (feed_receive_hook(&state, NULL, NULL))
return 0;
state.cmd = commands;
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
state.push_options = push_options;
status = run_and_feed_hook(hook_name, feed_receive_hook, &state);
strbuf_release(&state.buf);
return status;
}
static int run_update_hook(struct command *cmd)
{
struct child_process proc = CHILD_PROCESS_INIT;
int code;
const char *hook_path = find_hook("update");
if (!hook_path)
return 0;
strvec_push(&proc.args, hook_path);
strvec_push(&proc.args, cmd->ref_name);
strvec_push(&proc.args, oid_to_hex(&cmd->old_oid));
strvec_push(&proc.args, oid_to_hex(&cmd->new_oid));
proc.no_stdin = 1;
proc.stdout_to_stderr = 1;
proc.err = use_sideband ? -1 : 0;
proc.trace2_hook_name = "update";
code = start_command(&proc);
if (code)
return code;
if (use_sideband)
copy_to_sideband(proc.err, -1, NULL);
return finish_command(&proc);
}
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
static struct command *find_command_by_refname(struct command *list,
const char *refname)
{
for (; list; list = list->next)
if (!strcmp(list->ref_name, refname))
return list;
return NULL;
}
static int read_proc_receive_report(struct packet_reader *reader,
struct command *commands,
struct strbuf *errmsg)
{
struct command *cmd;
struct command *hint = NULL;
struct ref_push_report *report = NULL;
int new_report = 0;
int code = 0;
int once = 0;
int response = 0;
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
for (;;) {
struct object_id old_oid, new_oid;
const char *head;
const char *refname;
char *p;
enum packet_read_status status;
status = packet_reader_read(reader);
if (status != PACKET_READ_NORMAL) {
/* Check whether proc-receive exited abnormally */
if (status == PACKET_READ_EOF && !response) {
strbuf_addstr(errmsg, "proc-receive exited abnormally");
return -1;
}
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
break;
}
response++;
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
head = reader->line;
p = strchr(head, ' ');
if (!p) {
strbuf_addf(errmsg, "proc-receive reported incomplete status line: '%s'\n", head);
code = -1;
continue;
}
*p++ = '\0';
if (!strcmp(head, "option")) {
const char *key, *val;
if (!hint || !(report || new_report)) {
if (!once++)
strbuf_addstr(errmsg, "proc-receive reported 'option' without a matching 'ok/ng' directive\n");
code = -1;
continue;
}
if (new_report) {
if (!hint->report) {
CALLOC_ARRAY(hint->report, 1);
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
report = hint->report;
} else {
report = hint->report;
while (report->next)
report = report->next;
report->next = xcalloc(1, sizeof(struct ref_push_report));
report = report->next;
}
new_report = 0;
}
key = p;
p = strchr(key, ' ');
if (p)
*p++ = '\0';
val = p;
if (!strcmp(key, "refname"))
report->ref_name = xstrdup_or_null(val);
else if (!strcmp(key, "old-oid") && val &&
!parse_oid_hex(val, &old_oid, &val))
report->old_oid = oiddup(&old_oid);
else if (!strcmp(key, "new-oid") && val &&
!parse_oid_hex(val, &new_oid, &val))
report->new_oid = oiddup(&new_oid);
else if (!strcmp(key, "forced-update"))
report->forced_update = 1;
else if (!strcmp(key, "fall-through"))
/* Fall through, let 'receive-pack' to execute it. */
hint->run_proc_receive = 0;
continue;
}
report = NULL;
new_report = 0;
refname = p;
p = strchr(refname, ' ');
if (p)
*p++ = '\0';
if (strcmp(head, "ok") && strcmp(head, "ng")) {
strbuf_addf(errmsg, "proc-receive reported bad status '%s' on ref '%s'\n",
head, refname);
code = -1;
continue;
}
/* first try searching at our hint, falling back to all refs */
if (hint)
hint = find_command_by_refname(hint, refname);
if (!hint)
hint = find_command_by_refname(commands, refname);
if (!hint) {
strbuf_addf(errmsg, "proc-receive reported status on unknown ref: %s\n",
refname);
code = -1;
continue;
}
if (!hint->run_proc_receive) {
strbuf_addf(errmsg, "proc-receive reported status on unexpected ref: %s\n",
refname);
code = -1;
continue;
}
hint->run_proc_receive |= RUN_PROC_RECEIVE_RETURNED;
if (!strcmp(head, "ng")) {
if (p)
hint->error_string = xstrdup(p);
else
hint->error_string = "failed";
code = -1;
continue;
}
new_report = 1;
}
for (cmd = commands; cmd; cmd = cmd->next)
if (cmd->run_proc_receive && !cmd->error_string &&
!(cmd->run_proc_receive & RUN_PROC_RECEIVE_RETURNED)) {
cmd->error_string = "proc-receive failed to report status";
code = -1;
}
return code;
}
static int run_proc_receive_hook(struct command *commands,
const struct string_list *push_options)
{
struct child_process proc = CHILD_PROCESS_INIT;
struct async muxer;
struct command *cmd;
struct packet_reader reader;
struct strbuf cap = STRBUF_INIT;
struct strbuf errmsg = STRBUF_INIT;
int hook_use_push_options = 0;
int version = 0;
int code;
const char *hook_path = find_hook("proc-receive");
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
if (!hook_path) {
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
rp_error("cannot find hook 'proc-receive'");
return -1;
}
strvec_push(&proc.args, hook_path);
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
proc.in = -1;
proc.out = -1;
proc.trace2_hook_name = "proc-receive";
if (use_sideband) {
memset(&muxer, 0, sizeof(muxer));
muxer.proc = copy_to_sideband;
muxer.in = -1;
code = start_async(&muxer);
if (code)
return code;
proc.err = muxer.in;
} else {
proc.err = 0;
}
code = start_command(&proc);
if (code) {
if (use_sideband)
finish_async(&muxer);
return code;
}
sigchain_push(SIGPIPE, SIG_IGN);
/* Version negotiaton */
packet_reader_init(&reader, proc.out, NULL, 0,
PACKET_READ_CHOMP_NEWLINE |
PACKET_READ_GENTLE_ON_EOF);
if (use_atomic)
strbuf_addstr(&cap, " atomic");
if (use_push_options)
strbuf_addstr(&cap, " push-options");
if (cap.len) {
code = packet_write_fmt_gently(proc.in, "version=1%c%s\n", '\0', cap.buf + 1);
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
strbuf_release(&cap);
} else {
code = packet_write_fmt_gently(proc.in, "version=1\n");
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
}
if (!code)
code = packet_flush_gently(proc.in);
if (!code)
for (;;) {
int linelen;
enum packet_read_status status;
status = packet_reader_read(&reader);
if (status != PACKET_READ_NORMAL) {
/* Check whether proc-receive exited abnormally */
if (status == PACKET_READ_EOF)
code = -1;
break;
}
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
if (reader.pktlen > 8 && starts_with(reader.line, "version=")) {
version = atoi(reader.line + 8);
linelen = strlen(reader.line);
if (linelen < reader.pktlen) {
const char *feature_list = reader.line + linelen + 1;
if (parse_feature_request(feature_list, "push-options"))
hook_use_push_options = 1;
}
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
}
}
if (code) {
strbuf_addstr(&errmsg, "fail to negotiate version with proc-receive hook");
goto cleanup;
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
}
switch (version) {
case 0:
/* fallthrough */
case 1:
break;
default:
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
strbuf_addf(&errmsg, "proc-receive version '%d' is not supported",
version);
code = -1;
goto cleanup;
}
/* Send commands */
for (cmd = commands; cmd; cmd = cmd->next) {
if (!cmd->run_proc_receive || cmd->skip_update || cmd->error_string)
continue;
code = packet_write_fmt_gently(proc.in, "%s %s %s",
oid_to_hex(&cmd->old_oid),
oid_to_hex(&cmd->new_oid),
cmd->ref_name);
if (code)
break;
}
if (!code)
code = packet_flush_gently(proc.in);
if (code) {
strbuf_addstr(&errmsg, "fail to write commands to proc-receive hook");
goto cleanup;
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
}
/* Send push options */
if (hook_use_push_options) {
struct string_list_item *item;
for_each_string_list_item(item, push_options) {
code = packet_write_fmt_gently(proc.in, "%s", item->string);
if (code)
break;
}
if (!code)
code = packet_flush_gently(proc.in);
if (code) {
strbuf_addstr(&errmsg,
"fail to write push-options to proc-receive hook");
goto cleanup;
}
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
}
/* Read result from proc-receive */
code = read_proc_receive_report(&reader, commands, &errmsg);
cleanup:
close(proc.in);
close(proc.out);
if (use_sideband)
finish_async(&muxer);
if (finish_command(&proc))
code = -1;
if (errmsg.len >0) {
char *p = errmsg.buf;
p += errmsg.len - 1;
if (*p == '\n')
*p = '\0';
rp_error("%s", errmsg.buf);
strbuf_release(&errmsg);
}
sigchain_pop(SIGPIPE);
return code;
}
static char *refuse_unconfigured_deny_msg =
N_("By default, updating the current branch in a non-bare repository\n"
"is denied, because it will make the index and work tree inconsistent\n"
"with what you pushed, and will require 'git reset --hard' to match\n"
"the work tree to HEAD.\n"
"\n"
"You can set the 'receive.denyCurrentBranch' configuration variable\n"
"to 'ignore' or 'warn' in the remote repository to allow pushing into\n"
"its current branch; however, this is not recommended unless you\n"
"arranged to update its work tree to match what you pushed in some\n"
"other way.\n"
"\n"
"To squelch this message and still keep the default behaviour, set\n"
"'receive.denyCurrentBranch' configuration variable to 'refuse'.");
static void refuse_unconfigured_deny(void)
{
rp_error("%s", _(refuse_unconfigured_deny_msg));
}
static char *refuse_unconfigured_deny_delete_current_msg =
N_("By default, deleting the current branch is denied, because the next\n"
"'git clone' won't result in any file checked out, causing confusion.\n"
"\n"
"You can set 'receive.denyDeleteCurrent' configuration variable to\n"
"'warn' or 'ignore' in the remote repository to allow deleting the\n"
"current branch, with or without a warning message.\n"
"\n"
"To squelch this message, you can set it to 'refuse'.");
static void refuse_unconfigured_deny_delete_current(void)
{
rp_error("%s", _(refuse_unconfigured_deny_delete_current_msg));
}
connected: refactor iterator to return next object ID directly The object ID iterator used by the connectivity checks returns the next object ID via an out-parameter and then uses a return code to indicate whether an item was found. This is a bit roundabout: instead of a separate error code, we can just return the next object ID directly and use `NULL` pointers as indicator that the iterator got no items left. Furthermore, this avoids a copy of the object ID. Refactor the iterator and all its implementations to return object IDs directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs: Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch Time (mean ± σ): 30.110 s ± 0.148 s [User: 27.161 s, System: 5.075 s] Range (min … max): 29.934 s … 30.406 s 10 runs Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch Time (mean ± σ): 29.899 s ± 0.109 s [User: 26.916 s, System: 5.104 s] Range (min … max): 29.696 s … 29.996 s 10 runs Summary '328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran 1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch' While this 1% speedup could be labelled as statistically insignificant, the speedup is consistent on my machine. Furthermore, this is an end to end test, so it is expected that the improvement in the connectivity check itself is more significant. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
static const struct object_id *command_singleton_iterator(void *cb_data);
static int update_shallow_ref(struct command *cmd, struct shallow_info *si)
{
struct shallow_lock shallow_lock = SHALLOW_LOCK_INIT;
struct oid_array extra = OID_ARRAY_INIT;
check_everything_connected: use a struct with named options The number of variants of check_everything_connected has grown over the years, so that the "real" function takes several possibly-zero, possibly-NULL arguments. We hid the complexity behind some wrapper functions, but this doesn't scale well when we want to add new options. If we add more wrapper variants to handle the new options, then we can get a combinatorial explosion when those options might be used together (right now nobody wants to use both "shallow" and "transport" together, so we get by with just a few wrappers). If instead we add new parameters to each function, each of which can have a default value, then callers who want the defaults end up with confusing invocations like: check_everything_connected(fn, 0, data, -1, 0, NULL); where it is unclear which parameter is which (and every caller needs updated when we add new options). Instead, let's add a struct to hold all of the optional parameters. This is a little more verbose for the callers (who have to declare the struct and fill it in), but it makes their code much easier to follow, because every option is named as it is set (and unused options do not have to be mentioned at all). Note that we could also stick the iteration function and its callback data into the option struct, too. But since those are required for each call, by avoiding doing so, we can let very simple callers just pass "NULL" for the options and not worry about the struct at all. While we're touching each site, let's also rename the function to check_connected(). The existing name was quite long, and not all of the wrappers even used the full name. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
struct check_connected_options opt = CHECK_CONNECTED_INIT;
uint32_t mask = 1 << (cmd->index % 32);
int i;
trace_printf_key(&trace_shallow,
"shallow: update_shallow_ref %s\n", cmd->ref_name);
for (i = 0; i < si->shallow->nr; i++)
if (si->used_shallow[i] &&
(si->used_shallow[i][cmd->index / 32] & mask) &&
!delayed_reachability_test(si, i))
oid_array_append(&extra, &si->shallow->oid[i]);
opt.env = tmp_objdir_env(tmp_objdir);
check_everything_connected: use a struct with named options The number of variants of check_everything_connected has grown over the years, so that the "real" function takes several possibly-zero, possibly-NULL arguments. We hid the complexity behind some wrapper functions, but this doesn't scale well when we want to add new options. If we add more wrapper variants to handle the new options, then we can get a combinatorial explosion when those options might be used together (right now nobody wants to use both "shallow" and "transport" together, so we get by with just a few wrappers). If instead we add new parameters to each function, each of which can have a default value, then callers who want the defaults end up with confusing invocations like: check_everything_connected(fn, 0, data, -1, 0, NULL); where it is unclear which parameter is which (and every caller needs updated when we add new options). Instead, let's add a struct to hold all of the optional parameters. This is a little more verbose for the callers (who have to declare the struct and fill it in), but it makes their code much easier to follow, because every option is named as it is set (and unused options do not have to be mentioned at all). Note that we could also stick the iteration function and its callback data into the option struct, too. But since those are required for each call, by avoiding doing so, we can let very simple callers just pass "NULL" for the options and not worry about the struct at all. While we're touching each site, let's also rename the function to check_connected(). The existing name was quite long, and not all of the wrappers even used the full name. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
setup_alternate_shallow(&shallow_lock, &opt.shallow_file, &extra);
if (check_connected(command_singleton_iterator, cmd, &opt)) {
shallow.c: use '{commit,rollback}_shallow_file' In bd0b42aed3 (fetch-pack: do not take shallow lock unnecessarily, 2019-01-10), the author noted that 'is_repository_shallow' produces visible side-effect(s) by setting 'is_shallow' and 'shallow_stat'. This is a problem for e.g., fetching with '--update-shallow' in a shallow repository with 'fetch.writeCommitGraph' enabled, since the update to '.git/shallow' will cause Git to think that the repository isn't shallow when it is, thereby circumventing the commit-graph compatibility check. This causes problems in shallow repositories with at least shallow refs that have at least one ancestor (since the client won't have those objects, and therefore can't take the reachability closure over commits when writing a commit-graph). Address this by introducing thin wrappers over 'commit_lock_file' and 'rollback_lock_file' for use specifically when the lock is held over '.git/shallow'. These wrappers (appropriately called 'commit_shallow_file' and 'rollback_shallow_file') call into their respective functions in 'lockfile.h', but additionally reset validity checks used by the shallow machinery. Replace each instance of 'commit_lock_file' and 'rollback_lock_file' with 'commit_shallow_file' and 'rollback_shallow_file' when the lock being held is over the '.git/shallow' file. As a result, 'prune_shallow' can now only be called once (since 'check_shallow_file_for_update' will die after calling 'reset_repository_shallow'). But, this is OK since we only call 'prune_shallow' at most once per process. Helped-by: Jonathan Tan <jonathantanmy@google.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
5 years ago
rollback_shallow_file(the_repository, &shallow_lock);
oid_array_clear(&extra);
return -1;
}
shallow.c: use '{commit,rollback}_shallow_file' In bd0b42aed3 (fetch-pack: do not take shallow lock unnecessarily, 2019-01-10), the author noted that 'is_repository_shallow' produces visible side-effect(s) by setting 'is_shallow' and 'shallow_stat'. This is a problem for e.g., fetching with '--update-shallow' in a shallow repository with 'fetch.writeCommitGraph' enabled, since the update to '.git/shallow' will cause Git to think that the repository isn't shallow when it is, thereby circumventing the commit-graph compatibility check. This causes problems in shallow repositories with at least shallow refs that have at least one ancestor (since the client won't have those objects, and therefore can't take the reachability closure over commits when writing a commit-graph). Address this by introducing thin wrappers over 'commit_lock_file' and 'rollback_lock_file' for use specifically when the lock is held over '.git/shallow'. These wrappers (appropriately called 'commit_shallow_file' and 'rollback_shallow_file') call into their respective functions in 'lockfile.h', but additionally reset validity checks used by the shallow machinery. Replace each instance of 'commit_lock_file' and 'rollback_lock_file' with 'commit_shallow_file' and 'rollback_shallow_file' when the lock being held is over the '.git/shallow' file. As a result, 'prune_shallow' can now only be called once (since 'check_shallow_file_for_update' will die after calling 'reset_repository_shallow'). But, this is OK since we only call 'prune_shallow' at most once per process. Helped-by: Jonathan Tan <jonathantanmy@google.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
5 years ago
commit_shallow_file(the_repository, &shallow_lock);
/*
* Make sure setup_alternate_shallow() for the next ref does
* not lose these new roots..
*/
for (i = 0; i < extra.nr; i++)
register_shallow(the_repository, &extra.oid[i]);
si->shallow_ref[cmd->index] = 0;
oid_array_clear(&extra);
return 0;
}
/*
* NEEDSWORK: we should consolidate various implementions of "are we
* on an unborn branch?" test into one, and make the unified one more
* robust. !get_sha1() based check used here and elsewhere would not
* allow us to tell an unborn branch from corrupt ref, for example.
* For the purpose of fixing "deploy-to-update does not work when
* pushing into an empty repository" issue, this should suffice for
* now.
*/
static int head_has_history(void)
{
struct object_id oid;
return !repo_get_oid(the_repository, "HEAD", &oid);
}
static const char *push_to_deploy(unsigned char *sha1,
struct strvec *env,
const char *work_tree)
{
struct child_process child = CHILD_PROCESS_INIT;
strvec_pushl(&child.args, "update-index", "-q", "--ignore-submodules",
"--refresh", NULL);
strvec_pushv(&child.env, env->v);
child.dir = work_tree;
child.no_stdin = 1;
child.stdout_to_stderr = 1;
child.git_cmd = 1;
if (run_command(&child))
return "Up-to-date check failed";
/* run_command() does not clean up completely; reinitialize */
child_process_init(&child);
strvec_pushl(&child.args, "diff-files", "--quiet",
"--ignore-submodules", "--", NULL);
strvec_pushv(&child.env, env->v);
child.dir = work_tree;
child.no_stdin = 1;
child.stdout_to_stderr = 1;
child.git_cmd = 1;
if (run_command(&child))
return "Working directory has unstaged changes";
child_process_init(&child);
strvec_pushl(&child.args, "diff-index", "--quiet", "--cached",
"--ignore-submodules",
/* diff-index with either HEAD or an empty tree */
head_has_history() ? "HEAD" : empty_tree_oid_hex(),
"--", NULL);
strvec_pushv(&child.env, env->v);
child.no_stdin = 1;
child.no_stdout = 1;
child.stdout_to_stderr = 0;
child.git_cmd = 1;
if (run_command(&child))
return "Working directory has staged changes";
child_process_init(&child);
strvec_pushl(&child.args, "read-tree", "-u", "-m", hash_to_hex(sha1),
NULL);
strvec_pushv(&child.env, env->v);
child.dir = work_tree;
child.no_stdin = 1;
child.no_stdout = 1;
child.stdout_to_stderr = 0;
child.git_cmd = 1;
if (run_command(&child))
return "Could not update working tree to new HEAD";
return NULL;
}
static const char *push_to_checkout_hook = "push-to-checkout";
static const char *push_to_checkout(unsigned char *hash,
hooks: fix an obscure TOCTOU "did we just run a hook?" race Fix a Time-of-check to time-of-use (TOCTOU) race in code added in 680ee550d72 (commit: skip discarding the index if there is no pre-commit hook, 2017-08-14). This obscure race condition can occur if we e.g. ran the "pre-commit" hook and it modified the index, but hook_exists() returns false later on (e.g., because the hook itself went away, the directory became unreadable, etc.). Then we won't call discard_cache() when we should have. The race condition itself probably doesn't matter, and users would have been unlikely to run into it in practice. This problem has been noted on-list when 680ee550d72 was discussed[1], but had not been fixed. This change is mainly intended to improve the readability of the code involved, and to make reasoning about it more straightforward. It wasn't as obvious what we were trying to do here, but by having an "invoked_hook" it's clearer that e.g. our discard_cache() is happening because of the earlier hook execution. Let's also change this for the push-to-checkout hook. Now instead of checking if the hook exists and either doing a push to checkout or a push to deploy we'll always attempt a push to checkout. If the hook doesn't exist we'll fall back on push to deploy. The same behavior as before, without the TOCTOU race. See 0855331941b (receive-pack: support push-to-checkout hook, 2014-12-01) for the introduction of the previous behavior. This leaves uses of hook_exists() in two places that matter. The "reference-transaction" check in refs.c, see 67541597670 (refs: implement reference transaction hook, 2020-06-19), and the "prepare-commit-msg" hook, see 66618a50f9c (sequencer: run 'prepare-commit-msg' hook, 2018-01-24). In both of those cases we're saving ourselves CPU time by not preparing data for the hook that we'll then do nothing with if we don't have the hook. So using this "invoked_hook" pattern doesn't make sense in those cases. The "reference-transaction" and "prepare-commit-msg" hook also aren't racy. In those cases we'll skip the hook runs if we race with a new hook being added, whereas in the TOCTOU races being fixed here we were incorrectly skipping the required post-hook logic. 1. https://lore.kernel.org/git/20170810191613.kpmhzg4seyxy3cpq@sigill.intra.peff.net/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
int *invoked_hook,
struct strvec *env,
const char *work_tree)
{
struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
hooks: fix an obscure TOCTOU "did we just run a hook?" race Fix a Time-of-check to time-of-use (TOCTOU) race in code added in 680ee550d72 (commit: skip discarding the index if there is no pre-commit hook, 2017-08-14). This obscure race condition can occur if we e.g. ran the "pre-commit" hook and it modified the index, but hook_exists() returns false later on (e.g., because the hook itself went away, the directory became unreadable, etc.). Then we won't call discard_cache() when we should have. The race condition itself probably doesn't matter, and users would have been unlikely to run into it in practice. This problem has been noted on-list when 680ee550d72 was discussed[1], but had not been fixed. This change is mainly intended to improve the readability of the code involved, and to make reasoning about it more straightforward. It wasn't as obvious what we were trying to do here, but by having an "invoked_hook" it's clearer that e.g. our discard_cache() is happening because of the earlier hook execution. Let's also change this for the push-to-checkout hook. Now instead of checking if the hook exists and either doing a push to checkout or a push to deploy we'll always attempt a push to checkout. If the hook doesn't exist we'll fall back on push to deploy. The same behavior as before, without the TOCTOU race. See 0855331941b (receive-pack: support push-to-checkout hook, 2014-12-01) for the introduction of the previous behavior. This leaves uses of hook_exists() in two places that matter. The "reference-transaction" check in refs.c, see 67541597670 (refs: implement reference transaction hook, 2020-06-19), and the "prepare-commit-msg" hook, see 66618a50f9c (sequencer: run 'prepare-commit-msg' hook, 2018-01-24). In both of those cases we're saving ourselves CPU time by not preparing data for the hook that we'll then do nothing with if we don't have the hook. So using this "invoked_hook" pattern doesn't make sense in those cases. The "reference-transaction" and "prepare-commit-msg" hook also aren't racy. In those cases we'll skip the hook runs if we race with a new hook being added, whereas in the TOCTOU races being fixed here we were incorrectly skipping the required post-hook logic. 1. https://lore.kernel.org/git/20170810191613.kpmhzg4seyxy3cpq@sigill.intra.peff.net/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
opt.invoked_hook = invoked_hook;
strvec_pushf(env, "GIT_WORK_TREE=%s", absolute_path(work_tree));
strvec_pushv(&opt.env, env->v);
strvec_push(&opt.args, hash_to_hex(hash));
if (run_hooks_opt(push_to_checkout_hook, &opt))
return "push-to-checkout hook declined";
else
return NULL;
}
receive.denyCurrentBranch: respect all worktrees The receive.denyCurrentBranch config option controls what happens if you push to a branch that is checked out into a non-bare repository. By default, it rejects it. It can be disabled via `ignore` or `warn`. Another yet trickier option is `updateInstead`. However, this setting was forgotten when the git worktree command was introduced: only the main worktree's current branch is respected. With this change, all worktrees are respected. That change also leads to revealing another bug, i.e. `receive.denyCurrentBranch = true` was ignored when pushing into a non-bare repository's unborn current branch using ref namespaces. As `is_ref_checked_out()` returns 0 which means `receive-pack` does not get into conditional statement to switch `deny_current_branch` accordingly (ignore, warn, refuse, unconfigured, updateInstead). receive.denyCurrentBranch uses the function `refs_resolve_ref_unsafe()` (called via `resolve_refdup()`) to resolve the symbolic ref HEAD, but that function fails when HEAD does not point at a valid commit. As we replace the call to `refs_resolve_ref_unsafe()` with `find_shared_symref()`, which has no problem finding the worktree for a given branch even if it is unborn yet, this bug is fixed at the same time: receive.denyCurrentBranch now also handles worktrees with unborn branches as intended even while using ref namespaces. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
5 years ago
static const char *update_worktree(unsigned char *sha1, const struct worktree *worktree)
{
const char *retval, *git_dir;
struct strvec env = STRVEC_INIT;
hooks: fix an obscure TOCTOU "did we just run a hook?" race Fix a Time-of-check to time-of-use (TOCTOU) race in code added in 680ee550d72 (commit: skip discarding the index if there is no pre-commit hook, 2017-08-14). This obscure race condition can occur if we e.g. ran the "pre-commit" hook and it modified the index, but hook_exists() returns false later on (e.g., because the hook itself went away, the directory became unreadable, etc.). Then we won't call discard_cache() when we should have. The race condition itself probably doesn't matter, and users would have been unlikely to run into it in practice. This problem has been noted on-list when 680ee550d72 was discussed[1], but had not been fixed. This change is mainly intended to improve the readability of the code involved, and to make reasoning about it more straightforward. It wasn't as obvious what we were trying to do here, but by having an "invoked_hook" it's clearer that e.g. our discard_cache() is happening because of the earlier hook execution. Let's also change this for the push-to-checkout hook. Now instead of checking if the hook exists and either doing a push to checkout or a push to deploy we'll always attempt a push to checkout. If the hook doesn't exist we'll fall back on push to deploy. The same behavior as before, without the TOCTOU race. See 0855331941b (receive-pack: support push-to-checkout hook, 2014-12-01) for the introduction of the previous behavior. This leaves uses of hook_exists() in two places that matter. The "reference-transaction" check in refs.c, see 67541597670 (refs: implement reference transaction hook, 2020-06-19), and the "prepare-commit-msg" hook, see 66618a50f9c (sequencer: run 'prepare-commit-msg' hook, 2018-01-24). In both of those cases we're saving ourselves CPU time by not preparing data for the hook that we'll then do nothing with if we don't have the hook. So using this "invoked_hook" pattern doesn't make sense in those cases. The "reference-transaction" and "prepare-commit-msg" hook also aren't racy. In those cases we'll skip the hook runs if we race with a new hook being added, whereas in the TOCTOU races being fixed here we were incorrectly skipping the required post-hook logic. 1. https://lore.kernel.org/git/20170810191613.kpmhzg4seyxy3cpq@sigill.intra.peff.net/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
int invoked_hook;
if (!worktree || !worktree->path)
BUG("worktree->path must be non-NULL");
receive.denyCurrentBranch: respect all worktrees The receive.denyCurrentBranch config option controls what happens if you push to a branch that is checked out into a non-bare repository. By default, it rejects it. It can be disabled via `ignore` or `warn`. Another yet trickier option is `updateInstead`. However, this setting was forgotten when the git worktree command was introduced: only the main worktree's current branch is respected. With this change, all worktrees are respected. That change also leads to revealing another bug, i.e. `receive.denyCurrentBranch = true` was ignored when pushing into a non-bare repository's unborn current branch using ref namespaces. As `is_ref_checked_out()` returns 0 which means `receive-pack` does not get into conditional statement to switch `deny_current_branch` accordingly (ignore, warn, refuse, unconfigured, updateInstead). receive.denyCurrentBranch uses the function `refs_resolve_ref_unsafe()` (called via `resolve_refdup()`) to resolve the symbolic ref HEAD, but that function fails when HEAD does not point at a valid commit. As we replace the call to `refs_resolve_ref_unsafe()` with `find_shared_symref()`, which has no problem finding the worktree for a given branch even if it is unborn yet, this bug is fixed at the same time: receive.denyCurrentBranch now also handles worktrees with unborn branches as intended even while using ref namespaces. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
5 years ago
if (worktree->is_bare)
return "denyCurrentBranch = updateInstead needs a worktree";
git_dir = get_worktree_git_dir(worktree);
strvec_pushf(&env, "GIT_DIR=%s", absolute_path(git_dir));
hooks: fix an obscure TOCTOU "did we just run a hook?" race Fix a Time-of-check to time-of-use (TOCTOU) race in code added in 680ee550d72 (commit: skip discarding the index if there is no pre-commit hook, 2017-08-14). This obscure race condition can occur if we e.g. ran the "pre-commit" hook and it modified the index, but hook_exists() returns false later on (e.g., because the hook itself went away, the directory became unreadable, etc.). Then we won't call discard_cache() when we should have. The race condition itself probably doesn't matter, and users would have been unlikely to run into it in practice. This problem has been noted on-list when 680ee550d72 was discussed[1], but had not been fixed. This change is mainly intended to improve the readability of the code involved, and to make reasoning about it more straightforward. It wasn't as obvious what we were trying to do here, but by having an "invoked_hook" it's clearer that e.g. our discard_cache() is happening because of the earlier hook execution. Let's also change this for the push-to-checkout hook. Now instead of checking if the hook exists and either doing a push to checkout or a push to deploy we'll always attempt a push to checkout. If the hook doesn't exist we'll fall back on push to deploy. The same behavior as before, without the TOCTOU race. See 0855331941b (receive-pack: support push-to-checkout hook, 2014-12-01) for the introduction of the previous behavior. This leaves uses of hook_exists() in two places that matter. The "reference-transaction" check in refs.c, see 67541597670 (refs: implement reference transaction hook, 2020-06-19), and the "prepare-commit-msg" hook, see 66618a50f9c (sequencer: run 'prepare-commit-msg' hook, 2018-01-24). In both of those cases we're saving ourselves CPU time by not preparing data for the hook that we'll then do nothing with if we don't have the hook. So using this "invoked_hook" pattern doesn't make sense in those cases. The "reference-transaction" and "prepare-commit-msg" hook also aren't racy. In those cases we'll skip the hook runs if we race with a new hook being added, whereas in the TOCTOU races being fixed here we were incorrectly skipping the required post-hook logic. 1. https://lore.kernel.org/git/20170810191613.kpmhzg4seyxy3cpq@sigill.intra.peff.net/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
retval = push_to_checkout(sha1, &invoked_hook, &env, worktree->path);
if (!invoked_hook)
retval = push_to_deploy(sha1, &env, worktree->path);
strvec_clear(&env);
return retval;
}
static const char *update(struct command *cmd, struct shallow_info *si)
{
const char *name = cmd->ref_name;
struct strbuf namespaced_name_buf = STRBUF_INIT;
static char *namespaced_name;
const char *ret;
struct object_id *old_oid = &cmd->old_oid;
struct object_id *new_oid = &cmd->new_oid;
int do_update_worktree = 0;
struct worktree **worktrees = get_worktrees();
const struct worktree *worktree =
find_shared_symref(worktrees, "HEAD", name);
/* only refs/... are allowed */
if (!starts_with(name, "refs/") ||
check_refname_format(name + 5, is_null_oid(new_oid) ?
REFNAME_ALLOW_ONELEVEL : 0)) {
rp_error("refusing to update funny ref '%s' remotely", name);
ret = "funny refname";
goto out;
}
strbuf_addf(&namespaced_name_buf, "%s%s", get_git_namespace(), name);
free(namespaced_name);
namespaced_name = strbuf_detach(&namespaced_name_buf, NULL);
if (worktree && !worktree->is_bare) {
switch (deny_current_branch) {
case DENY_IGNORE:
break;
case DENY_WARN:
rp_warning("updating the current branch");
break;
case DENY_REFUSE:
case DENY_UNCONFIGURED:
rp_error("refusing to update checked out branch: %s", name);
if (deny_current_branch == DENY_UNCONFIGURED)
refuse_unconfigured_deny();
ret = "branch is currently checked out";
goto out;
case DENY_UPDATE_INSTEAD:
/* pass -- let other checks intervene first */
do_update_worktree = 1;
break;
}
}
if (!is_null_oid(new_oid) && !repo_has_object_file(the_repository, new_oid)) {
error("unpack should have generated %s, "
"but I can't find it!", oid_to_hex(new_oid));
ret = "bad pack";
goto out;
}
if (!is_null_oid(old_oid) && is_null_oid(new_oid)) {
if (deny_deletes && starts_with(name, "refs/heads/")) {
rp_error("denying ref deletion for %s", name);
ret = "deletion prohibited";
goto out;
}
receive.denyCurrentBranch: respect all worktrees The receive.denyCurrentBranch config option controls what happens if you push to a branch that is checked out into a non-bare repository. By default, it rejects it. It can be disabled via `ignore` or `warn`. Another yet trickier option is `updateInstead`. However, this setting was forgotten when the git worktree command was introduced: only the main worktree's current branch is respected. With this change, all worktrees are respected. That change also leads to revealing another bug, i.e. `receive.denyCurrentBranch = true` was ignored when pushing into a non-bare repository's unborn current branch using ref namespaces. As `is_ref_checked_out()` returns 0 which means `receive-pack` does not get into conditional statement to switch `deny_current_branch` accordingly (ignore, warn, refuse, unconfigured, updateInstead). receive.denyCurrentBranch uses the function `refs_resolve_ref_unsafe()` (called via `resolve_refdup()`) to resolve the symbolic ref HEAD, but that function fails when HEAD does not point at a valid commit. As we replace the call to `refs_resolve_ref_unsafe()` with `find_shared_symref()`, which has no problem finding the worktree for a given branch even if it is unborn yet, this bug is fixed at the same time: receive.denyCurrentBranch now also handles worktrees with unborn branches as intended even while using ref namespaces. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
5 years ago
if (worktree || (head_name && !strcmp(namespaced_name, head_name))) {
switch (deny_delete_current) {
case DENY_IGNORE:
break;
case DENY_WARN:
rp_warning("deleting the current branch");
break;
case DENY_REFUSE:
case DENY_UNCONFIGURED:
case DENY_UPDATE_INSTEAD:
if (deny_delete_current == DENY_UNCONFIGURED)
refuse_unconfigured_deny_delete_current();
rp_error("refusing to delete the current branch: %s", name);
ret = "deletion of the current branch prohibited";
goto out;
default:
ret = "Invalid denyDeleteCurrent setting";
goto out;
}
}
}
if (deny_non_fast_forwards && !is_null_oid(new_oid) &&
!is_null_oid(old_oid) &&
starts_with(name, "refs/heads/")) {
struct object *old_object, *new_object;
struct commit *old_commit, *new_commit;
old_object = parse_object(the_repository, old_oid);
new_object = parse_object(the_repository, new_oid);
if (!old_object || !new_object ||
old_object->type != OBJ_COMMIT ||
new_object->type != OBJ_COMMIT) {
error("bad sha1 objects for %s", name);
ret = "bad ref";
goto out;
}
old_commit = (struct commit *)old_object;
new_commit = (struct commit *)new_object;
if (!repo_in_merge_bases(the_repository, old_commit, new_commit)) {
rp_error("denying non-fast-forward %s"
" (you should pull first)", name);
ret = "non-fast-forward";
goto out;
}
}
if (run_update_hook(cmd)) {
rp_error("hook declined to update %s", name);
ret = "hook declined";
goto out;
}
if (do_update_worktree) {
ret = update_worktree(new_oid->hash, worktree);
if (ret)
goto out;
}
if (is_null_oid(new_oid)) {
struct strbuf err = STRBUF_INIT;
if (!parse_object(the_repository, old_oid)) {
old_oid = NULL;
if (ref_exists(name)) {
rp_warning("allowing deletion of corrupt ref");
} else {
rp_warning("deleting a non-existent ref");
cmd->did_not_exist = 1;
}
}
if (ref_transaction_delete(transaction,
namespaced_name,
old_oid,
0, "push", &err)) {
rp_error("%s", err.buf);
ret = "failed to delete";
} else {
ret = NULL; /* good */
}
strbuf_release(&err);
}
else {
struct strbuf err = STRBUF_INIT;
if (shallow_update && si->shallow_ref[cmd->index] &&
update_shallow_ref(cmd, si)) {
ret = "shallow error";
goto out;
}
if (ref_transaction_update(transaction,
namespaced_name,
new_oid, old_oid,
0, "push",
&err)) {
rp_error("%s", err.buf);
ret = "failed to update ref";
} else {
ret = NULL; /* good */
}
strbuf_release(&err);
}
out:
free_worktrees(worktrees);
return ret;
}
static void run_update_post_hook(struct command *commands)
{
struct command *cmd;
struct child_process proc = CHILD_PROCESS_INIT;
const char *hook;
hook = find_hook("post-update");
if (!hook)
return;
for (cmd = commands; cmd; cmd = cmd->next) {
if (cmd->error_string || cmd->did_not_exist)
continue;
if (!proc.args.nr)
strvec_push(&proc.args, hook);
strvec_push(&proc.args, cmd->ref_name);
}
if (!proc.args.nr)
return;
proc.no_stdin = 1;
proc.stdout_to_stderr = 1;
proc.err = use_sideband ? -1 : 0;
proc.trace2_hook_name = "post-update";
if (!start_command(&proc)) {
if (use_sideband)
copy_to_sideband(proc.err, -1, NULL);
finish_command(&proc);
}
}
receive-pack: fix use-after-free bug The resolve_ref_unsafe() function can, and sometimes will in the case of this codepath, return the char * passed to it to the caller. In this case we construct a strbuf, free it, and then continue using the dst_name after that free(). The code being fixed dates back to da3efdb17b ("receive-pack: detect aliased updates which can occur with symrefs", 2010-04-19). When it was originally added it didn't have this bug, it was introduced when it was subsequently modified to use strbuf in 6b01ecfe22 ("ref namespaces: Support remote repositories via upload-pack and receive-pack", 2011-07-08). This is theoretically a security issue, the C standard makes no guarantees that a value you use after free() hasn't been poked at or changed by something else on the system, but in practice modern OSs will have mapped the relevant page to this process, so nothing else would have used it. We do no further allocations between the free() and use-after-free, so we ourselves didn't corrupt or change the value. Jeff investigated that and found: "It probably would be an issue if the allocation were larger. glibc at least will use mmap()/munmap() after some cutoff[1], in which case we'd get a segfault from hitting the unmapped page. But for small allocations, it just bumps brk() and the memory is still available for further allocations after free(). [...] If you had a sufficiently large refname you might be able to trigger the bug [...]. I tried to push such a ref. I had to manually make a packed-refs file with the long name to avoid filesystem limits (though probably you could have a long a/b/c/ name on ext4). But the result can't actually be pushed, because it all has to fit into a 64k pkt-line as part of the push protocol.". An a alternative and more succinct way of implementing this would have been to do the strbuf_release() at the end of check_aliased_update() and use "goto out" instead of the early "return" statements. Hopefully this approach of using a helper instead makes it easier to follow. 1. Jeff: "Weirdly, the mmap() cutoff on my glibc system is 135168 bytes. Which is...2^17 + 2^12? 33 pages? I'm sure there's a good reason for that, but I didn't dig into it." Reported-by: 王健强 <jianqiang.wang@securitygossip.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
6 years ago
static void check_aliased_update_internal(struct command *cmd,
struct string_list *list,
const char *dst_name, int flag)
receive-pack: detect aliased updates which can occur with symrefs When pushing to a remote repo the sending side filters out aliased updates (e.g., foo:baz bar:baz). However, it is not possible for the sender to know if two refs are aliased on the receiving side via symrefs. Here is one such scenario: $ git init origin $ (cd origin && touch file && git add file && git commit -a -m intial) $ git clone --bare origin origin.git $ rm -rf origin $ git clone origin.git client $ git clone --mirror client backup.git && $ (cd backup.git && git remote set-head origin --auto) $ (cd client && git remote add --mirror backup ../backup.git && echo change1 > file && git commit -a -m change1 && git push origin && git push backup ) The push to backup fails with: Counting objects: 5, done. Writing objects: 100% (3/3), 244 bytes, done. Total 3 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (3/3), done. error: Ref refs/remotes/origin/master is at ef3... but expected 262... remote: error: failed to lock refs/remotes/origin/master To ../backup.git 262cd57..ef307ff master -> master 262cd57..ef307ff origin/HEAD -> origin/HEAD ! [remote rejected] origin/master -> origin/master (failed to lock) error: failed to push some refs to '../backup.git' The reason is that refs/remotes/origin/HEAD is a symref to refs/remotes/origin/master, but it is not possible for the sending side to unambiguously know this. This commit fixes the issue by having receive-pack ignore any update to a symref whose target is being identically updated. If a symref and its target are being updated inconsistently, then the update for both fails with an error message ("refusing inconsistent update...") to help diagnose the situation. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
15 years ago
{
struct string_list_item *item;
struct command *dst_cmd;
if (!(flag & REF_ISSYMREF))
return;
if (!dst_name) {
rp_error("refusing update to broken symref '%s'", cmd->ref_name);
cmd->skip_update = 1;
cmd->error_string = "broken symref";
return;
}
dst_name = strip_namespace(dst_name);
if (!(item = string_list_lookup(list, dst_name)))
receive-pack: detect aliased updates which can occur with symrefs When pushing to a remote repo the sending side filters out aliased updates (e.g., foo:baz bar:baz). However, it is not possible for the sender to know if two refs are aliased on the receiving side via symrefs. Here is one such scenario: $ git init origin $ (cd origin && touch file && git add file && git commit -a -m intial) $ git clone --bare origin origin.git $ rm -rf origin $ git clone origin.git client $ git clone --mirror client backup.git && $ (cd backup.git && git remote set-head origin --auto) $ (cd client && git remote add --mirror backup ../backup.git && echo change1 > file && git commit -a -m change1 && git push origin && git push backup ) The push to backup fails with: Counting objects: 5, done. Writing objects: 100% (3/3), 244 bytes, done. Total 3 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (3/3), done. error: Ref refs/remotes/origin/master is at ef3... but expected 262... remote: error: failed to lock refs/remotes/origin/master To ../backup.git 262cd57..ef307ff master -> master 262cd57..ef307ff origin/HEAD -> origin/HEAD ! [remote rejected] origin/master -> origin/master (failed to lock) error: failed to push some refs to '../backup.git' The reason is that refs/remotes/origin/HEAD is a symref to refs/remotes/origin/master, but it is not possible for the sending side to unambiguously know this. This commit fixes the issue by having receive-pack ignore any update to a symref whose target is being identically updated. If a symref and its target are being updated inconsistently, then the update for both fails with an error message ("refusing inconsistent update...") to help diagnose the situation. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
15 years ago
return;
cmd->skip_update = 1;
dst_cmd = (struct command *) item->util;
if (oideq(&cmd->old_oid, &dst_cmd->old_oid) &&
oideq(&cmd->new_oid, &dst_cmd->new_oid))
receive-pack: detect aliased updates which can occur with symrefs When pushing to a remote repo the sending side filters out aliased updates (e.g., foo:baz bar:baz). However, it is not possible for the sender to know if two refs are aliased on the receiving side via symrefs. Here is one such scenario: $ git init origin $ (cd origin && touch file && git add file && git commit -a -m intial) $ git clone --bare origin origin.git $ rm -rf origin $ git clone origin.git client $ git clone --mirror client backup.git && $ (cd backup.git && git remote set-head origin --auto) $ (cd client && git remote add --mirror backup ../backup.git && echo change1 > file && git commit -a -m change1 && git push origin && git push backup ) The push to backup fails with: Counting objects: 5, done. Writing objects: 100% (3/3), 244 bytes, done. Total 3 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (3/3), done. error: Ref refs/remotes/origin/master is at ef3... but expected 262... remote: error: failed to lock refs/remotes/origin/master To ../backup.git 262cd57..ef307ff master -> master 262cd57..ef307ff origin/HEAD -> origin/HEAD ! [remote rejected] origin/master -> origin/master (failed to lock) error: failed to push some refs to '../backup.git' The reason is that refs/remotes/origin/HEAD is a symref to refs/remotes/origin/master, but it is not possible for the sending side to unambiguously know this. This commit fixes the issue by having receive-pack ignore any update to a symref whose target is being identically updated. If a symref and its target are being updated inconsistently, then the update for both fails with an error message ("refusing inconsistent update...") to help diagnose the situation. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
15 years ago
return;
dst_cmd->skip_update = 1;
rp_error("refusing inconsistent update between symref '%s' (%s..%s) and"
" its target '%s' (%s..%s)",
cmd->ref_name,
repo_find_unique_abbrev(the_repository, &cmd->old_oid, DEFAULT_ABBREV),
repo_find_unique_abbrev(the_repository, &cmd->new_oid, DEFAULT_ABBREV),
dst_cmd->ref_name,
repo_find_unique_abbrev(the_repository, &dst_cmd->old_oid, DEFAULT_ABBREV),
repo_find_unique_abbrev(the_repository, &dst_cmd->new_oid, DEFAULT_ABBREV));
receive-pack: detect aliased updates which can occur with symrefs When pushing to a remote repo the sending side filters out aliased updates (e.g., foo:baz bar:baz). However, it is not possible for the sender to know if two refs are aliased on the receiving side via symrefs. Here is one such scenario: $ git init origin $ (cd origin && touch file && git add file && git commit -a -m intial) $ git clone --bare origin origin.git $ rm -rf origin $ git clone origin.git client $ git clone --mirror client backup.git && $ (cd backup.git && git remote set-head origin --auto) $ (cd client && git remote add --mirror backup ../backup.git && echo change1 > file && git commit -a -m change1 && git push origin && git push backup ) The push to backup fails with: Counting objects: 5, done. Writing objects: 100% (3/3), 244 bytes, done. Total 3 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (3/3), done. error: Ref refs/remotes/origin/master is at ef3... but expected 262... remote: error: failed to lock refs/remotes/origin/master To ../backup.git 262cd57..ef307ff master -> master 262cd57..ef307ff origin/HEAD -> origin/HEAD ! [remote rejected] origin/master -> origin/master (failed to lock) error: failed to push some refs to '../backup.git' The reason is that refs/remotes/origin/HEAD is a symref to refs/remotes/origin/master, but it is not possible for the sending side to unambiguously know this. This commit fixes the issue by having receive-pack ignore any update to a symref whose target is being identically updated. If a symref and its target are being updated inconsistently, then the update for both fails with an error message ("refusing inconsistent update...") to help diagnose the situation. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
15 years ago
cmd->error_string = dst_cmd->error_string =
"inconsistent aliased update";
}
receive-pack: fix use-after-free bug The resolve_ref_unsafe() function can, and sometimes will in the case of this codepath, return the char * passed to it to the caller. In this case we construct a strbuf, free it, and then continue using the dst_name after that free(). The code being fixed dates back to da3efdb17b ("receive-pack: detect aliased updates which can occur with symrefs", 2010-04-19). When it was originally added it didn't have this bug, it was introduced when it was subsequently modified to use strbuf in 6b01ecfe22 ("ref namespaces: Support remote repositories via upload-pack and receive-pack", 2011-07-08). This is theoretically a security issue, the C standard makes no guarantees that a value you use after free() hasn't been poked at or changed by something else on the system, but in practice modern OSs will have mapped the relevant page to this process, so nothing else would have used it. We do no further allocations between the free() and use-after-free, so we ourselves didn't corrupt or change the value. Jeff investigated that and found: "It probably would be an issue if the allocation were larger. glibc at least will use mmap()/munmap() after some cutoff[1], in which case we'd get a segfault from hitting the unmapped page. But for small allocations, it just bumps brk() and the memory is still available for further allocations after free(). [...] If you had a sufficiently large refname you might be able to trigger the bug [...]. I tried to push such a ref. I had to manually make a packed-refs file with the long name to avoid filesystem limits (though probably you could have a long a/b/c/ name on ext4). But the result can't actually be pushed, because it all has to fit into a 64k pkt-line as part of the push protocol.". An a alternative and more succinct way of implementing this would have been to do the strbuf_release() at the end of check_aliased_update() and use "goto out" instead of the early "return" statements. Hopefully this approach of using a helper instead makes it easier to follow. 1. Jeff: "Weirdly, the mmap() cutoff on my glibc system is 135168 bytes. Which is...2^17 + 2^12? 33 pages? I'm sure there's a good reason for that, but I didn't dig into it." Reported-by: 王健强 <jianqiang.wang@securitygossip.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
6 years ago
static void check_aliased_update(struct command *cmd, struct string_list *list)
{
struct strbuf buf = STRBUF_INIT;
const char *dst_name;
int flag;
strbuf_addf(&buf, "%s%s", get_git_namespace(), cmd->ref_name);
dst_name = resolve_ref_unsafe(buf.buf, 0, NULL, &flag);
check_aliased_update_internal(cmd, list, dst_name, flag);
strbuf_release(&buf);
}
receive-pack: detect aliased updates which can occur with symrefs When pushing to a remote repo the sending side filters out aliased updates (e.g., foo:baz bar:baz). However, it is not possible for the sender to know if two refs are aliased on the receiving side via symrefs. Here is one such scenario: $ git init origin $ (cd origin && touch file && git add file && git commit -a -m intial) $ git clone --bare origin origin.git $ rm -rf origin $ git clone origin.git client $ git clone --mirror client backup.git && $ (cd backup.git && git remote set-head origin --auto) $ (cd client && git remote add --mirror backup ../backup.git && echo change1 > file && git commit -a -m change1 && git push origin && git push backup ) The push to backup fails with: Counting objects: 5, done. Writing objects: 100% (3/3), 244 bytes, done. Total 3 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (3/3), done. error: Ref refs/remotes/origin/master is at ef3... but expected 262... remote: error: failed to lock refs/remotes/origin/master To ../backup.git 262cd57..ef307ff master -> master 262cd57..ef307ff origin/HEAD -> origin/HEAD ! [remote rejected] origin/master -> origin/master (failed to lock) error: failed to push some refs to '../backup.git' The reason is that refs/remotes/origin/HEAD is a symref to refs/remotes/origin/master, but it is not possible for the sending side to unambiguously know this. This commit fixes the issue by having receive-pack ignore any update to a symref whose target is being identically updated. If a symref and its target are being updated inconsistently, then the update for both fails with an error message ("refusing inconsistent update...") to help diagnose the situation. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
15 years ago
static void check_aliased_updates(struct command *commands)
{
struct command *cmd;
struct string_list ref_list = STRING_LIST_INIT_NODUP;
receive-pack: detect aliased updates which can occur with symrefs When pushing to a remote repo the sending side filters out aliased updates (e.g., foo:baz bar:baz). However, it is not possible for the sender to know if two refs are aliased on the receiving side via symrefs. Here is one such scenario: $ git init origin $ (cd origin && touch file && git add file && git commit -a -m intial) $ git clone --bare origin origin.git $ rm -rf origin $ git clone origin.git client $ git clone --mirror client backup.git && $ (cd backup.git && git remote set-head origin --auto) $ (cd client && git remote add --mirror backup ../backup.git && echo change1 > file && git commit -a -m change1 && git push origin && git push backup ) The push to backup fails with: Counting objects: 5, done. Writing objects: 100% (3/3), 244 bytes, done. Total 3 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (3/3), done. error: Ref refs/remotes/origin/master is at ef3... but expected 262... remote: error: failed to lock refs/remotes/origin/master To ../backup.git 262cd57..ef307ff master -> master 262cd57..ef307ff origin/HEAD -> origin/HEAD ! [remote rejected] origin/master -> origin/master (failed to lock) error: failed to push some refs to '../backup.git' The reason is that refs/remotes/origin/HEAD is a symref to refs/remotes/origin/master, but it is not possible for the sending side to unambiguously know this. This commit fixes the issue by having receive-pack ignore any update to a symref whose target is being identically updated. If a symref and its target are being updated inconsistently, then the update for both fails with an error message ("refusing inconsistent update...") to help diagnose the situation. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
15 years ago
for (cmd = commands; cmd; cmd = cmd->next) {
struct string_list_item *item =
string_list_append(&ref_list, cmd->ref_name);
receive-pack: detect aliased updates which can occur with symrefs When pushing to a remote repo the sending side filters out aliased updates (e.g., foo:baz bar:baz). However, it is not possible for the sender to know if two refs are aliased on the receiving side via symrefs. Here is one such scenario: $ git init origin $ (cd origin && touch file && git add file && git commit -a -m intial) $ git clone --bare origin origin.git $ rm -rf origin $ git clone origin.git client $ git clone --mirror client backup.git && $ (cd backup.git && git remote set-head origin --auto) $ (cd client && git remote add --mirror backup ../backup.git && echo change1 > file && git commit -a -m change1 && git push origin && git push backup ) The push to backup fails with: Counting objects: 5, done. Writing objects: 100% (3/3), 244 bytes, done. Total 3 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (3/3), done. error: Ref refs/remotes/origin/master is at ef3... but expected 262... remote: error: failed to lock refs/remotes/origin/master To ../backup.git 262cd57..ef307ff master -> master 262cd57..ef307ff origin/HEAD -> origin/HEAD ! [remote rejected] origin/master -> origin/master (failed to lock) error: failed to push some refs to '../backup.git' The reason is that refs/remotes/origin/HEAD is a symref to refs/remotes/origin/master, but it is not possible for the sending side to unambiguously know this. This commit fixes the issue by having receive-pack ignore any update to a symref whose target is being identically updated. If a symref and its target are being updated inconsistently, then the update for both fails with an error message ("refusing inconsistent update...") to help diagnose the situation. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
15 years ago
item->util = (void *)cmd;
}
string_list_sort(&ref_list);
receive-pack: detect aliased updates which can occur with symrefs When pushing to a remote repo the sending side filters out aliased updates (e.g., foo:baz bar:baz). However, it is not possible for the sender to know if two refs are aliased on the receiving side via symrefs. Here is one such scenario: $ git init origin $ (cd origin && touch file && git add file && git commit -a -m intial) $ git clone --bare origin origin.git $ rm -rf origin $ git clone origin.git client $ git clone --mirror client backup.git && $ (cd backup.git && git remote set-head origin --auto) $ (cd client && git remote add --mirror backup ../backup.git && echo change1 > file && git commit -a -m change1 && git push origin && git push backup ) The push to backup fails with: Counting objects: 5, done. Writing objects: 100% (3/3), 244 bytes, done. Total 3 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (3/3), done. error: Ref refs/remotes/origin/master is at ef3... but expected 262... remote: error: failed to lock refs/remotes/origin/master To ../backup.git 262cd57..ef307ff master -> master 262cd57..ef307ff origin/HEAD -> origin/HEAD ! [remote rejected] origin/master -> origin/master (failed to lock) error: failed to push some refs to '../backup.git' The reason is that refs/remotes/origin/HEAD is a symref to refs/remotes/origin/master, but it is not possible for the sending side to unambiguously know this. This commit fixes the issue by having receive-pack ignore any update to a symref whose target is being identically updated. If a symref and its target are being updated inconsistently, then the update for both fails with an error message ("refusing inconsistent update...") to help diagnose the situation. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
15 years ago
for (cmd = commands; cmd; cmd = cmd->next) {
if (!cmd->error_string)
check_aliased_update(cmd, &ref_list);
}
receive-pack: detect aliased updates which can occur with symrefs When pushing to a remote repo the sending side filters out aliased updates (e.g., foo:baz bar:baz). However, it is not possible for the sender to know if two refs are aliased on the receiving side via symrefs. Here is one such scenario: $ git init origin $ (cd origin && touch file && git add file && git commit -a -m intial) $ git clone --bare origin origin.git $ rm -rf origin $ git clone origin.git client $ git clone --mirror client backup.git && $ (cd backup.git && git remote set-head origin --auto) $ (cd client && git remote add --mirror backup ../backup.git && echo change1 > file && git commit -a -m change1 && git push origin && git push backup ) The push to backup fails with: Counting objects: 5, done. Writing objects: 100% (3/3), 244 bytes, done. Total 3 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (3/3), done. error: Ref refs/remotes/origin/master is at ef3... but expected 262... remote: error: failed to lock refs/remotes/origin/master To ../backup.git 262cd57..ef307ff master -> master 262cd57..ef307ff origin/HEAD -> origin/HEAD ! [remote rejected] origin/master -> origin/master (failed to lock) error: failed to push some refs to '../backup.git' The reason is that refs/remotes/origin/HEAD is a symref to refs/remotes/origin/master, but it is not possible for the sending side to unambiguously know this. This commit fixes the issue by having receive-pack ignore any update to a symref whose target is being identically updated. If a symref and its target are being updated inconsistently, then the update for both fails with an error message ("refusing inconsistent update...") to help diagnose the situation. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
15 years ago
string_list_clear(&ref_list, 0);
}
connected: refactor iterator to return next object ID directly The object ID iterator used by the connectivity checks returns the next object ID via an out-parameter and then uses a return code to indicate whether an item was found. This is a bit roundabout: instead of a separate error code, we can just return the next object ID directly and use `NULL` pointers as indicator that the iterator got no items left. Furthermore, this avoids a copy of the object ID. Refactor the iterator and all its implementations to return object IDs directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs: Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch Time (mean ± σ): 30.110 s ± 0.148 s [User: 27.161 s, System: 5.075 s] Range (min … max): 29.934 s … 30.406 s 10 runs Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch Time (mean ± σ): 29.899 s ± 0.109 s [User: 26.916 s, System: 5.104 s] Range (min … max): 29.696 s … 29.996 s 10 runs Summary '328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran 1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch' While this 1% speedup could be labelled as statistically insignificant, the speedup is consistent on my machine. Furthermore, this is an end to end test, so it is expected that the improvement in the connectivity check itself is more significant. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
static const struct object_id *command_singleton_iterator(void *cb_data)
{
struct command **cmd_list = cb_data;
struct command *cmd = *cmd_list;
if (!cmd || is_null_oid(&cmd->new_oid))
connected: refactor iterator to return next object ID directly The object ID iterator used by the connectivity checks returns the next object ID via an out-parameter and then uses a return code to indicate whether an item was found. This is a bit roundabout: instead of a separate error code, we can just return the next object ID directly and use `NULL` pointers as indicator that the iterator got no items left. Furthermore, this avoids a copy of the object ID. Refactor the iterator and all its implementations to return object IDs directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs: Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch Time (mean ± σ): 30.110 s ± 0.148 s [User: 27.161 s, System: 5.075 s] Range (min … max): 29.934 s … 30.406 s 10 runs Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch Time (mean ± σ): 29.899 s ± 0.109 s [User: 26.916 s, System: 5.104 s] Range (min … max): 29.696 s … 29.996 s 10 runs Summary '328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran 1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch' While this 1% speedup could be labelled as statistically insignificant, the speedup is consistent on my machine. Furthermore, this is an end to end test, so it is expected that the improvement in the connectivity check itself is more significant. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
return NULL;
*cmd_list = NULL; /* this returns only one */
connected: refactor iterator to return next object ID directly The object ID iterator used by the connectivity checks returns the next object ID via an out-parameter and then uses a return code to indicate whether an item was found. This is a bit roundabout: instead of a separate error code, we can just return the next object ID directly and use `NULL` pointers as indicator that the iterator got no items left. Furthermore, this avoids a copy of the object ID. Refactor the iterator and all its implementations to return object IDs directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs: Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch Time (mean ± σ): 30.110 s ± 0.148 s [User: 27.161 s, System: 5.075 s] Range (min … max): 29.934 s … 30.406 s 10 runs Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch Time (mean ± σ): 29.899 s ± 0.109 s [User: 26.916 s, System: 5.104 s] Range (min … max): 29.696 s … 29.996 s 10 runs Summary '328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran 1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch' While this 1% speedup could be labelled as statistically insignificant, the speedup is consistent on my machine. Furthermore, this is an end to end test, so it is expected that the improvement in the connectivity check itself is more significant. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
return &cmd->new_oid;
}
static void set_connectivity_errors(struct command *commands,
struct shallow_info *si)
{
struct command *cmd;
for (cmd = commands; cmd; cmd = cmd->next) {
struct command *singleton = cmd;
struct check_connected_options opt = CHECK_CONNECTED_INIT;
if (shallow_update && si->shallow_ref[cmd->index])
/* to be checked in update_shallow_ref() */
continue;
opt.env = tmp_objdir_env(tmp_objdir);
check_everything_connected: use a struct with named options The number of variants of check_everything_connected has grown over the years, so that the "real" function takes several possibly-zero, possibly-NULL arguments. We hid the complexity behind some wrapper functions, but this doesn't scale well when we want to add new options. If we add more wrapper variants to handle the new options, then we can get a combinatorial explosion when those options might be used together (right now nobody wants to use both "shallow" and "transport" together, so we get by with just a few wrappers). If instead we add new parameters to each function, each of which can have a default value, then callers who want the defaults end up with confusing invocations like: check_everything_connected(fn, 0, data, -1, 0, NULL); where it is unclear which parameter is which (and every caller needs updated when we add new options). Instead, let's add a struct to hold all of the optional parameters. This is a little more verbose for the callers (who have to declare the struct and fill it in), but it makes their code much easier to follow, because every option is named as it is set (and unused options do not have to be mentioned at all). Note that we could also stick the iteration function and its callback data into the option struct, too. But since those are required for each call, by avoiding doing so, we can let very simple callers just pass "NULL" for the options and not worry about the struct at all. While we're touching each site, let's also rename the function to check_connected(). The existing name was quite long, and not all of the wrappers even used the full name. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
if (!check_connected(command_singleton_iterator, &singleton,
&opt))
continue;
cmd->error_string = "missing necessary objects";
}
}
struct iterate_data {
struct command *cmds;
struct shallow_info *si;
};
connected: refactor iterator to return next object ID directly The object ID iterator used by the connectivity checks returns the next object ID via an out-parameter and then uses a return code to indicate whether an item was found. This is a bit roundabout: instead of a separate error code, we can just return the next object ID directly and use `NULL` pointers as indicator that the iterator got no items left. Furthermore, this avoids a copy of the object ID. Refactor the iterator and all its implementations to return object IDs directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs: Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch Time (mean ± σ): 30.110 s ± 0.148 s [User: 27.161 s, System: 5.075 s] Range (min … max): 29.934 s … 30.406 s 10 runs Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch Time (mean ± σ): 29.899 s ± 0.109 s [User: 26.916 s, System: 5.104 s] Range (min … max): 29.696 s … 29.996 s 10 runs Summary '328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran 1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch' While this 1% speedup could be labelled as statistically insignificant, the speedup is consistent on my machine. Furthermore, this is an end to end test, so it is expected that the improvement in the connectivity check itself is more significant. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
static const struct object_id *iterate_receive_command_list(void *cb_data)
{
struct iterate_data *data = cb_data;
struct command **cmd_list = &data->cmds;
struct command *cmd = *cmd_list;
for (; cmd; cmd = cmd->next) {
if (shallow_update && data->si->shallow_ref[cmd->index])
/* to be checked in update_shallow_ref() */
continue;
if (!is_null_oid(&cmd->new_oid) && !cmd->skip_update) {
*cmd_list = cmd->next;
connected: refactor iterator to return next object ID directly The object ID iterator used by the connectivity checks returns the next object ID via an out-parameter and then uses a return code to indicate whether an item was found. This is a bit roundabout: instead of a separate error code, we can just return the next object ID directly and use `NULL` pointers as indicator that the iterator got no items left. Furthermore, this avoids a copy of the object ID. Refactor the iterator and all its implementations to return object IDs directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs: Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch Time (mean ± σ): 30.110 s ± 0.148 s [User: 27.161 s, System: 5.075 s] Range (min … max): 29.934 s … 30.406 s 10 runs Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch Time (mean ± σ): 29.899 s ± 0.109 s [User: 26.916 s, System: 5.104 s] Range (min … max): 29.696 s … 29.996 s 10 runs Summary '328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran 1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch' While this 1% speedup could be labelled as statistically insignificant, the speedup is consistent on my machine. Furthermore, this is an end to end test, so it is expected that the improvement in the connectivity check itself is more significant. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
return &cmd->new_oid;
}
}
connected: refactor iterator to return next object ID directly The object ID iterator used by the connectivity checks returns the next object ID via an out-parameter and then uses a return code to indicate whether an item was found. This is a bit roundabout: instead of a separate error code, we can just return the next object ID directly and use `NULL` pointers as indicator that the iterator got no items left. Furthermore, this avoids a copy of the object ID. Refactor the iterator and all its implementations to return object IDs directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs: Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch Time (mean ± σ): 30.110 s ± 0.148 s [User: 27.161 s, System: 5.075 s] Range (min … max): 29.934 s … 30.406 s 10 runs Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch Time (mean ± σ): 29.899 s ± 0.109 s [User: 26.916 s, System: 5.104 s] Range (min … max): 29.696 s … 29.996 s 10 runs Summary '328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran 1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch' While this 1% speedup could be labelled as statistically insignificant, the speedup is consistent on my machine. Furthermore, this is an end to end test, so it is expected that the improvement in the connectivity check itself is more significant. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
return NULL;
}
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
static void reject_updates_to_hidden(struct command *commands)
{
struct strbuf refname_full = STRBUF_INIT;
size_t prefix_len;
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
struct command *cmd;
strbuf_addstr(&refname_full, get_git_namespace());
prefix_len = refname_full.len;
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
for (cmd = commands; cmd; cmd = cmd->next) {
if (cmd->error_string)
continue;
strbuf_setlen(&refname_full, prefix_len);
strbuf_addstr(&refname_full, cmd->ref_name);
if (!ref_is_hidden(cmd->ref_name, refname_full.buf, &hidden_refs))
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
continue;
if (is_null_oid(&cmd->new_oid))
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
cmd->error_string = "deny deleting a hidden ref";
else
cmd->error_string = "deny updating a hidden ref";
}
strbuf_release(&refname_full);
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
}
static int should_process_cmd(struct command *cmd)
{
return !cmd->error_string && !cmd->skip_update;
}
static void BUG_if_skipped_connectivity_check(struct command *commands,
struct shallow_info *si)
{
struct command *cmd;
for (cmd = commands; cmd; cmd = cmd->next) {
if (should_process_cmd(cmd) && si->shallow_ref[cmd->index])
bug("connectivity check has not been run on ref %s",
cmd->ref_name);
}
BUG_if_bug("connectivity check skipped???");
}
static void execute_commands_non_atomic(struct command *commands,
struct shallow_info *si)
{
struct command *cmd;
struct strbuf err = STRBUF_INIT;
for (cmd = commands; cmd; cmd = cmd->next) {
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
if (!should_process_cmd(cmd) || cmd->run_proc_receive)
continue;
transaction = ref_transaction_begin(&err);
if (!transaction) {
rp_error("%s", err.buf);
strbuf_reset(&err);
cmd->error_string = "transaction failed to start";
continue;
}
cmd->error_string = update(cmd, si);
if (!cmd->error_string
&& ref_transaction_commit(transaction, &err)) {
rp_error("%s", err.buf);
strbuf_reset(&err);
cmd->error_string = "failed to update ref";
}
ref_transaction_free(transaction);
}
strbuf_release(&err);
}
static void execute_commands_atomic(struct command *commands,
struct shallow_info *si)
{
struct command *cmd;
struct strbuf err = STRBUF_INIT;
const char *reported_error = "atomic push failure";
transaction = ref_transaction_begin(&err);
if (!transaction) {
rp_error("%s", err.buf);
strbuf_reset(&err);
reported_error = "transaction failed to start";
goto failure;
}
for (cmd = commands; cmd; cmd = cmd->next) {
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
if (!should_process_cmd(cmd) || cmd->run_proc_receive)
continue;
cmd->error_string = update(cmd, si);
if (cmd->error_string)
goto failure;
}
if (ref_transaction_commit(transaction, &err)) {
rp_error("%s", err.buf);
reported_error = "atomic transaction failed";
goto failure;
}
goto cleanup;
failure:
for (cmd = commands; cmd; cmd = cmd->next)
if (!cmd->error_string)
cmd->error_string = reported_error;
cleanup:
ref_transaction_free(transaction);
strbuf_release(&err);
}
static void execute_commands(struct command *commands,
const char *unpacker_error,
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
struct shallow_info *si,
const struct string_list *push_options)
{
struct check_connected_options opt = CHECK_CONNECTED_INIT;
struct command *cmd;
struct iterate_data data;
struct async muxer;
int err_fd = 0;
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
int run_proc_receive = 0;
if (unpacker_error) {
for (cmd = commands; cmd; cmd = cmd->next)
cmd->error_string = "unpacker error";
return;
}
if (use_sideband) {
memset(&muxer, 0, sizeof(muxer));
muxer.proc = copy_to_sideband;
muxer.in = -1;
if (!start_async(&muxer))
err_fd = muxer.in;
/* ...else, continue without relaying sideband */
}
data.cmds = commands;
data.si = si;
opt.err_fd = err_fd;
opt.progress = err_fd && !quiet;
opt.env = tmp_objdir_env(tmp_objdir);
receive-pack: only use visible refs for connectivity check When serving a push, git-receive-pack(1) needs to verify that the packfile sent by the client contains all objects that are required by the updated references. This connectivity check works by marking all preexisting references as uninteresting and using the new reference tips as starting point for a graph walk. Marking all preexisting references as uninteresting can be a problem when it comes to performance. Git forges tend to do internal bookkeeping to keep alive sets of objects for internal use or make them easy to find via certain references. These references are typically hidden away from the user so that they are neither advertised nor writeable. At GitLab, we have one particular repository that contains a total of 7 million references, of which 6.8 million are indeed internal references. With the current connectivity check we are forced to load all these references in order to mark them as uninteresting, and this alone takes around 15 seconds to compute. We can optimize this by only taking into account the set of visible refs when marking objects as uninteresting. This means that we may now walk more objects until we hit any object that is marked as uninteresting. But it is rather unlikely that clients send objects that make large parts of objects reachable that have previously only ever been hidden, whereas the common case is to push incremental changes that build on top of the visible object graph. This provides a huge boost to performance in the mentioned repository, where the vast majority of its refs hidden. Pushing a new commit into this repo with `transfer.hideRefs` set up to hide 6.8 million of 7 refs as it is configured in Gitaly leads to a 4.5-fold speedup: Benchmark 1: main Time (mean ± σ): 30.977 s ± 0.157 s [User: 30.226 s, System: 1.083 s] Range (min … max): 30.796 s … 31.071 s 3 runs Benchmark 2: pks-connectivity-check-hide-refs Time (mean ± σ): 6.799 s ± 0.063 s [User: 6.803 s, System: 0.354 s] Range (min … max): 6.729 s … 6.850 s 3 runs Summary 'pks-connectivity-check-hide-refs' ran 4.56 ± 0.05 times faster than 'main' As we mostly go through the same codepaths even in the case where there are no hidden refs at all compared to the code before there is no change in performance when no refs are hidden: Benchmark 1: main Time (mean ± σ): 48.188 s ± 0.432 s [User: 49.326 s, System: 5.009 s] Range (min … max): 47.706 s … 48.539 s 3 runs Benchmark 2: pks-connectivity-check-hide-refs Time (mean ± σ): 48.027 s ± 0.500 s [User: 48.934 s, System: 5.025 s] Range (min … max): 47.504 s … 48.500 s 3 runs Summary 'pks-connectivity-check-hide-refs' ran 1.00 ± 0.01 times faster than 'main' Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2 years ago
opt.exclude_hidden_refs_section = "receive";
if (check_connected(iterate_receive_command_list, &data, &opt))
set_connectivity_errors(commands, si);
if (use_sideband)
finish_async(&muxer);
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
reject_updates_to_hidden(commands);
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
/*
* Try to find commands that have special prefix in their reference names,
* and mark them to run an external "proc-receive" hook later.
*/
if (proc_receive_ref) {
for (cmd = commands; cmd; cmd = cmd->next) {
if (!should_process_cmd(cmd))
continue;
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
if (proc_receive_ref_matches(cmd)) {
cmd->run_proc_receive = RUN_PROC_RECEIVE_SCHEDULED;
run_proc_receive = 1;
}
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
}
}
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
if (run_receive_hook(commands, "pre-receive", 0, push_options)) {
for (cmd = commands; cmd; cmd = cmd->next) {
if (!cmd->error_string)
cmd->error_string = "pre-receive hook declined";
}
Teach receive-pack to run pre-receive/post-receive hooks Bill Lear pointed out that it is easy to send out notifications of changes with the update hook, but successful execution of the update hook does not necessarily mean that the ref was actually updated. Lock contention on the ref or being unable to append to the reflog may prevent the ref from being changed. Sending out notifications prior to the ref actually changing is very misleading. To help this situation I am introducing two new hooks to the receive-pack flow: pre-receive and post-receive. These new hooks are invoked only once per receive-pack execution and are passed three arguments per ref (refname, old-sha1, new-sha1). The new post-receive hook is ideal for sending out notifications, as it has the complete list of all refnames that were successfully updated as well as the old and new SHA-1 values. This allows more interesting notifications to be sent. Multiple ref updates could be easily summarized into one email, for example. The new pre-receive hook is ideal for logging update attempts, as it is run only once for the entire receive-pack operation. It can also be used to verify multiple updates happen at once, e.g. an update to the `maint` head must also be accompained by a new annotated tag. Lots of documentation improvements for receive-pack are included in this change, as we want to make sure the new hooks are clearly explained. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
18 years ago
return;
}
/*
* If there is no command ready to run, should return directly to destroy
* temporary data in the quarantine area.
*/
for (cmd = commands; cmd && cmd->error_string; cmd = cmd->next)
; /* nothing */
if (!cmd)
return;
/*
* Now we'll start writing out refs, which means the objects need
* to be in their final positions so that other processes can see them.
*/
if (tmp_objdir_migrate(tmp_objdir) < 0) {
for (cmd = commands; cmd; cmd = cmd->next) {
if (!cmd->error_string)
cmd->error_string = "unable to migrate objects to permanent storage";
}
return;
}
tmp_objdir = NULL;
receive-pack: detect aliased updates which can occur with symrefs When pushing to a remote repo the sending side filters out aliased updates (e.g., foo:baz bar:baz). However, it is not possible for the sender to know if two refs are aliased on the receiving side via symrefs. Here is one such scenario: $ git init origin $ (cd origin && touch file && git add file && git commit -a -m intial) $ git clone --bare origin origin.git $ rm -rf origin $ git clone origin.git client $ git clone --mirror client backup.git && $ (cd backup.git && git remote set-head origin --auto) $ (cd client && git remote add --mirror backup ../backup.git && echo change1 > file && git commit -a -m change1 && git push origin && git push backup ) The push to backup fails with: Counting objects: 5, done. Writing objects: 100% (3/3), 244 bytes, done. Total 3 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (3/3), done. error: Ref refs/remotes/origin/master is at ef3... but expected 262... remote: error: failed to lock refs/remotes/origin/master To ../backup.git 262cd57..ef307ff master -> master 262cd57..ef307ff origin/HEAD -> origin/HEAD ! [remote rejected] origin/master -> origin/master (failed to lock) error: failed to push some refs to '../backup.git' The reason is that refs/remotes/origin/HEAD is a symref to refs/remotes/origin/master, but it is not possible for the sending side to unambiguously know this. This commit fixes the issue by having receive-pack ignore any update to a symref whose target is being identically updated. If a symref and its target are being updated inconsistently, then the update for both fails with an error message ("refusing inconsistent update...") to help diagnose the situation. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
15 years ago
check_aliased_updates(commands);
free(head_name_to_free);
head_name = head_name_to_free = resolve_refdup("HEAD", 0, NULL, NULL);
receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
if (run_proc_receive &&
run_proc_receive_hook(commands, push_options))
for (cmd = commands; cmd; cmd = cmd->next)
if (!cmd->error_string &&
!(cmd->run_proc_receive & RUN_PROC_RECEIVE_RETURNED) &&
(cmd->run_proc_receive || use_atomic))
cmd->error_string = "fail to run proc-receive hook";
if (use_atomic)
execute_commands_atomic(commands, si);
else
execute_commands_non_atomic(commands, si);
if (shallow_update)
BUG_if_skipped_connectivity_check(commands, si);
}
static struct command **queue_command(struct command **tail,
const char *line,
int linelen)
{
struct object_id old_oid, new_oid;
struct command *cmd;
const char *refname;
int reflen;
const char *p;
if (parse_oid_hex(line, &old_oid, &p) ||
*p++ != ' ' ||
parse_oid_hex(p, &new_oid, &p) ||
*p++ != ' ')
die("protocol error: expected old/new/ref, got '%s'", line);
refname = p;
reflen = linelen - (p - line);
FLEX_ALLOC_MEM(cmd, ref_name, refname, reflen);
oidcpy(&cmd->old_oid, &old_oid);
oidcpy(&cmd->new_oid, &new_oid);
*tail = cmd;
return &cmd->next;
}
static void free_commands(struct command *commands)
{
while (commands) {
struct command *next = commands->next;
free(commands);
commands = next;
}
}
static void queue_commands_from_cert(struct command **tail,
struct strbuf *push_cert)
{
const char *boc, *eoc;
if (*tail)
die("protocol error: got both push certificate and unsigned commands");
boc = strstr(push_cert->buf, "\n\n");
if (!boc)
die("malformed push certificate %.*s", 100, push_cert->buf);
else
boc += 2;
eoc = push_cert->buf + parse_signed_buffer(push_cert->buf, push_cert->len);
while (boc < eoc) {
const char *eol = memchr(boc, '\n', eoc - boc);
tail = queue_command(tail, boc, eol ? eol - boc : eoc - boc);
boc = eol ? eol + 1 : eoc;
}
}
static struct command *read_head_info(struct packet_reader *reader,
struct oid_array *shallow)
{
struct command *commands = NULL;
struct command **p = &commands;
for (;;) {
int linelen;
if (packet_reader_read(reader) != PACKET_READ_NORMAL)
break;
if (reader->pktlen > 8 && starts_with(reader->line, "shallow ")) {
struct object_id oid;
if (get_oid_hex(reader->line + 8, &oid))
die("protocol error: expected shallow sha, got '%s'",
reader->line + 8);
oid_array_append(shallow, &oid);
continue;
}
linelen = strlen(reader->line);
if (linelen < reader->pktlen) {
const char *feature_list = reader->line + linelen + 1;
const char *hash = NULL;
const char *client_sid;
int len = 0;
if (parse_feature_request(feature_list, "report-status"))
report_status = 1;
if (parse_feature_request(feature_list, "report-status-v2"))
report_status_v2 = 1;
if (parse_feature_request(feature_list, "side-band-64k"))
use_sideband = LARGE_PACKET_MAX;
if (parse_feature_request(feature_list, "quiet"))
quiet = 1;
if (advertise_atomic_push
&& parse_feature_request(feature_list, "atomic"))
use_atomic = 1;
if (advertise_push_options
&& parse_feature_request(feature_list, "push-options"))
use_push_options = 1;
hash = parse_feature_value(feature_list, "object-format", &len, NULL);
if (!hash) {
hash = hash_algos[GIT_HASH_SHA1].name;
len = strlen(hash);
}
if (xstrncmpz(the_hash_algo->name, hash, len))
die("error: unsupported object format '%s'", hash);
client_sid = parse_feature_value(feature_list, "session-id", &len, NULL);
if (client_sid) {
char *sid = xstrndup(client_sid, len);
trace2_data_string("transfer", NULL, "client-sid", client_sid);
free(sid);
}
}
if (!strcmp(reader->line, "push-cert")) {
push: the beginning of "git push --signed" While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
int true_flush = 0;
int saved_options = reader->options;
reader->options &= ~PACKET_READ_CHOMP_NEWLINE;
push: the beginning of "git push --signed" While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
for (;;) {
packet_reader_read(reader);
if (reader->status == PACKET_READ_FLUSH) {
push: the beginning of "git push --signed" While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
true_flush = 1;
break;
}
if (reader->status != PACKET_READ_NORMAL) {
die("protocol error: got an unexpected packet");
}
if (!strcmp(reader->line, "push-cert-end\n"))
push: the beginning of "git push --signed" While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
break; /* end of cert */
strbuf_addstr(&push_cert, reader->line);
push: the beginning of "git push --signed" While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
}
reader->options = saved_options;
push: the beginning of "git push --signed" While signed tags and commits assert that the objects thusly signed came from you, who signed these objects, there is not a good way to assert that you wanted to have a particular object at the tip of a particular branch. My signing v2.0.1 tag only means I want to call the version v2.0.1, and it does not mean I want to push it out to my 'master' branch---it is likely that I only want it in 'maint', so the signature on the object alone is insufficient. The only assurance to you that 'maint' points at what I wanted to place there comes from your trust on the hosting site and my authentication with it, which cannot easily audited later. Introduce a mechanism that allows you to sign a "push certificate" (for the lack of better name) every time you push, asserting that what object you are pushing to update which ref that used to point at what other object. Think of it as a cryptographic protection for ref updates, similar to signed tags/commits but working on an orthogonal axis. The basic flow based on this mechanism goes like this: 1. You push out your work with "git push --signed". 2. The sending side learns where the remote refs are as usual, together with what protocol extension the receiving end supports. If the receiving end does not advertise the protocol extension "push-cert", an attempt to "git push --signed" fails. Otherwise, a text file, that looks like the following, is prepared in core: certificate version 0.1 pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700 7339ca65... 21580ecb... refs/heads/master 3793ac56... 12850bec... refs/heads/next The file begins with a few header lines, which may grow as we gain more experience. The 'pusher' header records the name of the signer (the value of user.signingkey configuration variable, falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the certificate generation. After the header, a blank line follows, followed by a copy of the protocol message lines. Each line shows the old and the new object name at the tip of the ref this push tries to update, in the way identical to how the underlying "git push" protocol exchange tells the ref updates to the receiving end (by recording the "old" object name, the push certificate also protects against replaying). It is expected that new command packet types other than the old-new-refname kind will be included in push certificate in the same way as would appear in the plain vanilla command packets in unsigned pushes. The user then is asked to sign this push certificate using GPG, formatted in a way similar to how signed tag objects are signed, and the result is sent to the other side (i.e. receive-pack). In the protocol exchange, this step comes immediately before the sender tells what the result of the push should be, which in turn comes before it sends the pack data. 3. When the receiving end sees a push certificate, the certificate is written out as a blob. The pre-receive hook can learn about the certificate by checking GIT_PUSH_CERT environment variable, which, if present, tells the object name of this blob, and make the decision to allow or reject this push. Additionally, the post-receive hook can also look at the certificate, which may be a good place to log all the received certificates for later audits. Because a push certificate carry the same information as the usual command packets in the protocol exchange, we can omit the latter when a push certificate is in use and reduce the protocol overhead. This however is not included in this patch to make it easier to review (in other words, the series at this step should never be released without the remainder of the series, as it implements an interim protocol that will be incompatible with the final one). As such, the documentation update for the protocol is left out of this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
if (true_flush)
break;
continue;
}
p = queue_command(p, reader->line, linelen);
}
if (push_cert.len)
queue_commands_from_cert(p, &push_cert);
return commands;
}
static void read_push_options(struct packet_reader *reader,
struct string_list *options)
{
while (1) {
if (packet_reader_read(reader) != PACKET_READ_NORMAL)
break;
string_list_append(options, reader->line);
}
}
static const char *parse_pack_header(struct pack_header *hdr)
{
switch (read_pack_header(0, hdr)) {
case PH_ERROR_EOF:
return "eof before pack header was fully read";
case PH_ERROR_PACK_SIGNATURE:
return "protocol error (pack signature mismatch detected)";
case PH_ERROR_PROTOCOL:
return "protocol error (pack version unsupported)";
default:
return "unknown error in parse_pack_header";
case 0:
return NULL;
}
}
static const char *pack_lockfile;
static void push_header_arg(struct strvec *args, struct pack_header *hdr)
{
strvec_pushf(args, "--pack_header=%"PRIu32",%"PRIu32,
ntohl(hdr->hdr_version), ntohl(hdr->hdr_entries));
}
static const char *unpack(int err_fd, struct shallow_info *si)
{
struct pack_header hdr;
const char *hdr_err;
int status;
struct child_process child = CHILD_PROCESS_INIT;
int fsck_objects = (receive_fsck_objects >= 0
? receive_fsck_objects
: transfer_fsck_objects >= 0
? transfer_fsck_objects
: 0);
hdr_err = parse_pack_header(&hdr);
if (hdr_err) {
if (err_fd > 0)
close(err_fd);
return hdr_err;
}
if (si->nr_ours || si->nr_theirs) {
alt_shallow_file = setup_temporary_shallow(si->shallow);
strvec_push(&child.args, "--shallow-file");
strvec_push(&child.args, alt_shallow_file);
}
tmp_objdir = tmp_objdir_create("incoming");
if (!tmp_objdir) {
if (err_fd > 0)
close(err_fd);
return "unable to create temporary object directory";
}
strvec_pushv(&child.env, tmp_objdir_env(tmp_objdir));
/*
* Normally we just pass the tmp_objdir environment to the child
* processes that do the heavy lifting, but we may need to see these
* objects ourselves to set up shallow information.
*/
tmp_objdir_add_as_alternate(tmp_objdir);
if (ntohl(hdr.hdr_entries) < unpack_limit) {
strvec_push(&child.args, "unpack-objects");
push_header_arg(&child.args, &hdr);
if (quiet)
strvec_push(&child.args, "-q");
if (fsck_objects)
strvec_pushf(&child.args, "--strict%s",
fsck_msg_types.buf);
if (max_input_size)
strvec_pushf(&child.args, "--max-input-size=%"PRIuMAX,
(uintmax_t)max_input_size);
child.no_stdout = 1;
receive-pack: send pack-processing stderr over sideband Receive-pack invokes either unpack-objects or index-pack to handle the incoming pack. However, we do not redirect the stderr of the sub-processes at all, so it is never seen by the client. From the initial thread adding sideband support, which is here: http://thread.gmane.org/gmane.comp.version-control.git/139471 it is clear that some messages are specifically kept off the sideband (with the assumption that they are of interest only to an administrator, not the client). The stderr of the subprocesses is mentioned in the thread, but it's unclear if they are included in that group, or were simply forgotten. However, there are a few good reasons to show them to the client: 1. In many cases, they are directly about the incoming packfile (e.g., fsck warnings with --strict, corruption in the packfile, etc). Without these messages, the client just gets "unpacker error" with no extra useful diagnosis. 2. No matter what the cause, we are probably better off showing the errors to the client. If the client and the server admin are not the same entity, it is probably much easier for the client to cut-and-paste the errors they see than for the admin to try to dig them out of a log and correlate them with a particular session. 3. Users of the ssh transport typically already see these stderr messages, as the remote's stderr is copied literally by ssh. This brings other transports (http, and push-over-git if you are crazy enough to enable it) more in line with ssh. As a bonus for ssh users, because the messages are now fed through the sideband and printed by the local git, they will have "remote:" prepended and be properly interleaved with any local output to stderr. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
child.err = err_fd;
child.git_cmd = 1;
status = run_command(&child);
if (status)
return "unpack-objects abnormal exit";
} else {
char hostname[HOST_NAME_MAX + 1];
strvec_pushl(&child.args, "index-pack", "--stdin", NULL);
push_header_arg(&child.args, &hdr);
if (xgethostname(hostname, sizeof(hostname)))
xsnprintf(hostname, sizeof(hostname), "localhost");
strvec_pushf(&child.args,
"--keep=receive-pack %"PRIuMAX" on %s",
(uintmax_t)getpid(),
hostname);
if (!quiet && err_fd)
strvec_push(&child.args, "--show-resolving-progress");
receive-pack: send keepalives during quiet periods After a client has sent us the complete pack, we may spend some time processing the data and running hooks. If the client asked us to be quiet, receive-pack won't send any progress data during the index-pack or connectivity-check steps. And hooks may or may not produce their own progress output. In these cases, the network connection is totally silent from both ends. Git itself doesn't care about this (it will wait forever), but other parts of the system (e.g., firewalls, load-balancers, etc) might hang up the connection. So we'd like to send some sort of keepalive to let the network and the client side know that we're still alive and processing. We can use the same trick we did in 05e9515 (upload-pack: send keepalive packets during pack computation, 2013-09-08). Namely, we will send an empty sideband data packet every `N` seconds that we do not relay any stderr data over the sideband channel. As with 05e9515, this means that we won't bother sending keepalives when there's actual progress data, but will kick in when it has been disabled (or if there is a lull in the progress data). The concept is simple, but the details are subtle enough that they need discussing here. Before the client sends us the pack, we don't want to do any keepalives. We'll have sent our ref advertisement, and we're waiting for them to send us the pack (and tell us that they support sidebands at all). While we're receiving the pack from the client (or waiting for it to start), there's no need for keepalives; it's up to them to keep the connection active by sending data. Moreover, it would be wrong for us to do so. When we are the server in the smart-http protocol, we must treat our connection as half-duplex. So any keepalives we send while receiving the pack would potentially be buffered by the webserver. Not only does this make them useless (since they would not be delivered in a timely manner), but it could actually cause a deadlock if we fill up the buffer with keepalives. (It wouldn't be wrong to send keepalives in this phase for a full-duplex connection like ssh; it's simply pointless, as it is the client's responsibility to speak). As soon as we've gotten all of the pack data, then the client is waiting for us to speak, and we should start keepalives immediately. From here until the end of the connection, we send one any time we are not otherwise sending data. But there's a catch. Receive-pack doesn't know the moment we've gotten all the data. It passes the descriptor to index-pack, who reads all of the data, and then starts resolving the deltas. We have to communicate that back. To make this work, we instruct the sideband muxer to enable keepalives in three phases: 1. In the beginning, not at all. 2. While reading from index-pack, wait for a signal indicating end-of-input, and then start them. 3. Afterwards, always. The signal from index-pack in phase 2 has to come over the stderr channel which the muxer is reading. We can't use an extra pipe because the portable run-command interface only gives us stderr and stdout. Stdout is already used to pass the .keep filename back to receive-pack. We could also send a signal there, but then we would find out about it in the main thread. And the keepalive needs to be done by the async muxer thread (since it's the one writing sideband data back to the client). And we can't reliably signal the async thread from the main thread, because the async code sometimes uses threads and sometimes uses forked processes. Therefore the signal must come over the stderr channel, where it may be interspersed with other random human-readable messages from index-pack. This patch makes the signal a single NUL byte. This is easy to parse, should not appear in any normal stderr output, and we don't have to worry about any timing issues (like seeing half the signal bytes in one read(), and half in a subsequent one). This is a bit ugly, but it's simple to code and should work reliably. Another option would be to stop using an async thread for muxing entirely, and just poll() both stderr and stdout of index-pack from the main thread. This would work for index-pack (because we aren't doing anything useful in the main thread while it runs anyway). But it would make the connectivity check and the hook muxers much more complicated, as they need to simultaneously feed the sub-programs while reading their stderr. The index-pack phase is the only one that needs this signaling, so it could simply behave differently than the other two. That would mean having two separate implementations of copy_to_sideband (and the keepalive code), though. And it still doesn't get rid of the signaling; it just means we can write a nicer message like "END_OF_INPUT" or something on stdout, since we don't have to worry about separating it from the stderr cruft. One final note: this signaling trick is only done with index-pack, not with unpack-objects. There's no point in doing it for the latter, because by definition it only kicks in for a small number of objects, where keepalives are not as useful (and this conveniently lets us avoid duplicating the implementation). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
if (use_sideband)
strvec_push(&child.args, "--report-end-of-input");
if (fsck_objects)
strvec_pushf(&child.args, "--strict%s",
fsck_msg_types.buf);
if (!reject_thin)
strvec_push(&child.args, "--fix-thin");
if (max_input_size)
strvec_pushf(&child.args, "--max-input-size=%"PRIuMAX,
(uintmax_t)max_input_size);
child.out = -1;
child.err = err_fd;
child.git_cmd = 1;
status = start_command(&child);
if (status)
return "index-pack fork failed";
pack_lockfile = index_pack_lockfile(child.out, NULL);
close(child.out);
status = finish_command(&child);
if (status)
return "index-pack abnormal exit";
reprepare_packed_git(the_repository);
}
return NULL;
}
static const char *unpack_with_sideband(struct shallow_info *si)
receive-pack: send pack-processing stderr over sideband Receive-pack invokes either unpack-objects or index-pack to handle the incoming pack. However, we do not redirect the stderr of the sub-processes at all, so it is never seen by the client. From the initial thread adding sideband support, which is here: http://thread.gmane.org/gmane.comp.version-control.git/139471 it is clear that some messages are specifically kept off the sideband (with the assumption that they are of interest only to an administrator, not the client). The stderr of the subprocesses is mentioned in the thread, but it's unclear if they are included in that group, or were simply forgotten. However, there are a few good reasons to show them to the client: 1. In many cases, they are directly about the incoming packfile (e.g., fsck warnings with --strict, corruption in the packfile, etc). Without these messages, the client just gets "unpacker error" with no extra useful diagnosis. 2. No matter what the cause, we are probably better off showing the errors to the client. If the client and the server admin are not the same entity, it is probably much easier for the client to cut-and-paste the errors they see than for the admin to try to dig them out of a log and correlate them with a particular session. 3. Users of the ssh transport typically already see these stderr messages, as the remote's stderr is copied literally by ssh. This brings other transports (http, and push-over-git if you are crazy enough to enable it) more in line with ssh. As a bonus for ssh users, because the messages are now fed through the sideband and printed by the local git, they will have "remote:" prepended and be properly interleaved with any local output to stderr. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
{
struct async muxer;
const char *ret;
if (!use_sideband)
return unpack(0, si);
receive-pack: send pack-processing stderr over sideband Receive-pack invokes either unpack-objects or index-pack to handle the incoming pack. However, we do not redirect the stderr of the sub-processes at all, so it is never seen by the client. From the initial thread adding sideband support, which is here: http://thread.gmane.org/gmane.comp.version-control.git/139471 it is clear that some messages are specifically kept off the sideband (with the assumption that they are of interest only to an administrator, not the client). The stderr of the subprocesses is mentioned in the thread, but it's unclear if they are included in that group, or were simply forgotten. However, there are a few good reasons to show them to the client: 1. In many cases, they are directly about the incoming packfile (e.g., fsck warnings with --strict, corruption in the packfile, etc). Without these messages, the client just gets "unpacker error" with no extra useful diagnosis. 2. No matter what the cause, we are probably better off showing the errors to the client. If the client and the server admin are not the same entity, it is probably much easier for the client to cut-and-paste the errors they see than for the admin to try to dig them out of a log and correlate them with a particular session. 3. Users of the ssh transport typically already see these stderr messages, as the remote's stderr is copied literally by ssh. This brings other transports (http, and push-over-git if you are crazy enough to enable it) more in line with ssh. As a bonus for ssh users, because the messages are now fed through the sideband and printed by the local git, they will have "remote:" prepended and be properly interleaved with any local output to stderr. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
receive-pack: send keepalives during quiet periods After a client has sent us the complete pack, we may spend some time processing the data and running hooks. If the client asked us to be quiet, receive-pack won't send any progress data during the index-pack or connectivity-check steps. And hooks may or may not produce their own progress output. In these cases, the network connection is totally silent from both ends. Git itself doesn't care about this (it will wait forever), but other parts of the system (e.g., firewalls, load-balancers, etc) might hang up the connection. So we'd like to send some sort of keepalive to let the network and the client side know that we're still alive and processing. We can use the same trick we did in 05e9515 (upload-pack: send keepalive packets during pack computation, 2013-09-08). Namely, we will send an empty sideband data packet every `N` seconds that we do not relay any stderr data over the sideband channel. As with 05e9515, this means that we won't bother sending keepalives when there's actual progress data, but will kick in when it has been disabled (or if there is a lull in the progress data). The concept is simple, but the details are subtle enough that they need discussing here. Before the client sends us the pack, we don't want to do any keepalives. We'll have sent our ref advertisement, and we're waiting for them to send us the pack (and tell us that they support sidebands at all). While we're receiving the pack from the client (or waiting for it to start), there's no need for keepalives; it's up to them to keep the connection active by sending data. Moreover, it would be wrong for us to do so. When we are the server in the smart-http protocol, we must treat our connection as half-duplex. So any keepalives we send while receiving the pack would potentially be buffered by the webserver. Not only does this make them useless (since they would not be delivered in a timely manner), but it could actually cause a deadlock if we fill up the buffer with keepalives. (It wouldn't be wrong to send keepalives in this phase for a full-duplex connection like ssh; it's simply pointless, as it is the client's responsibility to speak). As soon as we've gotten all of the pack data, then the client is waiting for us to speak, and we should start keepalives immediately. From here until the end of the connection, we send one any time we are not otherwise sending data. But there's a catch. Receive-pack doesn't know the moment we've gotten all the data. It passes the descriptor to index-pack, who reads all of the data, and then starts resolving the deltas. We have to communicate that back. To make this work, we instruct the sideband muxer to enable keepalives in three phases: 1. In the beginning, not at all. 2. While reading from index-pack, wait for a signal indicating end-of-input, and then start them. 3. Afterwards, always. The signal from index-pack in phase 2 has to come over the stderr channel which the muxer is reading. We can't use an extra pipe because the portable run-command interface only gives us stderr and stdout. Stdout is already used to pass the .keep filename back to receive-pack. We could also send a signal there, but then we would find out about it in the main thread. And the keepalive needs to be done by the async muxer thread (since it's the one writing sideband data back to the client). And we can't reliably signal the async thread from the main thread, because the async code sometimes uses threads and sometimes uses forked processes. Therefore the signal must come over the stderr channel, where it may be interspersed with other random human-readable messages from index-pack. This patch makes the signal a single NUL byte. This is easy to parse, should not appear in any normal stderr output, and we don't have to worry about any timing issues (like seeing half the signal bytes in one read(), and half in a subsequent one). This is a bit ugly, but it's simple to code and should work reliably. Another option would be to stop using an async thread for muxing entirely, and just poll() both stderr and stdout of index-pack from the main thread. This would work for index-pack (because we aren't doing anything useful in the main thread while it runs anyway). But it would make the connectivity check and the hook muxers much more complicated, as they need to simultaneously feed the sub-programs while reading their stderr. The index-pack phase is the only one that needs this signaling, so it could simply behave differently than the other two. That would mean having two separate implementations of copy_to_sideband (and the keepalive code), though. And it still doesn't get rid of the signaling; it just means we can write a nicer message like "END_OF_INPUT" or something on stdout, since we don't have to worry about separating it from the stderr cruft. One final note: this signaling trick is only done with index-pack, not with unpack-objects. There's no point in doing it for the latter, because by definition it only kicks in for a small number of objects, where keepalives are not as useful (and this conveniently lets us avoid duplicating the implementation). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
use_keepalive = KEEPALIVE_AFTER_NUL;
receive-pack: send pack-processing stderr over sideband Receive-pack invokes either unpack-objects or index-pack to handle the incoming pack. However, we do not redirect the stderr of the sub-processes at all, so it is never seen by the client. From the initial thread adding sideband support, which is here: http://thread.gmane.org/gmane.comp.version-control.git/139471 it is clear that some messages are specifically kept off the sideband (with the assumption that they are of interest only to an administrator, not the client). The stderr of the subprocesses is mentioned in the thread, but it's unclear if they are included in that group, or were simply forgotten. However, there are a few good reasons to show them to the client: 1. In many cases, they are directly about the incoming packfile (e.g., fsck warnings with --strict, corruption in the packfile, etc). Without these messages, the client just gets "unpacker error" with no extra useful diagnosis. 2. No matter what the cause, we are probably better off showing the errors to the client. If the client and the server admin are not the same entity, it is probably much easier for the client to cut-and-paste the errors they see than for the admin to try to dig them out of a log and correlate them with a particular session. 3. Users of the ssh transport typically already see these stderr messages, as the remote's stderr is copied literally by ssh. This brings other transports (http, and push-over-git if you are crazy enough to enable it) more in line with ssh. As a bonus for ssh users, because the messages are now fed through the sideband and printed by the local git, they will have "remote:" prepended and be properly interleaved with any local output to stderr. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
memset(&muxer, 0, sizeof(muxer));
muxer.proc = copy_to_sideband;
muxer.in = -1;
if (start_async(&muxer))
return NULL;
ret = unpack(muxer.in, si);
receive-pack: send pack-processing stderr over sideband Receive-pack invokes either unpack-objects or index-pack to handle the incoming pack. However, we do not redirect the stderr of the sub-processes at all, so it is never seen by the client. From the initial thread adding sideband support, which is here: http://thread.gmane.org/gmane.comp.version-control.git/139471 it is clear that some messages are specifically kept off the sideband (with the assumption that they are of interest only to an administrator, not the client). The stderr of the subprocesses is mentioned in the thread, but it's unclear if they are included in that group, or were simply forgotten. However, there are a few good reasons to show them to the client: 1. In many cases, they are directly about the incoming packfile (e.g., fsck warnings with --strict, corruption in the packfile, etc). Without these messages, the client just gets "unpacker error" with no extra useful diagnosis. 2. No matter what the cause, we are probably better off showing the errors to the client. If the client and the server admin are not the same entity, it is probably much easier for the client to cut-and-paste the errors they see than for the admin to try to dig them out of a log and correlate them with a particular session. 3. Users of the ssh transport typically already see these stderr messages, as the remote's stderr is copied literally by ssh. This brings other transports (http, and push-over-git if you are crazy enough to enable it) more in line with ssh. As a bonus for ssh users, because the messages are now fed through the sideband and printed by the local git, they will have "remote:" prepended and be properly interleaved with any local output to stderr. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 years ago
finish_async(&muxer);
return ret;
}
static void prepare_shallow_update(struct shallow_info *si)
{
int i, j, k, bitmap_size = DIV_ROUND_UP(si->ref->nr, 32);
ALLOC_ARRAY(si->used_shallow, si->shallow->nr);
assign_shallow_commits_to_refs(si, si->used_shallow, NULL);
CALLOC_ARRAY(si->need_reachability_test, si->shallow->nr);
CALLOC_ARRAY(si->reachable, si->shallow->nr);
CALLOC_ARRAY(si->shallow_ref, si->ref->nr);
for (i = 0; i < si->nr_ours; i++)
si->need_reachability_test[si->ours[i]] = 1;
for (i = 0; i < si->shallow->nr; i++) {
if (!si->used_shallow[i])
continue;
for (j = 0; j < bitmap_size; j++) {
if (!si->used_shallow[i][j])
continue;
si->need_reachability_test[i]++;
for (k = 0; k < 32; k++)
if (si->used_shallow[i][j] & (1U << k))
si->shallow_ref[j * 32 + k]++;
}
/*
* true for those associated with some refs and belong
* in "ours" list aka "step 7 not done yet"
*/
si->need_reachability_test[i] =
si->need_reachability_test[i] > 1;
}
/*
* keep hooks happy by forcing a temporary shallow file via
* env variable because we can't add --shallow-file to every
* command. check_connected() will be done with
* true .git/shallow though.
*/
setenv(GIT_SHALLOW_FILE_ENVIRONMENT, alt_shallow_file, 1);
}
static void update_shallow_info(struct command *commands,
struct shallow_info *si,
struct oid_array *ref)
{
struct command *cmd;
int *ref_status;
remove_nonexistent_theirs_shallow(si);
if (!si->nr_ours && !si->nr_theirs) {
shallow_update = 0;
return;
}
for (cmd = commands; cmd; cmd = cmd->next) {
if (is_null_oid(&cmd->new_oid))
continue;
oid_array_append(ref, &cmd->new_oid);
cmd->index = ref->nr - 1;
}
si->ref = ref;
if (shallow_update) {
prepare_shallow_update(si);
return;
}
ALLOC_ARRAY(ref_status, ref->nr);
assign_shallow_commits_to_refs(si, NULL, ref_status);
for (cmd = commands; cmd; cmd = cmd->next) {
if (is_null_oid(&cmd->new_oid))
continue;
if (ref_status[cmd->index]) {
cmd->error_string = "shallow update not allowed";
cmd->skip_update = 1;
}
}
free(ref_status);
}
static void report(struct command *commands, const char *unpack_status)
{
struct command *cmd;
struct strbuf buf = STRBUF_INIT;
packet_buf_write(&buf, "unpack %s\n",
unpack_status ? unpack_status : "ok");
for (cmd = commands; cmd; cmd = cmd->next) {
if (!cmd->error_string)
packet_buf_write(&buf, "ok %s\n",
cmd->ref_name);
else
packet_buf_write(&buf, "ng %s %s\n",
cmd->ref_name, cmd->error_string);
}
packet_buf_flush(&buf);
if (use_sideband)
send_sideband(1, 1, buf.buf, buf.len, use_sideband);
else
write_or_die(1, buf.buf, buf.len);
strbuf_release(&buf);
}
static void report_v2(struct command *commands, const char *unpack_status)
{
struct command *cmd;
struct strbuf buf = STRBUF_INIT;
struct ref_push_report *report;
packet_buf_write(&buf, "unpack %s\n",
unpack_status ? unpack_status : "ok");
for (cmd = commands; cmd; cmd = cmd->next) {
int count = 0;
if (cmd->error_string) {
packet_buf_write(&buf, "ng %s %s\n",
cmd->ref_name,
cmd->error_string);
continue;
}
packet_buf_write(&buf, "ok %s\n",
cmd->ref_name);
for (report = cmd->report; report; report = report->next) {
if (count++ > 0)
packet_buf_write(&buf, "ok %s\n",
cmd->ref_name);
if (report->ref_name)
packet_buf_write(&buf, "option refname %s\n",
report->ref_name);
if (report->old_oid)
packet_buf_write(&buf, "option old-oid %s\n",
oid_to_hex(report->old_oid));
if (report->new_oid)
packet_buf_write(&buf, "option new-oid %s\n",
oid_to_hex(report->new_oid));
if (report->forced_update)
packet_buf_write(&buf, "option forced-update\n");
}
}
packet_buf_flush(&buf);
if (use_sideband)
send_sideband(1, 1, buf.buf, buf.len, use_sideband);
else
write_or_die(1, buf.buf, buf.len);
strbuf_release(&buf);
}
static int delete_only(struct command *commands)
{
struct command *cmd;
for (cmd = commands; cmd; cmd = cmd->next) {
if (!is_null_oid(&cmd->new_oid))
return 0;
}
return 1;
}
int cmd_receive_pack(int argc, const char **argv, const char *prefix)
{
int advertise_refs = 0;
struct command *commands;
struct oid_array shallow = OID_ARRAY_INIT;
struct oid_array ref = OID_ARRAY_INIT;
struct shallow_info si;
struct packet_reader reader;
struct option options[] = {
OPT__QUIET(&quiet, N_("quiet")),
OPT_HIDDEN_BOOL(0, "stateless-rpc", &stateless_rpc, NULL),
OPT_HIDDEN_BOOL(0, "http-backend-info-refs", &advertise_refs, NULL),
OPT_ALIAS(0, "advertise-refs", "http-backend-info-refs"),
OPT_HIDDEN_BOOL(0, "reject-thin-pack-for-testing", &reject_thin, NULL),
OPT_END()
};
packet_trace_identity("receive-pack");
argc = parse_options(argc, argv, prefix, options, receive_pack_usage, 0);
if (argc > 1)
usage_msg_opt(_("too many arguments"), receive_pack_usage, options);
if (argc == 0)
usage_msg_opt(_("you must specify a directory"), receive_pack_usage, options);
service_dir = argv[0];
setup_path();
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
if (!enter_repo(service_dir, 0))
die("'%s' does not appear to be a git repository", service_dir);
git_config(receive_pack_config, NULL);
if (cert_nonce_seed)
signed push: allow stale nonce in stateless mode When operating with the stateless RPC mode, we will receive a nonce issued by another instance of us that advertised our capability and refs some time ago. Update the logic to check received nonce to detect this case, compute how much time has passed since the nonce was issued and report the status with a new environment variable GIT_PUSH_CERT_NONCE_SLOP to the hooks. GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case. The hooks are free to decide how large a slop it is willing to accept. Strictly speaking, the "nonce" is not really a "nonce" anymore in the stateless RPC mode, as it will happily take any "nonce" issued by it (which is protected by HMAC and its secret key) as long as it is fresh enough. The degree of this security degradation, relative to the native protocol, is about the same as the "we make sure that the 'git push' decided to update our refs with new objects based on the freshest observation of our refs by making sure the values they claim the original value of the refs they ask us to update exactly match the current state" security is loosened to accomodate the stateless RPC mode in the existing code without this series, so there is no need for those who are already using smart HTTP to push to their repositories to be alarmed any more than they already are. In addition, the server operator can set receive.certnonceslop configuration variable to specify how stale a nonce can be (in seconds). When this variable is set, and if the nonce received in the certificate that passes the HMAC check was less than that many seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS (instead of "SLOP") and the received nonce value is given in GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded hook to check if the certificate we received is recent enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
push_cert_nonce = prepare_push_cert_nonce(service_dir, time(NULL));
if (0 <= transfer_unpack_limit)
unpack_limit = transfer_unpack_limit;
else if (0 <= receive_unpack_limit)
unpack_limit = receive_unpack_limit;
switch (determine_protocol_version_server()) {
case protocol_v2:
/*
* push support for protocol v2 has not been implemented yet,
* so ignore the request to use v2 and fallback to using v0.
*/
break;
case protocol_v1:
/*
* v1 is just the original protocol with a version string,
* so just fall through after writing the version string.
*/
if (advertise_refs || !stateless_rpc)
packet_write_fmt(1, "version 1\n");
/* fallthrough */
case protocol_v0:
break;
case protocol_unknown_version:
BUG("unknown protocol version");
}
if (advertise_refs || !stateless_rpc) {
write_head_info();
}
if (advertise_refs)
return 0;
packet_reader_init(&reader, 0, NULL, 0,
PACKET_READ_CHOMP_NEWLINE |
PACKET_READ_DIE_ON_ERR_PACKET);
if ((commands = read_head_info(&reader, &shallow))) {
const char *unpack_status = NULL;
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
struct string_list push_options = STRING_LIST_INIT_DUP;
if (use_push_options)
read_push_options(&reader, &push_options);
if (!check_cert_push_options(&push_options)) {
struct command *cmd;
for (cmd = commands; cmd; cmd = cmd->next)
cmd->error_string = "inconsistent push options";
}
prepare_shallow_info(&si, &shallow);
if (!si.nr_ours && !si.nr_theirs)
shallow_update = 0;
if (!delete_only(commands)) {
unpack_status = unpack_with_sideband(&si);
update_shallow_info(commands, &si, &ref);
}
receive-pack: send keepalives during quiet periods After a client has sent us the complete pack, we may spend some time processing the data and running hooks. If the client asked us to be quiet, receive-pack won't send any progress data during the index-pack or connectivity-check steps. And hooks may or may not produce their own progress output. In these cases, the network connection is totally silent from both ends. Git itself doesn't care about this (it will wait forever), but other parts of the system (e.g., firewalls, load-balancers, etc) might hang up the connection. So we'd like to send some sort of keepalive to let the network and the client side know that we're still alive and processing. We can use the same trick we did in 05e9515 (upload-pack: send keepalive packets during pack computation, 2013-09-08). Namely, we will send an empty sideband data packet every `N` seconds that we do not relay any stderr data over the sideband channel. As with 05e9515, this means that we won't bother sending keepalives when there's actual progress data, but will kick in when it has been disabled (or if there is a lull in the progress data). The concept is simple, but the details are subtle enough that they need discussing here. Before the client sends us the pack, we don't want to do any keepalives. We'll have sent our ref advertisement, and we're waiting for them to send us the pack (and tell us that they support sidebands at all). While we're receiving the pack from the client (or waiting for it to start), there's no need for keepalives; it's up to them to keep the connection active by sending data. Moreover, it would be wrong for us to do so. When we are the server in the smart-http protocol, we must treat our connection as half-duplex. So any keepalives we send while receiving the pack would potentially be buffered by the webserver. Not only does this make them useless (since they would not be delivered in a timely manner), but it could actually cause a deadlock if we fill up the buffer with keepalives. (It wouldn't be wrong to send keepalives in this phase for a full-duplex connection like ssh; it's simply pointless, as it is the client's responsibility to speak). As soon as we've gotten all of the pack data, then the client is waiting for us to speak, and we should start keepalives immediately. From here until the end of the connection, we send one any time we are not otherwise sending data. But there's a catch. Receive-pack doesn't know the moment we've gotten all the data. It passes the descriptor to index-pack, who reads all of the data, and then starts resolving the deltas. We have to communicate that back. To make this work, we instruct the sideband muxer to enable keepalives in three phases: 1. In the beginning, not at all. 2. While reading from index-pack, wait for a signal indicating end-of-input, and then start them. 3. Afterwards, always. The signal from index-pack in phase 2 has to come over the stderr channel which the muxer is reading. We can't use an extra pipe because the portable run-command interface only gives us stderr and stdout. Stdout is already used to pass the .keep filename back to receive-pack. We could also send a signal there, but then we would find out about it in the main thread. And the keepalive needs to be done by the async muxer thread (since it's the one writing sideband data back to the client). And we can't reliably signal the async thread from the main thread, because the async code sometimes uses threads and sometimes uses forked processes. Therefore the signal must come over the stderr channel, where it may be interspersed with other random human-readable messages from index-pack. This patch makes the signal a single NUL byte. This is easy to parse, should not appear in any normal stderr output, and we don't have to worry about any timing issues (like seeing half the signal bytes in one read(), and half in a subsequent one). This is a bit ugly, but it's simple to code and should work reliably. Another option would be to stop using an async thread for muxing entirely, and just poll() both stderr and stdout of index-pack from the main thread. This would work for index-pack (because we aren't doing anything useful in the main thread while it runs anyway). But it would make the connectivity check and the hook muxers much more complicated, as they need to simultaneously feed the sub-programs while reading their stderr. The index-pack phase is the only one that needs this signaling, so it could simply behave differently than the other two. That would mean having two separate implementations of copy_to_sideband (and the keepalive code), though. And it still doesn't get rid of the signaling; it just means we can write a nicer message like "END_OF_INPUT" or something on stdout, since we don't have to worry about separating it from the stderr cruft. One final note: this signaling trick is only done with index-pack, not with unpack-objects. There's no point in doing it for the latter, because by definition it only kicks in for a small number of objects, where keepalives are not as useful (and this conveniently lets us avoid duplicating the implementation). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
use_keepalive = KEEPALIVE_ALWAYS;
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
execute_commands(commands, unpack_status, &si,
&push_options);
if (pack_lockfile)
unlink_or_warn(pack_lockfile);
sigchain_push(SIGPIPE, SIG_IGN);
if (report_status_v2)
report_v2(commands, unpack_status);
else if (report_status)
report(commands, unpack_status);
sigchain_pop(SIGPIPE);
push options: {pre,post}-receive hook learns about push options The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 years ago
run_receive_hook(commands, "post-receive", 1,
&push_options);
run_update_post_hook(commands);
free_commands(commands);
string_list_clear(&push_options, 0);
if (auto_gc) {
struct child_process proc = CHILD_PROCESS_INIT;
proc.no_stdin = 1;
proc.stdout_to_stderr = 1;
proc.err = use_sideband ? -1 : 0;
proc.git_cmd = proc.close_object_store = 1;
strvec_pushl(&proc.args, "gc", "--auto", "--quiet",
NULL);
if (!start_command(&proc)) {
if (use_sideband)
copy_to_sideband(proc.err, -1, NULL);
finish_command(&proc);
}
}
if (auto_update_server_info)
update_server_info(0);
clear_shallow_info(&si);
}
if (use_sideband)
packet_flush(1);
oid_array_clear(&shallow);
oid_array_clear(&ref);
string_list_clear(&hidden_refs, 0);
free((void *)push_cert_nonce);
return 0;
}