The git-over-rsync protocol is inefficient and broken, and
has been for a long time. It transfers way more objects than
it needs (grabbing all of the remote's "objects/",
regardless of which objects we need). It does its own ad-hoc
parsing of loose and packed refs from the remote, but
doesn't properly override packed refs with loose ones,
leading to garbage results (e.g., expecting the other side
to have an object pointed to by a stale packed-refs entry,
or complaining that the other side has two copies of the
refs[1]).
This latter breakage means that nobody could have
successfully pulled from a moderately active repository
since cd547b4 (fetch/push: readd rsync support, 2007-10-01).
We never made an official deprecation notice in the release
notes for git's rsync protocol, but the tutorial has marked
it as such since 914328a (Update tutorial., 2005-08-30).
And on the mailing list as far back as Oct 2005, we can find
Junio mentioning it as having "been deprecated for quite
some time."[2,3,4]. So it was old news then; cogito had
deprecated the transport in July of 2005[5] (though it did
come back briefly when Linus broke git-http-pull!).
Of course some people professed their love of rsync through
2006, but Linus clarified in his usual gentle manner[6]:
> Thanks! This is why I still use rsync, even though
> everybody and their mother tells me "Linus says rsync is
> deprecated."
No. You're using rsync because you're actively doing
something _wrong_.
The deprecation sentiment was reinforced in 2008, with a
mention that cloning via rsync is broken (with no fix)[7].
Even the commit porting rsync over to C from shell (cd547b4)
lists it as deprecated! So between the 10 years of informal
warnings, and the fact that it has been severely broken
since 2007, it's probably safe to simply remove it without
further deprecation warnings.
[1] http://article.gmane.org/gmane.comp.version-control.git/285101
[2] http://article.gmane.org/gmane.comp.version-control.git/10093
[3] http://article.gmane.org/gmane.comp.version-control.git/17734
[4] http://article.gmane.org/gmane.comp.version-control.git/18911
[5] http://article.gmane.org/gmane.comp.version-control.git/5617
[6] http://article.gmane.org/gmane.comp.version-control.git/19354
[7] http://article.gmane.org/gmane.comp.version-control.git/103635
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While working in connect.c to perform non-blocking connections,
I noticed calling "git fetch -v" was not causing the progress
messages inside git_tcp_connect_sock to be emitted as I
expected.
Looking at history, it seems connect_setup has never been called
with the verbose parameter. Since transport already has a
"verbose" field, use that field instead of another parameter
in connect_setup.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use struct object_id in three fields in struct ref and convert all the
necessary places that use it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Since ec7dbd145 (receive-pack: allow hooks to ignore its
standard input stream) the pre-receive and post-receive
hooks ignore SIGPIPE. Do the same for the remaining hooks
pre-push and post-rewrite, which read from standard input.
The same arguments for ignoring SIGPIPE apply.
Include test by Jeff King which checks that SIGPIPE does not
cause pre-push hook failure. With the use of git update-ref
--stdin it is fast enough to be enabled by default.
Signed-off-by: Clemens Buchacher <clemens.buchacher@intel.com>
Signed-off-by: Jeff King <peff@peff.net>
We generate range strings like "1234abcd...5678efab" for use
in the the fetch and push status tables. We use fixed-size
buffers along with strcat to do so. These aren't buggy, as
our manual size computation is correct, but there's nothing
checking that this is so. Let's switch them to strbufs
instead, which are obviously correct, and make it easier to
audit the code base for problematic calls to strcat().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current callers only want to die when their transport is
prohibited. But future callers want to query the mechanism
without dying.
Let's break out a few query functions, and also save the
results in a static list so we don't have to re-parse for
each query.
Based-on-a-patch-by: Blake Burkhart <bburky@bburky.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are cloning an untrusted remote repository into a
sandbox, we may also want to fetch remote submodules in
order to get the complete view as intended by the other
side. However, that opens us up to attacks where a malicious
user gets us to clone something they would not otherwise
have access to (this is not necessarily a problem by itself,
but we may then act on the cloned contents in a way that
exposes them to the attacker).
Ideally such a setup would sandbox git entirely away from
high-value items, but this is not always practical or easy
to set up (e.g., OS network controls may block multiple
protocols, and we would want to enable some but not others).
We can help this case by providing a way to restrict
particular protocols. We use a whitelist in the environment.
This is more annoying to set up than a blacklist, but
defaults to safety if the set of protocols git supports
grows). If no whitelist is specified, we continue to default
to allowing all protocols (this is an "unsafe" default, but
since the minority of users will want this sandboxing
effect, it is the only sensible one).
A note on the tests: ideally these would all be in a single
test file, but the git-daemon and httpd test infrastructure
is an all-or-nothing proposition rather than a test-by-test
prerequisite. By putting them all together, we would be
unable to test the file-local code on machines without
apache.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the callsites covered by this change call write_file() or
write_file_gently() to create a one-liner file. Drop the caller
supplied LF and let these callees to append it as necessary.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All callers except three passed 1 for the "fatal" parameter to ask
this function to die upon error, but to a casual reader of the code,
it was not all obvious what that 1 meant. Instead, split the
function into two based on a common write_file_v() that takes the
flag, introduce write_file_gently() as a new way to attempt creating
a file without dying on error, and make three callers to call it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new flag --sign=true (or --sign=false), which means the same
thing as the original --signed (or --no-signed). Give it a third
value --sign=if-asked to tell push and send-pack to send a push
certificate if and only if the server advertised a push cert nonce.
If not, warn the user that their push may not be as secure as they
thought.
Signed-off-by: Dave Borowitz <dborowitz@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This field was set in transport_set_option, but never read in the push
code. The push code basically ignores the smart_options field
entirely, and derives its options from the flags arguments to the
push* callbacks. Note that in git_transport_push there are already
several args set from flags that have no corresponding field in
git_transport_options; after this change, push_cert is just like
those.
Signed-off-by: Dave Borowitz <dborowitz@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change typedef each_ref_fn to take a "const struct object_id *oid"
parameter instead of "const unsigned char *sha1".
To aid this transition, implement an adapter that can be used to wrap
old-style functions matching the old typedef, which is now called
"each_ref_sha1_fn"), and make such functions callable via the new
interface. This requires the old function and its cb_data to be
wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be
used as the cb_data for an adapter function, each_ref_fn_adapter().
This is an enormous diff, but most of it consists of simple,
mechanical changes to the sites that call any of the "for_each_ref"
family of functions. Subsequent to this change, the call sites can be
rewritten one by one to use the new interface.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetch_refs_via_pack calls fetch_pack(), we pass a
list of refs to fetch, and the function returns either a
copy of that list, with the fetched items filled in, or
NULL. We check the return value to see whether the fetch was
successful, but do not otherwise look at the copy, and
simply leak it at the end of the function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the standard function isxdigit() to make the intent clearer and
avoid using magic constants.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a command line argument to the git push command to request atomic
pushes.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds support to send-pack to negotiate and use atomic pushes
iff the server supports it. Atomic pushes are activated by a new command
line flag --atomic.
In order to do this we also need to change the semantics for send_pack()
slightly. The existing send_pack() function actually doesn't send all the
refs back to the server when multiple refs are involved, for example
when using --all. Several of the failure modes for pushes can already be
detected locally in the send_pack client based on the information from the
initial server side list of all the refs as generated by receive-pack.
Any such refs that we thus know would fail to push are thus pruned from
the list of refs we send to the server to update.
For atomic pushes, we have to deal thus with both failures that are detected
locally as well as failures that are reported back from the server. In order
to do so we treat all local failures as push failures too.
We introduce a new status code REF_STATUS_ATOMIC_PUSH_FAILED so we can
flag all refs that we would normally have tried to push to the server
but we did not due to local failures. This is to improve the error message
back to the end user to flag that "these refs failed to update since the
atomic push operation failed."
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes common problems in these code about error handling,
forgetting to close the file handle after fprintf() fails, or not
printing out the error string..
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref
resolves successfully for writing but not for reading). Change this to be
a flags field instead, and pass the new constant RESOLVE_REF_READING when
we want this behaviour.
While at it, swap two of the arguments in the function to put output
arguments at the end. As a nice side effect, this ensures that we can
catch callers that were unaware of the new API so they can be audited.
Give the wrapper functions resolve_refdup and read_ref_full the same
treatment for consistency.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Record the URL of the intended recipient for a push (after
anonymizing it if it has authentication material) on a new "pushee
URL" header. Because the networking configuration (SSH-tunnels,
proxies, etc.) on the pushing user's side varies, the receiving
repository may not know the single canonical URL all the pushing
users would refer it as (besides, many sites allow pushing over
ssh://host/path and https://host/path protocols to the same
repository but with different local part of the path). So this
value may not be reliably used for replay-attack prevention
purposes, but this will still serve as a human readable hint to
identify the repository the certificate refers to.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While signed tags and commits assert that the objects thusly signed
came from you, who signed these objects, there is not a good way to
assert that you wanted to have a particular object at the tip of a
particular branch. My signing v2.0.1 tag only means I want to call
the version v2.0.1, and it does not mean I want to push it out to my
'master' branch---it is likely that I only want it in 'maint', so
the signature on the object alone is insufficient.
The only assurance to you that 'maint' points at what I wanted to
place there comes from your trust on the hosting site and my
authentication with it, which cannot easily audited later.
Introduce a mechanism that allows you to sign a "push certificate"
(for the lack of better name) every time you push, asserting that
what object you are pushing to update which ref that used to point
at what other object. Think of it as a cryptographic protection for
ref updates, similar to signed tags/commits but working on an
orthogonal axis.
The basic flow based on this mechanism goes like this:
1. You push out your work with "git push --signed".
2. The sending side learns where the remote refs are as usual,
together with what protocol extension the receiving end
supports. If the receiving end does not advertise the protocol
extension "push-cert", an attempt to "git push --signed" fails.
Otherwise, a text file, that looks like the following, is
prepared in core:
certificate version 0.1
pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700
7339ca65... 21580ecb... refs/heads/master
3793ac56... 12850bec... refs/heads/next
The file begins with a few header lines, which may grow as we
gain more experience. The 'pusher' header records the name of
the signer (the value of user.signingkey configuration variable,
falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the
certificate generation. After the header, a blank line follows,
followed by a copy of the protocol message lines.
Each line shows the old and the new object name at the tip of
the ref this push tries to update, in the way identical to how
the underlying "git push" protocol exchange tells the ref
updates to the receiving end (by recording the "old" object
name, the push certificate also protects against replaying). It
is expected that new command packet types other than the
old-new-refname kind will be included in push certificate in the
same way as would appear in the plain vanilla command packets in
unsigned pushes.
The user then is asked to sign this push certificate using GPG,
formatted in a way similar to how signed tag objects are signed,
and the result is sent to the other side (i.e. receive-pack).
In the protocol exchange, this step comes immediately before the
sender tells what the result of the push should be, which in
turn comes before it sends the pack data.
3. When the receiving end sees a push certificate, the certificate
is written out as a blob. The pre-receive hook can learn about
the certificate by checking GIT_PUSH_CERT environment variable,
which, if present, tells the object name of this blob, and make
the decision to allow or reject this push. Additionally, the
post-receive hook can also look at the certificate, which may be
a good place to log all the received certificates for later
audits.
Because a push certificate carry the same information as the usual
command packets in the protocol exchange, we can omit the latter
when a push certificate is in use and reduce the protocol overhead.
This however is not included in this patch to make it easier to
review (in other words, the series at this step should never be
released without the remainder of the series, as it implements an
interim protocol that will be incompatible with the final one).
As such, the documentation update for the protocol is left out of
this step.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most struct child_process variables are cleared using memset first after
declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead. That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function starts by creating a copy of the static buffer
returned by real_path, but forgets to free it in the error
code paths. We can solve this by jumping to the cleanup code
that is already there.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using memset and then manually setting values of the string-list
members is not future proof as the internal representation of
string-list may change any time.
Use `string_list_init()` or STRING_LIST_INIT_* macros instead of
memset.
Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the existing argv_array member instead of building the arguments
list using a string array and a strbuf. This way we don't need magic
number constants and allocations are cleaned up for us automatically
by run_command().
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The skip_prefix() function returns a pointer to the content
past the prefix, or NULL if the prefix was not found. While
this is nice and simple, in practice it makes it hard to use
for two reasons:
1. When you want to conditionally skip or keep the string
as-is, you have to introduce a temporary variable.
For example:
tmp = skip_prefix(buf, "foo");
if (tmp)
buf = tmp;
2. It is verbose to check the outcome in a conditional, as
you need extra parentheses to silence compiler
warnings. For example:
if ((cp = skip_prefix(buf, "foo"))
/* do something with cp */
Both of these make it harder to use for long if-chains, and
we tend to use starts_with() instead. However, the first line
of "do something" is often to then skip forward in buf past
the prefix, either using a magic constant or with an extra
strlen(3) (which is generally computed at compile time, but
means we are repeating ourselves).
This patch refactors skip_prefix() to return a simple boolean,
and to provide the pointer value as an out-parameter. If the
prefix is not found, the out-parameter is untouched. This
lets you write:
if (skip_prefix(arg, "foo ", &arg))
do_foo(arg);
else if (skip_prefix(arg, "bar ", &arg))
do_bar(arg);
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing, we do not even look at our push refspecs until
after we have made contact with the remote receive-pack and
gotten its list of refs. This means that we may go to some
work, including asking the user to log in, before realizing
we have simple errors like "git push origin matser".
We cannot catch all refspec problems, since fully evaluating
the refspecs requires knowing what the remote side has. But
we can do a quick sanity check of the local side and catch a
few simple error cases.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The same steps are done as in when --update-shallow is not given. The
only difference is we now add all shallow commits in "ours" and
"theirs" to .git/shallow (aka "step 8").
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch just put together pieces from the 8 steps patch. We stop at
step 7 and reject refs that require new shallow commits.
Note that, by rejecting refs that require new shallow commits, we
leave dangling objects in the repo, which become "object islands" by
the next "git fetch" of the same source.
If the first fetch our "ours" set is zero and we do practically
nothing at step 7, "ours" is full at the next fetch and we may need to
walk through commits for reachability test. Room for improvement.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cloning from a shallow repository does not follow the "8 steps for new
.git/shallow" because if it does we need to get through step 6 for all
refs. That means commit walking down to the bottom.
Instead the rule to create .git/shallow is simpler and, more
importantly, cheap: if a shallow commit is found in the pack, it's
probably used (i.e. reachable from some refs), so we add it. Others
are dropped.
One may notice this method seems flawed by the word "probably". A
shallow commit may not be reachable from any refs at all if it's
attached to an object island (a group of objects that are not
reachable by any refs).
If that object island is not complete, a new fetch request may send
more objects to connect it to some ref. At that time, because we
incorrectly installed the shallow commit in this island, the user will
not see anything after that commit (fsck is still ok). This is not
desired.
Given that object islands are rare (C Git never sends such islands for
security reasons) and do not really harm the repository integrity, a
tradeoff is made to surprise the user occasionally but work faster
everyday.
A new option --strict could be added later that follows exactly the 8
steps. "git prune" can also learn to remove dangling objects _and_ the
shallow commits that are attached to them from .git/shallow.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No callers pass a non-empty pointer as shallow_points at this
stage. As a result, all clients still refuse to talk to shallow
repository on the other end.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The latter can do everything the former can and is used in many more
places.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the function is_local() in transport.c public, rename it into
url_is_local_not_ssh() and use it in both transport.c and connect.c
Use a protocol "local" for URLs for the local file system.
One note about using file:// under Windows:
The (absolute) path on Unix like system typically starts with "/".
When the host is empty, it can be omitted, so that a shell scriptlet
url=file://$pwd
will give a URL like "file:///home/user/repo".
Windows does not have the same concept of a root directory located in "/".
When parsing the URL allow "file://C:/user/repo"
(even if RFC1738 indicates that "file:///C:/user/repo" should be used).
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A Git-aware "connect" transport allows the "transport_take_over" to
redirect generic transport requests like fetch(), push_refs() and
get_refs_list() to the native Git transport handling methods. The
take-over process replaces transport->data with a fake data that
these method implementations understand.
While this hack works OK for a single request, it breaks when the
transport needs to make more than one requests. transport->data
that used to hold necessary information for the specific helper to
work correctly is destroyed during the take-over process.
One codepath that this matters is "git fetch" in auto-follow mode;
when it does not get all the tags that ought to point at the history
it got (which can be determined by looking at the peeled tags in the
initial advertisement) from the primary transfer, it internally
makes a second request to complete the fetch. Because "take-over"
hack has already destroyed the data necessary to talk to the
transport helper by the time this happens, the second request cannot
make a request to the helper to make another connection to fetch
these additional tags.
Mark such a transport as "cannot_reuse", and use a separate
transport to perform the backfill fetch in order to work around
this breakage.
Note that this problem does not manifest itself when running t5802,
because our upload-pack gives you all the necessary auto-followed
tags during the primary transfer. You would need to step through
"git fetch" in a debugger, stop immediately after the primary
transfer finishes and writes these auto-followed tags, remove the
tag references and repack/prune the repository to convince the
"find-non-local-tags" procedure that the primary transfer failed to
give us all the necessary tags, and then let it continue, in order
to trigger the bug in the secondary transfer this patch fixes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This teaches the deepest part of the callchain for "git push" (and
"git send-pack") to enforce "the old value of the ref must be this,
otherwise fail this push" (aka "compare-and-swap" / "--lockref").
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This plugs the push_cas_option data collected by the command line
option parser to the transport system with a new function
apply_push_cas(), which is called after match_push_refs() has
already been called.
At this point, we know which remote we are talking to, and what
remote refs we are going to update, so we can fill in the details
that may have been missing from the command line, such as
(1) what abbreviated refname the user gave us matches the actual
refname at the remote; and
(2) which remote-tracking branch in our local repository to read
the value of the object to expect at the remote.
to populate the old_sha1_expect[] field of each of the remote ref.
As stated in the documentation, the use of remote-tracking branch
as the default is a tentative one, and we may come up with a better
logic as we gain experience.
Still nobody uses this information, which is the topic of the next
patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The definition of "struct ref" in "cache.h", a header file so
central to the system, always confused me. This structure is not
about the local ref used by sha1-name API to name local objects.
It is what refspecs are expanded into, after finding out what refs
the other side has, to define what refs are updated after object
transfer succeeds to what values. It belongs to "remote.h" together
with "struct refspec".
While we are at it, also move the types and functions related to the
Git transport connection to a new header file connect.h
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation and some comments still refer to files in builtin/
as 'builtin-*.[cho]'. Update these to show the correct location.
Signed-off-by: Phil Hord <hordp@cisco.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Assisted-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to make sure the cloned repository is good, we run "rev-list
--objects --not --all $new_refs" on the repository. This is expensive
on large repositories. This patch attempts to mitigate the impact in
this special case.
In the "good" clone case, we only have one pack. If all of the
following are met, we can be sure that all objects reachable from the
new refs exist, which is the intention of running "rev-list ...":
- all refs point to an object in the pack
- there are no dangling pointers in any object in the pack
- no objects in the pack point to objects outside the pack
The second and third checks can be done with the help of index-pack as
a slight variation of --strict check (which introduces a new condition
for the shortcut: pack transfer must be used and the number of objects
large enough to call index-pack). The first is checked in
check_everything_connected after we get an "ok" from index-pack.
"index-pack + new checks" is still faster than the current "index-pack
+ rev-list", which is the whole point of this patch. If any of the
conditions fail, we fall back to the good old but expensive "rev-list
..". In that case it's even more expensive because we have to pay for
the new checks in index-pack. But that should only happen when the
other side is either buggy or malicious.
Cloning linux-2.6 over file://
before after
real 3m25.693s 2m53.050s
user 5m2.037s 4m42.396s
sys 0m13.750s 0m16.574s
A more realistic test with ssh:// over wireless
before after
real 11m26.629s 10m4.213s
user 5m43.196s 5m19.444s
sys 0m35.812s 0m37.630s
This shortcut is not applied to shallow clones, partly because shallow
clones should have no more objects than a usual fetch and the cost of
rev-list is acceptable, partly to avoid dealing with corner cases when
grafting is involved.
This shortcut does not apply to unpack-objects code path either
because the number of objects must be small in order to trigger that
code path.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To a human reader, it is quite obvious that cmp is assigned before
it is used, but gcc 4.6.3 that ships with Ubuntu 12.04 is among
those that do not get this right.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to 47ec794, this initialization is meant to
squelch an erroneous uninitialized variable warning from gcc
4.0.1. That version is quite old at this point, and gcc 4.1
and up handle it fine, with one exception. There seems to be
a regression in gcc 4.6.3, which produces the warning;
however, gcc versions 4.4.7 and 4.7.2 do not.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new option "--follow-tags" tells "git push" to push annotated
tags that are missing from the other side and that can be reached by
the history that is otherwise pushed out.
For example, if you are using the "simple", "current", or "upstream"
push, you would ordinarily push the history leading to the commit at
your current HEAD and nothing else. With this option, you would
also push all annotated tags that can be reached from that commit to
the other side.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we can read packet data from memory as easily as a
descriptor, get_remote_heads can take either one as a
source. This will allow further refactoring in remote-curl.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though "git fetch" has full infrastructure to parse refspecs to
be fetched and match them against the list of refs to come up with
the final list of refs to be fetched, the list of refs that are
requested to be fetched were internally converted to a plain list of
strings at the transport layer and then passed to the underlying
fetch-pack driver.
Stop this conversion and instead pass around an array of refs.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After a push of a branch other than the current branch fails in
a no-ff error and if you are still on an unborn branch, the code
recently added to report the failure dereferenced a null pointer
while checking the name of the current branch.
Signed-off-by: Fraser Tweedale <frase@frase.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>