Convert all instances of get_object_hash to use an appropriate reference
to the hash member of the oid member of struct object. This provides no
functional change, as it is essentially a macro substitution.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
struct object is one of the major data structures dealing with object
IDs. Convert it to use struct object_id instead of an unsigned char
array. Convert get_object_hash to refer to the new member as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Convert most instances where the sha1 member of struct object is
dereferenced to use get_object_hash. Most instances that are passed to
functions that have versions taking struct object_id, such as
get_sha1_hex/get_oid_hex, or instances that can be trivially converted
to use struct object_id instead, are not converted.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
In addition to matching stripped refs, one can now add hideRefs
patterns that the full (unstripped) ref is matched against. To
distinguish between stripped and full matches, those new patterns
must be prefixed with a circumflex (^).
This commit also removes support for the undocumented and unintended
hideRefs settings ".have" (suppressing all "have" lines) and
"capabilities^{}" (suppressing the capabilities line).
Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make hideRefs handling in upload-pack consistent with the behavior
described in the documentation by stripping refs before comparing them
with prefixes in hideRefs.
Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of open-coding the function pop_commit() just call it. This
makes the intent clearer and reduces code size.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change typedef each_ref_fn to take a "const struct object_id *oid"
parameter instead of "const unsigned char *sha1".
To aid this transition, implement an adapter that can be used to wrap
old-style functions matching the old typedef, which is now called
"each_ref_sha1_fn"), and make such functions callable via the new
interface. This requires the old function and its cb_data to be
wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be
used as the cb_data for an adapter function, each_ref_fn_adapter().
This is an enormous diff, but most of it consists of simple,
mechanical changes to the sites that call any of the "for_each_ref"
family of functions. Subsequent to this change, the call sites can be
rewritten one by one to use the new interface.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With uploadpack.allowReachableSHA1InWant configuration option set on the
server side, "git fetch" can make a request with a "want" line that names
an object that has not been advertised (likely to have been obtained out
of band or from a submodule pointer). Only objects reachable from the
branch tips, i.e. the union of advertised branches and branches hidden by
transfer.hideRefs, will be processed. Note that there is an associated
cost of having to walk back the history to check the reachability.
This feature can be used when obtaining the content of a certain commit,
for which the sha1 is known, without the need of cloning the whole
repository, especially if a shallow fetch is used. Useful cases are e.g.
repositories containing large files in the history, fetching only the
needed data for a submodule checkout, when sharing a sha1 without telling
which exact branch it belongs to and in Gerrit, if you think in terms of
commits instead of change numbers. (The Gerrit case has already been
solved through allowTipSHA1InWant as every Gerrit change has a ref.)
Signed-off-by: Fredrik Medley <fredrik.medley@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To allow future extensions, e.g. allowing non-tip sha1, replace the
boolean allow_tip_sha1_in_want variable with the flag-style
allow_request_with_bare_object_name variable.
Signed-off-by: Fredrik Medley <fredrik.medley@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct commit_graft and necessary local parts of commit.c.
Also, convert several constants based on the hex length of an SHA-1 to
use GIT_SHA1_HEXSZ, and move several magic constants into variables for
readability.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We check whether the return value of lookup_unknown_object
is NULL, but some code paths dereference it before our
check. This turns out not to be capable of causing a
segfault, though. The lookup_unknown_object function will
never return NULL, since the whole point is to allocate an
object struct if it does not find an existing one. So the
code here is not wrong, it is just confusing. Let's just
drop the NULL check.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When upload-pack advertises the refs (either for a normal,
non-stateless request, or for the initial contact in a
stateless one), we call for_each_ref with the send_ref
function as its callback. send_ref, in turn, calls
mark_our_ref, which checks whether the ref is hidden, and
sets OUR_REF or HIDDEN_REF on the object as appropriate. If
it is hidden, mark_our_ref also returns "1" to signal
send_ref that the ref should not be advertised.
If we are not advertising refs, (i.e., the follow-up
invocation by an http client to send its "want" lines), we
use mark_our_ref directly as a callback to for_each_ref. Its
marking does the right thing, but when it then returns "1"
to for_each_ref, the latter interprets this as an error and
stops iterating. As a result, we skip marking all of the
refs that come lexicographically after it. Any "want" lines
from the client asking for those objects will fail, as they
were not properly marked with OUR_REF.
To solve this, we introduce a wrapper callback around
mark_our_ref which always returns 0 (even if the ref is
hidden, we want to keep iterating). We also tweak the
signature of mark_our_ref to exclude unnecessary parameters
that were present only to conform to the callback interface.
This should make it less likely for somebody to accidentally
use it as a callback in the future.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching into or pushing from a shallow repository, we want to
aggressively mark edges as uninteresting, since this decreases the pack
size. However, aggressively marking edges can negatively affect
performance on large non-shallow repositories with lots of refs.
Teach pack-objects a --shallow option to indicate that we're pushing
from or fetching into a shallow repository. Use
--objects-edge-aggressive only for shallow repositories and otherwise
use --objects-edge, which performs better in the general case. Update
the callers to pass the --shallow option when they are dealing with a
shallow repository.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref
resolves successfully for writing but not for reading). Change this to be
a flags field instead, and pass the new constant RESOLVE_REF_READING when
we want this behaviour.
While at it, swap two of the arguments in the function to put output
arguments at the end. As a nice side effect, this ensures that we can
catch callers that were unaware of the new API so they can be audited.
Give the wrapper functions resolve_refdup and read_ref_full the same
treatment for consistency.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Keep poll's timeout at -1 when uploadpack.keepalive = 0, instead of
setting it to -1000, since some pedantic old systems (eg HP-UX) and
the gnulib compat/poll will treat only -1 as the valid value for
an infinite timeout.
Signed-off-by: Edward Thomson <ethomson@microsoft.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most struct child_process variables are cleared using memset first after
declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead. That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the field "flags" is mainly used by the revision walker, it is
also used in many other places. Centralize the whole flag allocation to
one place for a better overview (and easier to move flags if we have
too).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before cdab485 (upload-pack: delegate rev walking in shallow fetch to
pack-objects - 2013-08-16) upload-pack does not write to the source
repository. cdab485 starts to write $GIT_DIR/shallow_XXXXXX if it's a
shallow fetch, so the source repo must be writable.
git:// servers do not need write access to repos and usually don't
have it, which means cdab485 breaks shallow clone over git://
Instead of using a temporary file as the media for shallow points, we
can send them over stdin to pack-objects as well. Prepend shallow
SHA-1 with --shallow so pack-objects knows what is what.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We sometimes write tempfiles of the form "shallow_XXXXXX"
during fetch/push operations with shallow repositories.
Under normal circumstances, we clean up the result when we
are done. However, we do no take steps to clean up after
ourselves when we exit due to die() or signal death.
This patch teaches the tempfile creation code to register
handlers to clean up after ourselves. To handle this, we
change the ownership semantics of the filename returned by
setup_temporary_shallow. It now keeps a copy of the filename
itself, and returns only a const pointer to it.
We can also do away with explicit tempfile removal in the
callers. They all exit not long after finishing with the
file, so they can rely on the auto-cleanup, simplifying the
code.
Note that we keep things simple and maintain only a single
filename to be cleaned. This is sufficient for the current
caller, but we future-proof it with a die("BUG").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The semantics of this flag was changed in commit
e1111cef23 inline lookup_replace_object() calls
but wasn't renamed at the time to minimize code churn. Rename it now,
and add a comment explaining its use.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "fetch --depth=N" where N exceeds the longest chain of history in
the source repo, usually we just send an "unshallow" line to the
client so full history is obtained.
When the source repo is shallow we need to make sure to "unshallow"
the current shallow point _and_ "shallow" again when the commit
reaches its shallow bottom in the source repo.
This should fix both cases: large <N> and --unshallow.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If either receive-pack or upload-pack is called on a shallow
repository, shallow commits (*) will be sent after the ref
advertisement (but before the packet flush), so that the receiver has
the full "shape" of the sender's commit graph. This will be needed for
the receiver to update its .git/shallow if necessary.
This breaks the protocol for all clients trying to push to a shallow
repo, or fetch from one. Which is basically the same end result as
today's "is_repository_shallow() && die()" in receive-pack and
upload-pack. New clients will be made aware of shallow upstream and
can make use of this information.
The sender must send all shallow commits that are sent in the
following pack. It may send more shallow commits than necessary.
upload-pack for example may choose to advertise no shallow commits if
it knows in advance that the pack it's going to send contains no
shallow commits. But upload-pack is the server, so we choose the
cheaper way, send full .git/shallow and let the client deal with it.
Smart HTTP is not affected by this patch. Shallow support on
smart-http comes later separately.
(*) A shallow commit is a commit that terminates the revision
walker. It is usually put in .git/shallow in order to keep the
revision walker from going out of bound because there is no
guarantee that objects behind this commit is available.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 5e7dcad771cb873e278a0571b46910d7c32e2f6c; there
may be unbounded number of symbolic refs in the repository, but the
capability header line in the on-wire protocol has a rather low
length limit.
Many calls to parse_commit detect errors and die. In some
cases, the custom error messages are more useful than what
parse_commit_or_die could produce, because they give some
context, like which ref the commit came from. Some, however,
just say "invalid commit". Let's convert the latter to use
parse_commit_or_die; its message is slightly more informative,
and it makes the error more consistent throughout git.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the same mechanism as used to tell where "HEAD" points at to
the other end, we can tell the target of other symbolic refs as
well.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One long-standing flaw in the pack transfer protocol was that there
was no way to tell the other end which branch "HEAD" points at.
With a capability "symref=HEAD:refs/heads/master", let the sender to
tell the receiver what symbolic ref points at what ref.
This capability can be repeated more than once to represent symbolic
refs other than HEAD, such as "refs/remotes/origin/HEAD").
Add an infrastructure to collect symbolic refs, format them as extra
capabilities and put it on the wire. For now, just send information
on the "HEAD" and nothing else.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The callee does not use cb_data, and the caller is an intermediate
function in a callchain that later wants to use the cb_data for its
own use. Clarify the code by breaking the dataflow explicitly by
not passing cb_data down to mark_our_ref().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no reason not to turn on keepalives by default.
They take very little bandwidth, and significantly less than
the progress reporting they are replacing. And in the case
that progress reporting is on, we should never need to send
a keepalive anyway, as we will constantly be showing
progress and resetting the keepalive timer.
We do not necessarily know what the client's idea of a
reasonable timeout is, so let's keep this on the low side of
5 seconds. That is high enough that we will always prefer
our normal 1-second progress reports to sending a keepalive
packet, but low enough that no sane client should consider
the connection hung.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When upload-pack has started pack-objects, there may be a quiet
period while pack-objects prepares the pack (i.e., counting objects
and delta compression). Normally we would see (and send to the
client) progress information, but if "--quiet" is in effect,
pack-objects will produce nothing at all until the pack data is
ready. On a large repository, this can take tens of seconds (or even
minutes if the system is loaded or the repository is badly packed).
Clients or intermediate proxies can sometimes give up in this
situation, assuming that the server or connection has hung.
This patch introduces a "keepalive" option; if upload-pack sees no
data from pack-objects for a certain number of seconds, it will send
an empty sideband data packet to let the other side know that we are
still working on it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
upload-pack has a special revision walking code for shallow
recipients. It works almost like the similar code in pack-objects
except:
1. in upload-pack, graft points could be added for deepening;
2. also when the repository is deepened, the shallow point will be
moved further away from the tip, but the old shallow point will be
marked as edge to produce more efficient packs. See 6523078 (make
shallow repository deepening more network efficient - 2009-09-03).
Pass the file to pack-objects via --shallow-file. This will override
$GIT_DIR/shallow and give pack-objects the exact repository shape
that upload-pack has.
mark edge commits by revision command arguments. Even if old shallow
points are passed as "--not" revisions as in this patch, they will not
be picked up by mark_edges_uninteresting() because this function looks
up to parents for edges, while in this case the edge is the children,
in the opposite direction. This will be fixed in an later patch when
all given uninteresting commits are marked as edges.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The definition of "struct ref" in "cache.h", a header file so
central to the system, always confused me. This structure is not
about the local ref used by sha1-name API to name local objects.
It is what refspecs are expanded into, after finding out what refs
the other side has, to define what refs are updated after object
transfer succeeds to what values. It belongs to "remote.h" together
with "struct refspec".
While we are at it, also move the types and functions related to the
Git transport connection to a new header file connect.h
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the client sends a 'shallow' line for an object that the server does
not have, the server currently dies with the error: "did not find object
for shallow <obj-id>". The client may have truncated the history at
the commit by fetching shallowly from a different server, or the commit
may have been garbage collected by the server. In either case, this
unknown commit is not relevant for calculating the pack that is to be
sent and can be safely ignored, and it is not used when recomputing where
the updated history of the client is cauterised.
The documentation in technical/pack-protocol.txt has been updated to
remove the restriction that "Clients MUST NOT mention an obj-id which it
does not know exists on the server". This requirement is not realistic
because clients cannot know whether an object has been garbage collected
by the server.
Signed-off-by: Michael Heemskerk <mheemskerk@atlassian.com>
Reviewed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is a long-time security feature that upload-pack will not
serve any "want" lines that do not correspond to the tip of
one of our refs. Traditionally, this was enforced by
checking the objects in the in-memory hash; they should have
been loaded and received the OUR_REF flag during the
advertisement.
The stateless-rpc mode, however, has a race condition here:
one process advertises, and another receives the want lines,
so the refs may have changed in the interim. To address
this, commit 051e400 added a new verification mode; if the
object is not OUR_REF, we set a "has_non_tip" flag, and then
later verify that the requested objects are reachable from
our current tips.
However, we still die immediately when the object is not in
our in-memory hash, and at this point we should only have
loaded our tip objects. So the check_non_tip code path does
not ever actually trigger, as any non-tip objects would
have already caused us to die.
We can fix that by using parse_object instead of
lookup_object, which will load the object from disk if it
has not already been loaded.
We still need to check that parse_object does not return
NULL, though, as it is possible we do not have the object
at all. A more appropriate error message would be "no such
object" rather than "not our ref"; however, we do not want
to leak information about what objects are or are not in
the object database, so we continue to use the same "not
our ref" message that would be produced by an unreachable
object.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When upload-pack receives a "want" line from the client, it
adds it to an object array. We call lookup_object to find
the actual object, which will only check for objects already
in memory. This works because we are expecting to find
objects that we already loaded during the ref advertisement.
We use the resulting object structs for a variety of
purposes. Some of them care only about the object flags, but
others care about the type of the object (e.g.,
ok_to_give_up), or even feed them to the revision parser
(when --depth is used), which assumes that objects it
receives are fully parsed.
Once upon a time, this was OK; any object we loaded into
memory would also have been parsed. But since 435c833
(upload-pack: use peel_ref for ref advertisements,
2012-10-04), we try to avoid parsing objects during the ref
advertisement. This means that lookup_object may return an
object with a type of OBJ_NONE. The resulting mess depends
on the exact set of objects, but can include the revision
parser barfing, or the shallow code sending the wrong set of
objects.
This patch teaches upload-pack to parse each "want" object
as we receive it. We do not replace the lookup_object call
with parse_object, as the current code is careful not to let
just any object appear on a "want" line, but rather only one
we have previously advertised (whereas parse_object would
actually load any arbitrary object from disk).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we receive a "have" line from the client, we want to
load the object pointed to by the sha1. However, we are
careful to do:
o = lookup_object(sha1);
if (!o || !o->parsed)
o = parse_object(sha1);
to avoid loading the object from disk if we have already
seen it. However, since ccdc603 (parse_object: try internal
cache before reading object db), parse_object already does
this optimization internally. We can just call parse_object
directly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the callers of packet_read_line just read into a
static 1000-byte buffer (callers which handle arbitrary
binary data already use LARGE_PACKET_MAX). This works fine
in practice, because:
1. The only variable-sized data in these lines is a ref
name, and refs tend to be a lot shorter than 1000
characters.
2. When sending ref lines, git-core always limits itself
to 1000 byte packets.
However, the only limit given in the protocol specification
in Documentation/technical/protocol-common.txt is
LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in
pack-protocol.txt, and then only describing what we write,
not as a specific limit for readers.
This patch lets us bump the 1000-byte limit to
LARGE_PACKET_MAX. Even though git-core will never write a
packet where this makes a difference, there are two good
reasons to do this:
1. Other git implementations may have followed
protocol-common.txt and used a larger maximum size. We
don't bump into it in practice because it would involve
very long ref names.
2. We may want to increase the 1000-byte limit one day.
Since packets are transferred before any capabilities,
it's difficult to do this in a backwards-compatible
way. But if we bump the size of buffer the readers can
handle, eventually older versions of git will be
obsolete enough that we can justify bumping the
writers, as well. We don't have plans to do this
anytime soon, but there is no reason not to start the
clock ticking now.
Just bumping all of the reading bufs to LARGE_PACKET_MAX
would waste memory. Instead, since most readers just read
into a temporary buffer anyway, let's provide a single
static buffer that all callers can use. We can further wrap
this detail away by having the packet_read_line wrapper just
use the buffer transparently and return a pointer to the
static storage. That covers most of the cases, and the
remaining ones already read into their own LARGE_PACKET_MAX
buffers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The packets sent during ref negotiation are all terminated
by newline; even though the code to chomp these newlines is
short, we end up doing it in a lot of places.
This patch teaches packet_read_line to auto-chomp the
trailing newline; this lets us get rid of a lot of inline
chomping code.
As a result, some call-sites which are not reading
line-oriented data (e.g., when reading chunks of packfiles
alongside sideband) transition away from packet_read_line to
the generic packet_read interface. This patch converts all
of the existing callsites.
Since the function signature of packet_read_line does not
change (but its behavior does), there is a possibility of
new callsites being introduced in later commits, silently
introducing an incompatibility. However, since a later
patch in this series will change the signature, such a
commit would have to be merged directly into this commit,
not to the tip of the series; we can therefore ignore the
issue.
This is an internal cleanup and should produce no change of
behavior in the normal case. However, there is one corner
case to note. Callers of packet_read_line have never been
able to tell the difference between a flush packet ("0000")
and an empty packet ("0004"), as both cause packet_read_line
to return a length of 0. Readers treat them identically,
even though Documentation/technical/protocol-common.txt says
we must not; it also says that implementations should not
send an empty pkt-line.
By stripping out the newline before the result gets to the
caller, we will now treat the newline-only packet ("0005\n")
the same as an empty packet, which in turn gets treated like
a flush packet. In practice this doesn't matter, as neither
empty nor newline-only packets are part of git's protocols
(at least not for the line-oriented bits, and readers who
are not expecting line-oriented packets will be calling
packet_read directly, anyway). But even if we do decide to
care about the distinction later, it is orthogonal to this
patch. The right place to tighten would be to stop treating
empty packets as flush packets, and this change does not
make doing so any harder.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is just write_or_die by another name. The one
distinction is that write_or_die will treat EPIPE specially
by suppressing error messages. That's fine, as we die by
SIGPIPE anyway (and in the off chance that it is disabled,
write_or_die will simulate it).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you set the GIT_DEBUG_SEND_PACK environment variable,
upload-pack will dump lines it receives in the receive_needs
phase to a descriptor. This debugging harness is a strict
subset of what GIT_TRACE_PACKET can do. Let's just drop it
in favor of that.
A few tests used GIT_DEBUG_SEND_PACK to confirm which
objects get sent; we have to adapt them to the new output
format.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the client tells us it has a shallow object via
"shallow <sha1>", we make sure we have the object, mark it
with a flag, then add it to a dynamic array of shallow
objects. This means that a client can get us to allocate
arbitrary amounts of memory just by flooding us with shallow
lines (whether they have the objects or not). You can
demonstrate it easily with:
yes '0035shallow e83c5163316f89bfbde7d9ab23ca2e25604af290' |
git-upload-pack git.git
We already protect against duplicates in want lines by
checking if our flag is already set; let's do the same thing
here. Note that a client can still get us to allocate some
amount of memory by marking every object in the repo as
"shallow" (or "want"). But this at least bounds it with the
number of objects in the repository, which is not under the
control of an upload-pack client.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we receive a line like "shallow <sha1>" from the
client, we feed the <sha1> part to get_sha1. This is a
mistake, as the argument on a shallow line is defined by
Documentation/technical/pack-protocol.txt to contain an
"obj-id". This is never defined in the BNF, but it is clear
from the text and from the other uses that it is meant to be
a hex sha1, not an arbitrary identifier (and that is what
fetch-pack has always sent).
We should be using get_sha1_hex instead, which doesn't allow
the client to request arbitrary junk like "HEAD@{yesterday}".
Because this is just marking shallow objects, the client
couldn't actually do anything interesting (like fetching
objects from unreachable reflog entries), but we should keep
our parsing tight to be on the safe side.
Because get_sha1 is for the most part a superset of
get_sha1_hex, in theory the only behavior change should be
disallowing non-hex object references. However, there is
one interesting exception: get_sha1 will only parse
a 40-character hex sha1 if the string has exactly 40
characters, whereas get_sha1_hex will just eat the first 40
characters, leaving the rest. That means that current
versions of git-upload-pack will not accept a "shallow"
packet that has a trailing newline, even though the protocol
documentation is clear that newlines are allowed (even
encouraged) in non-binary parts of the protocol.
This never mattered in practice, though, because fetch-pack,
contrary to the protocol documentation, does not include a
newline in its shallow lines. JGit follows its lead (though
it correctly is strict on the parsing end about wanting a
hex object id).
We do not adjust fetch-pack to send newlines here, as it
would break communication with older versions of git (and
there is no actual benefit to doing so, except for
consistency with other parts of the protocol).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With uploadpack.allowtipsha1inwant configuration option set, future
versions of "git fetch" that allow an exact object name (likely to
have been obtained out of band) on the LHS of the fetch refspec can
make a request with a "want" line that names an object that may not
have been advertised due to transfer.hiderefs configuration.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A repository may have refs that are only used for its internal
bookkeeping purposes that should not be exposed to the others that
come over the network.
Teach upload-pack to omit some refs from its initial advertisement
by paying attention to the uploadpack.hiderefs multi-valued
configuration variable. Do the same to receive-pack via the
receive.hiderefs variable. As a convenient short-hand, allow using
transfer.hiderefs to set the value to both of these variables.
Any ref that is under the hierarchies listed on the value of these
variable is excluded from responses to requests made by "ls-remote",
"fetch", etc. (for upload-pack) and "push" (for receive-pack).
Because these hidden refs do not count as OUR_REF, an attempt to
fetch objects at the tip of them will be rejected, and because these
refs do not get advertised, "git push :" will not see local branches
that have the same name as them as "matching" ones to be sent.
An attempt to update/delete these hidden refs with an explicit
refspec, e.g. "git push origin :refs/hidden/22", is rejected. This
is not a new restriction. To the pusher, it would appear that there
is no such ref, so its push request will conclude with "Now that I
sent you all the data, it is time for you to update the refs. I saw
that the ref did not exist when I started pushing, and I want the
result to point at this commit". The receiving end will apply the
compare-and-swap rule to this request and rejects the push with
"Well, your update request conflicts with somebody else; I see there
is such a ref.", which is the right thing to do. Otherwise a push to
a hidden ref will always be "the last one wins", which is not a good
default.
Signed-off-by: Junio C Hamano <gitster@pobox.com>