Create a new unified tracing facility for git. The eventual intent is to
replace the current trace_printf* and trace_performance* routines with a
unified set of git_trace2* routines.
In addition to the usual printf-style API, trace2 provides higer-level
event verbs with fixed-fields allowing structured data to be written.
This makes post-processing and analysis easier for external tools.
Trace2 defines 3 output targets. These are set using the environment
variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT". These may be
set to "1" or to an absolute pathname (just like the current GIT_TRACE).
* GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command
summary data.
* GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE.
It extends the output with columns for the command process, thread,
repo, absolute and relative elapsed times. It reports events for
child process start/stop, thread start/stop, and per-thread function
nesting.
* GIT_TR2_EVENT is a new structured format. It writes event data as a
series of JSON records.
Calls to trace2 functions log to any of the 3 output targets enabled
without the need to call different trace_printf* or trace_performance*
routines.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a v2 smart-http conversation, the server should reply to our initial
request with a pkt-line saying "version 2". We check that with
starts_with(), but really that should be the only thing in that packet.
A response of "version 20" should not match.
Let's tighten this check to use strcmp(). Note that we don't need to
worry about a trailing newline here, because the ptk-line code will have
chomped it for us already.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After making initial contact with an http server, we have to decide if
the server supports smart-http, and if so, which version. Our rules are
a bit inconsistent:
1. For v0, we require that the content-type indicates a smart-http
response. We also require the response to look vaguely like a
pkt-line starting with "#". If one of those does not match, we fall
back to dumb-http.
But according to our http protocol spec[1]:
Dumb servers MUST NOT return a return type starting with
`application/x-git-`.
If we see the expected content-type, we should consider it
smart-http. At that point we can parse the pkt-line for real, and
complain if it is not syntactically valid.
2. For v2, we do not actually check the content-type. Our v2 protocol
spec says[2]:
When using the http:// or https:// transport a client makes a
"smart" info/refs request as described in `http-protocol.txt`[...]
and the http spec is clear that for a smart-http response[3]:
The Content-Type MUST be `application/x-$servicename-advertisement`.
So it is required according to the spec.
These inconsistencies were easy to miss because of the way the original
code was written as an inline conditional. Let's pull it out into its
own function for readability, and improve a few things:
- we now predicate the smart/dumb decision entirely on the presence of
the correct content-type
- we do a real pkt-line parse before deciding how to proceed (and die
if it isn't valid)
- use skip_prefix() for comparing service strings, instead of
constructing expected output in a strbuf; this avoids dealing with
memory cleanup
Note that this _is_ tightening what the client will allow. It's all
according to the spec, but it's possible that other implementations
might violate these. However, violating these particular rules seems
like an odd choice for a server to make.
[1] Documentation/technical/http-protocol.txt, l. 166-167
[2] Documentation/technical/protocol-v2.txt, l. 63-64
[3] Documentation/technical/http-protocol.txt, l. 247
Helped-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By not setting CURLOPT_FAILONERROR, curl parses the HTTP response
headers even if the response is an error. This makes GIT_CURL_VERBOSE to
show the HTTP headers, which is useful for debugging.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to pass more values for rpc_in, define a struct and pass it as
an additional value.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
curl stops parsing a response when it sees a bad HTTP status code and it
has CURLOPT_FAILONERROR set. This prevents GIT_CURL_VERBOSE to show HTTP
headers on error.
keep_error is an option to receive the HTTP response body for those
error responses. By enabling this option, curl will process the HTTP
response headers, and they're shown if GIT_CURL_VERBOSE is set.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the Git pack protocol definition, an error packet may appear only in
a certain context. However, servers can face a runtime error (e.g. I/O
error) at an arbitrary timing. This patch changes the protocol to allow
an error packet to be sent instead of any packet.
Without this protocol spec change, when a server cannot process a
request, there's no way to tell that to a client. Since the server
cannot produce a valid response, it would be forced to cut a connection
without telling why. With this protocol spec change, the server can be
more gentle in this situation. An old client may see these error packets
as an unexpected packet, but this is not worse than having an unexpected
EOF.
Following this protocol spec change, the error packet handling code is
moved to pkt-line.c. Implementation wise, this implementation uses
pkt-line to communicate with a subprocess. Since this is not a part of
Git protocol, it's possible that a packet that is not supposed to be an
error packet is mistakenly parsed as an error packet. This error packet
handling is enabled only for the Git pack protocol parsing code
considering this.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By using and sharing a packet_reader while handling a Git pack protocol
request, the same reader option is used throughout the code. This makes
it easy to set a reader option to the request parsing code.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When setting
DEVELOPER = 1
DEVOPTS = extra-all
"gcc (Raspbian 6.3.0-18+rpi1+deb9u1) 6.3.0 20170516" errors out with
"comparison is always false due to limited range of data type"
"[-Werror=type-limits]"
It turns out that the function xcurl_off_t() has 2 flavours:
- It gives a warning 32 bit systems, like Linux
- It takes the signed ssize_t as a paramter, but the only caller is using
a size_t (which is typically unsigned these days)
The original motivation of this function is to make sure that sizes > 2GiB
are handled correctly. The curl documentation says:
"For any given platform/compiler curl_off_t must be typedef'ed to a 64-bit
wide signed integral data type"
On a 32 bit system "size_t" can be promoted into a 64 bit signed value
without loss of data, and therefore we may see the
"comparison is always false" warning.
On a 64 bit system it may happen, at least in theory, that size_t is > 2^63,
and then the promotion from an unsigned "size_t" into a signed "curl_off_t"
may be a problem.
One solution to suppress a possible compiler warning could be to remove
the function xcurl_off_t().
However, to be on the very safe side, we keep it and improve it:
- The len parameter is changed from ssize_t to size_t
- A temporally variable "size" is used, promoted int uintmax_t and the compared
with "maximum_signed_value_of_type(curl_off_t)".
Thanks to Junio C Hamano for this hint.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We should not interrupt. sentences in the middle.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Configure curl to accept compressed responses when using protocol v2 by
setting `CURLOPT_ENCODING` to "", which indicates that curl should send
an "Accept-Encoding" header with all supported compression encodings.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Configure curl to accept all encodings which curl supports instead of
only accepting gzip responses.
This fixes an issue when using an installation of curl which is built
without the "zlib" feature. Since aa90b9697 (Enable info/refs gzip
decompression in HTTP client, 2012-09-19) we end up requesting "gzip"
encoding anyway despite libcurl not being able to decode it. Worse,
instead of getting a clear error message indicating so, we end up
falling back to "dumb" http, producing a confusing and difficult to
debug result.
Since curl doesn't do any checking to verify that it supports the a
requested encoding, instead set the curl option `CURLOPT_ENCODING` with
an empty string indicating that curl should send an "Accept-Encoding"
header containing only the encodings supported by curl.
Reported-by: Anton Golubev <anton.golubev@gmail.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After the previous commit, both users of `struct walker` set `get_tree`,
`get_history` and `get_all` to 1. Drop those fields and simplify the
walker implementation accordingly.
Let's hope that any out-of-tree users will not mind this change. They
should notice that the compilation fails as they try to set these
fields. (If they do not set them, note that `get_http_walker()` leaves
them undefined, so the behavior will have been undefined all the time.)
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is more consistent with the project style. The majority of Git's
source files use dashes in preference to underscores in their file names.
Signed-off-by: Stefan Beller <sbeller@google.com>
In order to be able to ship protocol v2 with only supporting fetch, we
need clients to not issue a request to use protocol v2 when pushing
(since the client currently doesn't know how to push using protocol v2).
This allows a client to have protocol v2 configured in
`protocol.version` and take advantage of using v2 for fetch and falling
back to using v0 when pushing while v2 for push is being designed.
We could run into issues if we didn't fall back to protocol v2 when
pushing right now. This is because currently a server will ignore a request to
use v2 when contacting the 'receive-pack' endpoint and fall back to
using v0, but when push v2 is rolled out to servers, the 'receive-pack'
endpoint will start responding using v2. So we don't want to get into a
state where a client is requesting to push with v2 before they actually
know how to push using v2.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach remote-curl the 'stateless-connect' command which is used to
establish a stateless connection with servers which support protocol
version 2. This allows remote-curl to act as a proxy, allowing the git
client to communicate natively with a remote end, simply using
remote-curl as a pass through to convert requests to http.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When an http info/refs request is made, requesting that protocol v2 be
used, don't send a "# service" line since this line is not part of the
v2 spec.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of always sending the Git-Protocol header with the configured
version with every http request, explicitly send it when discovering
refs and then only send it on subsequent http requests if the server
understood the version requested.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Store the protocol version the server responded with when performing
discovery. This will be used in a future patch to either change the
'Git-Protocol' header sent in subsequent requests or to determine if a
client needs to fallback to using a different protocol version.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make a copy of the service name being requested instead of relying on
the buffer pointed to by the passed in 'const char *' to remain
unchanged.
Currently, all service names are string constants, but a subsequent
patch will introduce service names from external sources.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce protocol_v2, a new value for 'enum protocol_version'.
Subsequent patches will fill in the implementation of protocol_v2.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to prepare for the addition of protocol_v2 push the protocol
version discovery outside of 'get_remote_heads()'. This will allow for
keeping the logic for processing the reference advertisement for
protocol_v1 and protocol_v0 separate from the logic for protocol_v2.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The transport-helper protocol c-style quotes the value of
any options passed to the helper via the "option <key> <value>"
directive. However, remote-curl doesn't actually unquote the
push-option values, meaning that we will send the quoted
version to the other side (whereas git-over-ssh would send
the raw value).
The pack-protocol.txt documentation defines the push-options
as a series of VCHARs, which excludes most characters that
would need quoting. But:
1. You can still see the bug with a valid push-option that
starts with a double-quote (since that triggers
quoting).
2. We do currently handle any non-NUL characters correctly
in git-over-ssh. So even though the spec does not say
that we need to handle most quoted characters, it's
nice if our behavior is consistent between protocols.
There are two new tests: the "direct" one shows that this
already works in the non-http case, and the http one covers
this bugfix.
Reported-by: Jon Simons <jon@jonsimons.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The packet_read_line() function will die if it sees any
protocol or socket errors. But it will return NULL for a
flush packet; some callers which are not expecting this may
dereference NULL if they get an unexpected flush. This would
involve the other side breaking protocol, but we should
flag the error rather than segfault.
Signed-off-by: Jon Simons <jon@jonsimons.org>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach fetch to support filters. This is only allowed for the remote
configured in extensions.partialcloneremote.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce fetch-object, providing the ability to fetch one object from a
promisor remote.
This uses fetch-pack. To do this, the transport mechanism has been
updated with 2 flags, "from-promisor" to indicate that the resulting
pack comes from a promisor remote (and thus should be annotated as such
by index-pack), and "no-dependents" to indicate that only the objects
themselves need to be fetched (but fetching additional objects is
nevertheless safe).
Whenever "no-dependents" is used, fetch-pack will refrain from using any
object flags, because it is most likely invoked as part of a dynamic
object fetch by another Git command (which may itself use object flags).
An alternative to this is to leave fetch-pack alone, and instead update
the allocation of flags so that fetch-pack's flags never overlap with
any others, but this will end up shrinking the number of flags available
to nearly every other Git command (that is, every Git command that
accesses objects), so the approach in this commit was used instead.
This will be tested in a subsequent commit.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unfortunately, in order to push some large repos where a server does
not support chunked encoding, the http postbuffer must sometimes
exceed two gigabytes. On a 64-bit system, this is OK: we just malloc
a larger buffer.
This means that we need to use CURLOPT_POSTFIELDSIZE_LARGE to set the
buffer size.
Signed-off-by: David Turner <dturner@twosigma.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since this structure handles an array of object IDs, rename it to struct
oid_array. Also rename the accessor functions and the initialization
constant.
This commit was produced mechanically by providing non-Documentation
files to the following Perl one-liners:
perl -pi -E 's/struct sha1_array/struct oid_array/g'
perl -pi -E 's/\bsha1_array_/oid_array_/g'
perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g'
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the internal storage for struct sha1_array use an array of struct
object_id internally. Update the users of this struct which inspect its
internals.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach remote-curl to understand push options and to be able to convey
them across HTTP.
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We instruct curl to always follow HTTP redirects. This is
convenient, but it creates opportunities for malicious
servers to create confusing situations. For instance,
imagine Alice is a git user with access to a private
repository on Bob's server. Mallory runs her own server and
wants to access objects from Bob's repository.
Mallory may try a few tricks that involve asking Alice to
clone from her, build on top, and then push the result:
1. Mallory may simply redirect all fetch requests to Bob's
server. Git will transparently follow those redirects
and fetch Bob's history, which Alice may believe she
got from Mallory. The subsequent push seems like it is
just feeding Mallory back her own objects, but is
actually leaking Bob's objects. There is nothing in
git's output to indicate that Bob's repository was
involved at all.
The downside (for Mallory) of this attack is that Alice
will have received Bob's entire repository, and is
likely to notice that when building on top of it.
2. If Mallory happens to know the sha1 of some object X in
Bob's repository, she can instead build her own history
that references that object. She then runs a dumb http
server, and Alice's client will fetch each object
individually. When it asks for X, Mallory redirects her
to Bob's server. The end result is that Alice obtains
objects from Bob, but they may be buried deep in
history. Alice is less likely to notice.
Both of these attacks are fairly hard to pull off. There's a
social component in getting Mallory to convince Alice to
work with her. Alice may be prompted for credentials in
accessing Bob's repository (but not always, if she is using
a credential helper that caches). Attack (1) requires a
certain amount of obliviousness on Alice's part while making
a new commit. Attack (2) requires that Mallory knows a sha1
in Bob's repository, that Bob's server supports dumb http,
and that the object in question is loose on Bob's server.
But we can probably make things a bit more obvious without
any loss of functionality. This patch does two things to
that end.
First, when we encounter a whole-repo redirect during the
initial ref discovery, we now inform the user on stderr,
making attack (1) much more obvious.
Second, the decision to follow redirects is now
configurable. The truly paranoid can set the new
http.followRedirects to false to avoid any redirection
entirely. But for a more practical default, we will disallow
redirects only after the initial ref discovery. This is
enough to thwart attacks similar to (2), while still
allowing the common use of redirects at the repository
level. Since c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28) we re-root all further requests from
the redirect destination, which should generally mean that
no further redirection is necessary.
As an escape hatch, in case there really is a server that
needs to redirect individual requests, the user can set
http.followRedirects to "true" (and this can be done on a
per-server basis via http.*.followRedirects config).
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The discover_refs() function has a local "options" variable
to hold the http_get_options we pass to http_get_strbuf().
But this shadows the global "struct options" that holds our
program-level options, which cannot be accessed from this
function.
Let's give the local one a more descriptive name so we can
tell the two apart.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the event that a HTTP server closes the connection after giving a
200 but before giving any packets, we don't want to hang forever
waiting for a response that will never come. Instead, we should die
immediately.
One case where this happens is when attempting to fetch a dangling
object by its object name. In this case, the server dies before
sending any data. Prior to this patch, fetch-pack would wait for
data from the server, and remote-curl would wait for fetch-pack,
causing a deadlock.
Despite this patch, there is other possible malformed input that could
cause the same deadlock (e.g. a half-finished pktline, or a pktline but
no trailing flush). There are a few possible solutions to this:
1. Allowing remote-curl to tell fetch-pack about the EOF (so that
fetch-pack could know that no more data is coming until it says
something else). This is tricky because an out-of-band signal would
be required, or the http response would have to be re-framed inside
another layer of pkt-line or something.
2. Make remote-curl understand some of the protocol. It turns out
that in addition to understanding pkt-line, it would need to watch for
ack/nak. This is somewhat fragile, as information about the protocol
would end up in two places. Also, pkt-lines which are already at the
length limit would need special handling.
Both of these solutions would require a fair amount of work, whereas
this hack is easy and solves at least some of the problem.
Still to do: it would be good to give a better error message
than "fatal: The remote end hung up unexpectedly".
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should be part of every program, as otherwise users do
not get translated error messages. However, some external
commands forgot to do so (e.g., git-credential-store). This
fixes them, and eliminates the repeated code in programs
that did remember to use it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Every program which links against libgit.a must call this
function, or risk hitting an assert() in system_path() that
checks whether we have configured argv0_path (though only
when RUNTIME_PREFIX is defined, so essentially only on
Windows).
Looking at the diff, you can see that putting it into the
common main() saves us having to do it individually in each
of the external commands. But what you can't see are the
cases where we _should_ have been doing so, but weren't
(e.g., git-credential-store, and all of the t/helper test
programs).
This has been an accident-waiting-to-happen for a long time,
but wasn't triggered until recently because it involves one
of those programs actually calling system_path(). That
happened with git-credential-store in v2.8.0 with ae5f677
(lazily load core.sharedrepository, 2016-03-11). The
program:
- takes a lock file, which...
- opens a tempfile, which...
- calls adjust_shared_perm to fix permissions, which...
- lazy-loads the config (as of ae5f677), which...
- calls system_path() to find the location of
/etc/gitconfig
On systems with RUNTIME_PREFIX, this means credential-store
reliably hits that assert() and cannot be used.
We never noticed in the test suite, because we set
GIT_CONFIG_NOSYSTEM there, which skips the system_path()
lookup entirely. But if we were to tweak git_config() to
find /etc/gitconfig even when we aren't going to open it,
then the test suite shows multiple failures (for
credential-store, and for some other test helpers). I didn't
include that tweak here because it's way too specific to
this particular call to be worth carrying around what is
essentially dead code.
The implementation is fairly straightforward, with one
exception: there is exactly one caller (git.c) that actually
cares about the result of the function, and not the
side-effect of setting up argv0_path. We can accommodate
that by simply replacing the value of argv[0] in the array
we hand down to cmd_main().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are certain startup tasks that we expect every git
process to do. In some cases this is just to improve the
quality of the program (e.g., setting up gettext()). In
others it is a requirement for using certain functions in
libgit.a (e.g., system_path() expects that you have called
git_extract_argv0_path()).
Most commands are builtins and are covered by the git.c
version of main(). However, there are still a few external
commands that use their own main(). Each of these has to
remember to include the correct startup sequence, and we are
not always consistent.
Rather than just fix the inconsistencies, let's make this
harder to get wrong by providing a common main() that can
run this standard startup.
We basically have two options to do this:
- the compat/mingw.h file already does something like this by
adding a #define that replaces the definition of main with a
wrapper that calls mingw_startup().
The upside is that the code in each program doesn't need
to be changed at all; it's rewritten on the fly by the
preprocessor.
The downside is that it may make debugging of the startup
sequence a bit more confusing, as the preprocessor is
quietly inserting new code.
- the builtin functions are all of the form cmd_foo(),
and git.c's main() calls them.
This is much more explicit, which may make things more
obvious to somebody reading the code. It's also more
flexible (because of course we have to figure out _which_
cmd_foo() to call).
The downside is that each of the builtins must define
cmd_foo(), instead of just main().
This patch chooses the latter option, preferring the more
explicit approach, even though it is more invasive. We
introduce a new file common-main.c, with the "real" main. It
expects to call cmd_main() from whatever other objects it is
linked against.
We link common-main.o against anything that links against
libgit.a, since we know that such programs will need to do
this setup. Note that common-main.o can't actually go inside
libgit.a, as the linker would not pick up its main()
function automatically (it has no callers).
The rest of the patch is just adjusting all of the various
external programs (mostly in t/helper) to use cmd_main().
I've provided a global declaration for cmd_main(), which
means that all of the programs also need to match its
signature. In particular, many functions need to switch to
"const char **" instead of "char **" for argv. This effect
ripples out to a few other variables and functions, as well.
This makes the patch even more invasive, but the end result
is much better. We should be treating argv strings as const
anyway, and now all programs conform to the same signature
(which also matches the way builtins are defined).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git-fetch, --depth argument is always relative with the latest
remote refs. This makes it a bit difficult to cover this use case,
where the user wants to make the shallow history, say 3 levels
deeper. It would work if remote refs have not moved yet, but nobody
can guarantee that, especially when that use case is performed a
couple months after the last clone or "git fetch --depth". Also,
modifying shallow boundary using --depth does not work well with
clones created by --since or --not.
This patch fixes that. A new argument --deepen=<N> will add <N> more (*)
parent commits to the current history regardless of where remote refs
are.
Have/Want negotiation is still respected. So if remote refs move, the
server will send two chunks: one between "have" and "want" and another
to extend shallow history. In theory, the client could send no "want"s
in order to get the second chunk only. But the protocol does not allow
that. Either you send no want lines, which means ls-remote; or you
have to send at least one want line that carries deep-relative to the
server..
The main work was done by Dongcan Jiang. I fixed it up here and there.
And of course all the bugs belong to me.
(*) We could even support --deepen=<N> where <N> is negative. In that
case we can cut some history from the shallow clone. This operation
(and --depth=<shorter depth>) does not require interaction with remote
side (and more complicated to implement as a result).
Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We introduce a way to send custom HTTP headers with all requests.
This allows us, for example, to send an extra token from build agents
for temporary access to private repositories. (This is the use case that
triggered this patch.)
This feature can be used like this:
git -c http.extraheader='Secret: sssh!' fetch $URL $REF
Note that `curl_easy_setopt(..., CURLOPT_HTTPHEADER, ...)` takes only
a single list, overriding any previous call. This means we have to
collect _all_ of the headers we want to use into a single list, and
feed it to cURL in one shot. Since we already unconditionally set a
"pragma" header when initializing the curl handles, we can add our new
headers to that list.
For callers which override the default header list (like probe_rpc),
we provide `http_copy_default_headers()` so they can do the same
trick.
Big thanks to Jeff King and Junio Hamano for their outstanding help and
patient reviews.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:
1. It automatically checks the array-size multiplication
for overflow.
2. It always uses sizeof(*array) for the element-size,
so that it can never go out of sync with the declared
type of the array.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many manual argv allocations that predate the
argv_array API. Switching to that API brings a few
advantages:
1. We no longer have to manually compute the correct final
array size (so it's one less thing we can screw up).
2. In many cases we had to make a separate pass to count,
then allocate, then fill in the array. Now we can do it
in one pass, making the code shorter and easier to
follow.
3. argv_array handles memory ownership for us, making it
more obvious when things should be free()d and and when
not.
Most of these cases are pretty straightforward. In some, we
switch from "run_command_v" to "run_command" which lets us
directly use the argv_array embedded in "struct
child_process".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For curl error 35 (CURLE_SSL_CONNECT_ERROR) users need the
additional text stored in CURLOPT_ERRORBUFFER to debug why
the connection did not start. This is curl_errorstr inside
of http.c, so include that in the message if it is non-empty.
Sometimes HTTP response codes aren't yet available, such as
when the SSL setup fails. Don't include HTTP 0 in the message.
Signed-off-by: Shawn Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes it is necessary to force IPv4-only or IPv6-only operation
on networks where name lookups may return a non-routable address and
stall remote operations.
The ssh(1) command has an equivalent switches which we may pass when
we run them. There may be old ssh(1) implementations out there
which do not support these switches; they should report the
appropriate error in that case.
rsync support is untouched for now since it is deprecated and
scheduled to be removed.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strbuf_getline() interface allows a byte other than LF or NUL as
the line terminator, but this is only because I wrote these
codepaths anticipating that there might be a value other than NUL
and LF that could be useful when I introduced line_termination long
time ago. No useful caller that uses other value has emerged.
By now, it is clear that the interface is overly broad without a
good reason. Many codepaths have hardcoded preference to read
either LF terminated or NUL terminated records from their input, and
then call strbuf_getline() with LF or NUL as the third parameter.
This step introduces two thin wrappers around strbuf_getline(),
namely, strbuf_getline_lf() and strbuf_getline_nul(), and
mechanically rewrites these call sites to call either one of
them. The changes contained in this patch are:
* introduction of these two functions in strbuf.[ch]
* mechanical conversion of all callers to strbuf_getline() with
either '\n' or '\0' as the third parameter to instead call the
respective thin wrapper.
After this step, output from "git grep 'strbuf_getline('" would
become a lot smaller. An interim goal of this series is to make
this an empty set, so that we can have strbuf_getline_crlf() take
over the shorter name strbuf_getline().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the parse_fetch function to use struct object_id. Remove the
strlen check as get_oid_hex will fail safely on receiving a too-short
NUL-terminated string.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>