If the remote peer upload-pack process supports the include-tag
protocol extension then we can avoid running a second fetch cycle
on the client side by letting the server send us the annotated tags
along with the objects it is packing for us. In the following graph
we can now fetch both "tag1" and "tag2" on the same connection that
we fetched "master" from the remote when we only have L available
on the local side:
T - tag1 S - tag2
/ /
L - o ------ o ------ B
\ \
\ \
origin/master master
The objects for "tag1" are implicitly downloaded without our direct
knowledge. The existing "quickfetch" optimization within git-fetch
discovers that tag1 is complete after the first connection and does
not open a second connection.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We always report to the user the list of refs we got from the first
connection, even if we do multiple connections. But we should always
use each connection's own list of refs in the communication with the
server, in case we got a different server out of DNS rotation or the
timing was surprising or something.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In transport.c, proxy setting (the one from the remote conf) was set through
curl_easy_setopt() call, while http.c already does the same with the
http.proxy setting. We now just use this infrastructure instead, and make
http_init() now take the struct remote as argument so that it can take the
http_proxy setting from there, and any other property that would be added
later.
At the same time, we make get_http_walker() take a struct remote argument
too, and pass it to http_init(), which makes remote defined proxy be used
for more than get_refs_via_curl().
We leave out http-fetch and http-push, which don't use remotes for the
moment, purposefully.
Signed-off-by: Mike Hommey <mh@glandium.org>
Acked-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
curl versions 7.16.3 to 7.18.0 included had a regression in which https
requests following curl_global_cleanup/init sequence would fail with ASN1
parser errors with curl-gnutls. Such sequences happen in some cases such
as git fetch.
We work around this by removing the http_init and http_cleanup calls from
get_refs_via_curl, replacing them with a transport->data initialization
with the http_walker (which does http_init).
While the http_walker is not currently used in get_refs_via_curl, http
and walker code refactor will make it use it.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
curl versions 7.16.3 to 7.18.0 included had a regression in which https
requests following curl_global_cleanup/init sequence would fail with ASN1
parser errors with curl-gnutls. Such sequences happen in some cases such
as git fetch.
We work around this by removing the http_init and http_cleanup calls from
get_refs_via_curl, replacing them with a transport->data initialization
with the http_walker (which does http_init).
While the http_walker is not currently used in get_refs_via_curl, http
and walker code refactor will make it use it.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This shares the connection between getting the remote ref list and
getting objects in the first batch. (A second connection is still used
to follow tags).
When we do not fetch objects (i.e. either ls-remote disconnects after
getting list of refs, or we decide we are already up-to-date), we
clean up the connection properly; otherwise the connection is left
open in need of cleaning up to avoid getting an error message from
the remote end when ssh is used.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A NUL byte at beginning of file, or just after a newline
would provoke an invalid buf[-1] access in a few places.
* builtin-grep.c (cmd_grep): Don't access buf[-1].
* builtin-pack-objects.c (get_object_list): Likewise.
* builtin-rev-list.c (read_revisions_from_stdin): Likewise.
* bundle.c (read_bundle_header): Likewise.
* server-info.c (read_pack_info_file): Likewise.
* transport.c (insert_packed_refs): Likewise.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code calls fetch_pack() to get the list of refs it fetched, and
discards refs and always returns 0 to signal success.
But builtin-fetch-pack.c::fetch_pack() has error cases. The function
returns NULL if error is detected (shallow-support side seems to choose
to die but I suspect that is easily fixable to error out as well).
Make fetch_refs_via_pack() propagate that error to the caller.
Acked-By: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As well as allowing a default http.proxy option, allow it to be set
per-remote.
Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because this function is static and used only by the
http-walker, when NO_CURL is defined, gcc emits a "defined
but not used" warning.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A --verbose option to push should also be passed to the
transport layer, i.e. git-send-pack, git-http-push.
git push is modified to do so.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The list of remote refs in struct transport should be const, because
builtin-fetch will get confused if it changes.
The url in git_connect should be const (and work on a copy) instead of
requiring the caller to copy it.
match_refs doesn't modify the refspecs it gets.
get_fetch_map and get_remote_ref don't change the list they get.
Allow transport get_refs_list methods to modify the struct transport.
Add a function to copy a list of refs, when a function needs a mutable
copy of a const list.
Add a function to check the type of a ref, as per the code in connect.c
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The variable is always set if it is going to be used; gcc just does
not notice it.
Signed-off-by: Blake Ramsdell <blaker@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This prepares the API of git_connect() and finish_connect() to operate on
a struct child_process. Currently, we just use that object as a placeholder
for the pid that we used to return. A follow-up patch will change the
implementation of git_connect() and finish_connect() to make full use
of the object.
Old code had early-return-on-error checks at the calling sites of
git_connect(), but since git_connect() dies on errors anyway, these checks
were removed.
[sp: Corrected style nit of "conn == NULL" to "!conn"]
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If the end-user requested a dry-run push we need to pass that flag
over to http-push and additionally make sure it does not actually
upload any changes to the remote server.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If the end-user requested a dry-run push we should pass that flag
though to rsync so that the rsync command can show what it would do
(or not do) if push was to be executed without the --dry-run flag.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
There were a few places which did not cope well without curl. This
fixes all of them. We still need to link against the walker.o part
of the library as some parts of transport.o still call into there
even though we don't have HTTP support enabled.
If compiled with NO_CURL=1 we now get the following useful error
message:
$ git-fetch http://www.example.com/git
error: git was compiled without libcurl support.
fatal: Don't know how to fetch from http://www.example.com/git
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This adds a verbosity level below 0 for suppressing default messages
with --quiet, and makes the default for http be verbose instead of
quiet. This matches the behavior of the shell script version of git-fetch.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We lost rsync support when transitioning from shell to C. Support it
again (even if the transport is technically deprecated, some people just
do not have any chance to use anything else).
Also, add a test to t5510. Since rsync transport is not configured by
default on most machines, and especially not such that you can write to
rsync://127.0.0.1$(pwd)/, it is disabled by default; you can enable it by
setting the environment variable TEST_RSYNC.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently alloc_ref() expects the length of the refname plus 1
as its parameter, prepares that much space and returns a "ref"
structure for the caller to fill the refname. One caller in
transport.c::get_refs_from_bundle() however allocated one byte
less.
It may be a good idea to change the calling convention to give
alloc_ref() the length of the refname, but that clean-up can be
done in a separate patch. This patch only fixes the bug and
makes all callers consistent.
There was also one overallocation in connect.c, which would not
hurt but was wasteful. This patch fixes it as well.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most transport implementations tend to allocate a data buffer
in the struct transport instance during transport_get() so we
need to free that data buffer when we disconnect it.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only way to configure the unpacking limit is currently through
the .git/config (or ~/.gitconfig) mechanism as we have no existing
command line option interface to control this threshold on a per
invocation basis. This was intentional by design as the storage
policy of the repository should be a repository-wide decision and
should not be subject to variations made on individual command
executions.
Earlier builtin-fetch was bypassing the unpacking limit chosen by
the user through the configuration file as it did not reread the
configuration options through fetch_pack_config if we called the
internal fetch_pack() API directly. We now ensure we always run the
config file through fetch_pack_config at least once in this process,
thereby setting our unpackLimit properly.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Copying the arguments from a fetch_pack_args into static globals
within the builtin-fetch-pack module is error-prone and may lead
rise to cases where arguments supplied via the struct from the
new fetch_pack() API may not be honored by the implementation.
Here we reorganize all of the static globals into a single static
struct fetch_pack_args instance and use memcpy() to move the data
from the caller supplied structure into the globals before we
execute our pack fetching implementation. This strategy is more
robust to additions and deletions of properties.
As keep_pack is a single bit we have also introduced lock_pack to
mean not only download and store the packfile via index-pack but
also to lock it against repacking by creating a .keep file when
the packfile itself is stored. The caller must remove the .keep
file when it is safe to do so.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Aside from reducing the code by 20 lines this refactoring removes
a level of indirection when trying to access the operations of a
given transport "instance", making the code clearer and easier to
follow.
It also has the nice effect of giving us the benefits of C99 style
struct initialization (namely ".fetch = X") without requiring that
level of language support from our compiler. We don't need to worry
about new operation methods being added as they will now be NULL'd
out automatically by the xcalloc() we use to create the new struct
transport we supply to the caller.
This pattern already exists in struct walker, so we already have
a precedent for it in Git. We also don't really need to worry
about any sort of performance decreases that may occur as a result
of filling out 4-8 op pointers when we make a "struct transport".
The extra few CPU cycles this requires over filling in the "struct
transport_ops" is killed by the time it will take Git to actually
*use* one of those functions, as most transport operations are
going over the wire or will be copying object data locally between
two directories.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a transport doesn't support an option we already are telling
the higher level application (fetch or push) that the option is not
valid by sending back a >0 return value from transport_set_option
so there's not a strong motivation to have the function perform the
output itself. Instead we should let the higher level application
do the output if it is necessary. This avoids always telling the
user that depth isn't supported on HTTP urls even when they did
not pass a --depth option to git-fetch.
If the user passes an option and the option value is invalid we now
properly die in git-fetch instead of just spitting out a message
and running anyway. This mimics prior behavior better where
incorrect/malformed options are not accepted by the process.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
We don't actually need to know at the time of transport_get if the
caller wants to fetch, push, or do both on the returned object.
It is easier to just delay the initialization of the HTTP walker
until we know we will need it by providing a CURL specific fetch
function in the curl_transport that makes sure the walker instance
is initialized before use.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We always allocate and return a struct transport* right now as every
URL is considered to be a native Git transport if it is not rsync,
http/https/ftp or a bundle. So we can simplify the initialization
of a new transport object by performing one xcalloc call and filling
in only the attributes required.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the walker API within builtin-fetch we don't allow
it to update refs locally; instead that action is reserved for
builtin-fetch's own main loop once the objects have actually
been downloaded.
Passing NULL here will bypass the unnecessary malloc/free of a
string buffer within the walker API. That buffer is never used
because the prior argument (the refs to update) is also NULL.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch_pack() can call remove_duplicates() on its input array and
this will possibly overwrite an earlier entry with a later one if
there are any duplicates in the input array. In such a case the
caller here might then attempt to free an item multiple times as
it goes through its cleanup.
I also forgot to free the heads array we pass down into fetch_pack()
when I introduced the allocation of it in this function during my
builtin-fetch cleanup series. Better free it while we are here
working on related memory management fixes.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are using a native packfile to perform a git-fetch invocation
and the received packfile contained more than the configured limits
of fetch.unpackLimit/transfer.unpackLimit then index-pack will output
a single line saying "keep\t$sha1\n" to stdout. This line needs to
be captured and retained so we can delete the corresponding .keep
file ("$GIT_DIR/objects/pack/pack-$sha1.keep") once all refs have
been safely updated.
This trick has long been in use with git-fetch.sh and its lower level
helper git-fetch--tool as a way to allow index-pack to save the new
packfile before the refs have been updated and yet avoid a race with
any concurrently running git-repack process. It was unfortunately
lost when git-fetch.sh was converted to pure C and fetch--tool was
no longer being invoked.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit walkers need to know the SHA-1 name of any objects they
have been asked to fetch while the native pack transport only
wants to know the names of the remote refs as the remote side
must do the name->SHA-1 translation.
Since we only have three fetch implementations and one of them
(bundle) doesn't even need the name information we can reduce
the code required to perform a fetch by having just one function
and passing of the filtered list of refs to be fetched. Each
transport can then obtain the information it needs from that ref
array to construct its own internal operation state.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Conflicts:
transport.c
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ALLOC_GROW macro is a shorter way to implement an array that
grows upon demand as additional items are added to it. We have
mostly standardized upon its use within git and transport.c is
not an exception.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This moves the code to call push backends into a library that can be
extended to make matching fetch and push decisions based on the URL it
gets, and which could be changed to have built-in implementations
instead of calling external programs.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>