The git-remote-curl backend detects if the remote server supports
the git-upload-pack service, and if so, runs git-fetch-pack locally
in a pipe to generate the want/have commands.
The advertisements from the server that were obtained during the
discovery are passed into git-fetch-pack before the POST request
starts, permitting server capability discovery and enablement.
Common objects that are discovered are appended onto the request as
have lines and are sent again on the next request. This allows the
remote side to reinitialize its in-memory list of common objects
during the next request.
Because all requests are relatively short, below git-remote-curl's
1 MiB buffer limit, requests will use the standard Content-Length
header and be valid HTTP/1.0 POST requests. This makes the fetch
client more tolerant of proxy servers which don't support HTTP/1.1
or the chunked transfer encoding.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When multi_ack_detailed is enabled the ACK continue messages returned
by the remote upload-pack are broken out to describe the different
states within the peer. This permits the client to better understand
the server's in-memory state.
The fetch-pack/upload-pack protocol now looks like:
NAK
---------------------------------
Always sent in response to "done" if there was no common base
selected from the "have" lines (or no have lines were sent).
* no multi_ack or multi_ack_detailed:
Sent when the client has sent a pkt-line flush ("0000") and
the server has not yet found a common base object.
* either multi_ack or multi_ack_detailed:
Always sent in response to a pkt-line flush.
ACK %s
-----------------------------------
* no multi_ack or multi_ack_detailed:
Sent in response to "have" when the object exists on the remote
side and is therefore an object in common between the peers.
The argument is the SHA-1 of the common object.
* either multi_ack or multi_ack_detailed:
Sent in response to "done" if there are common objects.
The argument is the last SHA-1 determined to be common.
ACK %s continue
-----------------------------------
* multi_ack only:
Sent in response to "have".
The remote side wants the client to consider this object as
common, and immediately stop transmitting additional "have"
lines for objects that are reachable from it. The reason
the client should stop is not given, but is one of the two
cases below available under multi_ack_detailed.
ACK %s common
-----------------------------------
* multi_ack_detailed only:
Sent in response to "have". Both sides have this object.
Like with "ACK %s continue" above the client should stop
sending have lines reachable for objects from the argument.
ACK %s ready
-----------------------------------
* multi_ack_detailed only:
Sent in response to "have".
The client should stop transmitting objects which are reachable
from the argument, and send "done" soon to get the objects.
If the remote side has the specified object, it should
first send an "ACK %s common" message prior to sending
"ACK %s ready".
Clients may still submit additional "have" lines if there are
more side branches for the client to explore that might be added
to the common set and reduce the number of objects to transfer.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 41cb7488 Linus moved this function to connect.c for reuse inside
of the git-clone-pack command. That was 2005, but in 2006 Junio
retired git-clone-pack in commit efc7fa53. Since then the only
caller has been fetch-pack. Since this ACK/NAK exchange is only
used by the fetch-pack/upload-pack protocol we should move it back
to be a private detail of fetch-pack.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change is being offered as a refactoring to make later
commits in the smart HTTP series easier.
By changing the enabled capabilities to be formatted in a strbuf
it is easier to add a new capability to the set of supported
capabilities.
By formatting the want portion of the request into a strbuf and
writing it as a whole block we can later decide to hold onto
the req_buf (instead of releasing it) to recycle in stateless
communications.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch-pack runs the sideband demultiplexer using start_async(). This
facility requires that the asynchronously executed function closes the
output file descriptor (see Documentation/technical/api-run-command.txt).
But the sideband demultiplexer did not do that. This fixes it.
In certain error situations this could lock up a fetch operation on
Windows because the asynchronous function is run in a thread; by not
closing the output fd the reading end never got EOF and waited for more
data indefinitely. On Unix this is not a problem because the asynchronous
function is run in a separate process, which exits after the function ends
and so implicitly closes the output.
Since the pack that is sent over the wire encodes the number of objects in
the stream, during normal operation the reading end knows when the stream
ends and terminates by itself, and does not lock up.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the local receiving repository has disabled the use of delta base
offset, for example to retain compatibility with older versions of
Git that predate OFS_DELTA, we shouldn't ask for ofs-delta support
when we obtain a pack from the remote server.
[ issue noticed by Shawn Pearce ]
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Essentially; s/type* /type */ as per the coding guidelines.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This helps to notice when something's going wrong, especially on
systems which lock open files.
I used the following criteria when selecting the code for replacement:
- it was already printing a warning for the unlink failures
- it is in a function which already printing something or is
called from such a function
- it is in a static function, returning void and the function is only
called from a builtin main function (cmd_)
- it is in a function which handles emergency exit (signal handlers)
- it is in a function which is obvously cleaning up the lockfiles
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit e1afca4fd "write_index(): update index_state->timestamp after
flushing to disk" on 2009-02-23 used stat.ctime to record the
timestamp of the index-file. This is wrong, so fix this and use the
correct stat.mtime timestamp instead.
Commit 110c46a909 "Not all systems use st_[cm]tim field for ns
resolution file timestamp" on 2009-03-08, has a similar bug for the
builtin-fetch-pack.c file.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This removes the last parameter of recv_sideband, by which the callers
told which channel bands #2 and #3 should be written to.
Sayeth Shawn Pearce:
The definition of the streams in the current sideband protocol
are rather well defined for the one protocol that uses it,
fetch-pack/receive-pack:
stream #1: pack data
stream #2: stderr messages, progress, meant for tty
stream #3: abort message, remote is dead, goodbye!
Since both callers of the function passed 2 for the parameter, we hereby
remove it and send bands #2 and #3 to stderr explicitly using fprintf.
This has the nice side-effect that these two streams pass through our
ANSI emulation layer on Windows.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some codepaths do not still use the ST_[CM]TIME_NSEC() pair of macros
introduced by the previous commit but assumes all systems use st_mtim
and st_ctim fields in "struct stat" to record nanosecond resolution part
of the file timestamps.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These variables were unused and can be removed safely:
builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote
builtin-fetch-pack.c::find_common(): len
builtin-remote.c::mv(): symref
diff.c::show_stats():show_stats(): total
diffcore-break.c::should_break(): base_size
fast-import.c::validate_raw_date(): date, sign
fsck.c::fsck_tree(): o_sha1, sha1
xdiff-interface.c::parse_num(): read_some
Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Traditionally, the lack of USE_NSEC meant "do not record nor use the
nanosecond resolution part of the file timestamps". To avoid problems on
filesystems that lose the ns part when the metadata is flushed to the disk
and then later read back in, disabling USE_NSEC has been a good idea in
general.
If you are on a filesystem without such an issue, it does not hurt to read
and store them in the cached stat data in the index entries even if your
git is compiled without USE_NSEC. The index left with such a version of
git can be read by git compiled with USE_NSEC and it can make use of the
nanosecond part to optimize the check to see if the path on the filesystem
hsa been modified since we last looked at.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'struct cache' does not have a 'usec' member, but a 'unsigned int
nsec' member. Simmilar 'struct stat' does not have a 'st_mtim.usec'
member, and we should instead use 'st_mtim.tv_nsec'.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
memcpy() may only be used for disjoint memory areas, but when invoked
from cmd_fetch_pack(), we have my_args == &args. (The argument cannot
be removed entirely because transport.c invokes with its own
variable.)
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This changes the "die_on_error" boolean parameter to a mere "flags", and
changes the existing callers of hold_lock_file_for_update/append()
functions to pass LOCK_DIE_ON_ERROR.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git push" enhancement allows the receiving end to report not only its own
refs but refs in repositories it borrows from via the alternate object
store mechanism. By telling the sender that objects reachable from these
extra refs are already complete in the receiving end, the number of
objects that need to be transfered can be cut down.
These entries are sent over the wire with string ".have", instead of the
actual names of the refs. This string was chosen so that they are ignored
by older programs at the sending end. If we sent some random but valid
looking refnames for these entries, "matching refs" rule (triggered when
running "git push" without explicit refspecs, where the sender learns what
refs the receiver has, and updates only the ones with the names of the
refs the sender also has) and "delete missing" rule (triggered when "git
push --mirror" is used, where the sender tells the receiver to delete the
refs it itself does not have) would try to update/delete them, which is
not what we want.
This prepares the send-pack (and "push" that runs native protocol) to
accept extended existing ref information and make use of it. The ".have"
entries are excluded from ref matching rules, and are exempt from deletion
rule while pushing with --mirror option, but are still used for pack
generation purposes by providing more "bottom" range commits.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
User notifications are presented as 'git cmd', and code comments
are presented as '"cmd"' or 'git's cmd', rather than 'git-cmd'.
Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some systems (like e.g. OpenSolaris) define pid_t as long,
therefore all our sprintf that use %i/%d cause a compiler warning
beacuse of the implicit long->int cast. To make sure that
we fit the limits, we display pids as PRIuMAX and cast them explicitly
to uintmax_t.
Signed-off-by: David Soria Parra <dsp@php.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a mechanical conversion of all '*.c' files with:
s/((?:die|error|warning)\("git)-(\S+:)/$1 $2/;
The result was manually inspected and no false positive was found.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the repo is empty, it is obvious that there are no common commits
when fetching from _anywhere_.
So there is no use in saying it in that case, and it can even be
annoying. Therefore suppress the message unilaterally if the repository
is empty prior to the fetch.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you misuse a git command, you are shown the usage string.
But this is currently shown in the dashed form. So if you just
copy what you see, it will not work, when the dashed form
is no longer supported.
This patch makes git commands show the dash-less version.
For shell scripts that do not specify OPTIONS_SPEC, git-sh-setup.sh
generates a dash-less usage string now.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When printing valuds of type uint32_t, we should use PRIu32, and should
not assume that it is unsigned int. On 32-bit platforms, it could be
defined as unsigned long. The same caution applies to ntohl().
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the repo is empty, it is obvious that there are no common commits
when fetching from _anywhere_.
So there is no use in saying it in that case, and it can even be
annoying. Therefore suppress the message unilaterally if the repository
is empty prior to the fetch.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Immediately after fetching a pack, we should call reprepare_packed_git() to
make sure the objects in the pack are reachable. Otherwise, we will fail to
look up objects that are present only in the fetched pack.
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_config() only had a function parameter, but no callback data
parameter. This assumes that all callback functions only modify
global variables.
With this patch, every callback gets a void * parameter, and it is hoped
that this will help the libification effort.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When I applied Linus's patch from the list by hand somehow I ended
up reversing the logic by mistake. This fixes it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
f3ec549 (fetch-pack: check parse_commit/object results, 2008-03-03)
broke common ancestor computation by stopping traversal when it sees
an already parsed commit. This should fix it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before the second fetch-pack connection in the same process, unmark
all of the objects marked in the first connection, in order that we'll
list them as things we have instead of thinking we've already
mentioned them.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new protocol extension "include-tag" allows the client side
of the connection (fetch-pack) to request that the server side of the
native git protocol (upload-pack / pack-objects) use --include-tag
as it prepares the packfile, thus ensuring that an annotated tag object
will be included in the resulting packfile if the object it refers to
was also included into the packfile.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By setting .in, .out, or .err members of struct child_process to -1, the
callers of start_command() can request that a pipe is allocated that talks
to the child process and one end is returned by replacing -1 with the
file descriptor.
Previously, a flag was set (for .in and .out, but not .err) to signal
finish_command() to close the pipe end that start_command() had handed out,
so it was optional for callers to close the pipe, and many already do so.
Now we make it mandatory to close the pipe.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This shares the connection between getting the remote ref list and
getting objects in the first batch. (A second connection is still used
to follow tags).
When we do not fetch objects (i.e. either ls-remote disconnects after
getting list of refs, or we decide we are already up-to-date), we
clean up the connection properly; otherwise the connection is left
open in need of cleaning up to avoid getting an error message from
the remote end when ssh is used.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_pack() receives a pair of file descriptors that communicate with
upload-pack at the remote end. In order to support the case where the
side-band demultiplexer runs in a thread, and, hence, in the same process
as the main routine, we must not close the readable file descriptor early.
The handling of the readable fd is changed in the case where upload-pack
supports side-band communication: The old code closed the fd after it was
inherited to the side-band demultiplexer process. Now we do not close it.
The caller (do_fetch_pack) will close it later anyway. The demultiplexer
is the only reader, it does not matter that the fd remains open in the
main process as well as in unpack-objects/index-pack, which inherits it.
The writable fd is not needed in get_pack(), hence, the old code closed
the fd. For symmetry with the readable fd, we now do not close it; the
caller (do_fetch_pack) will close it later anyway. Therefore, the new
behavior is that the channel now remains open during the entire
conversation, but this has no ill effects because upload-pack does not read
from it once it has begun to send the pack data. For the same reason it
does not matter that the writable fd is now inherited to the demultiplexer
and unpack-objects/index-pack processes.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The field in the args was being ignored in favor of a static constant
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Thanked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We run the sideband demultiplexer in an asynchronous function.
Note that earlier there was a check in the child process that closed
xd[1] only if it was different from xd[0]; this test is no longer needed
because git_connect() always returns two different file descriptors
(see ec587fde0a).
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This prepares the API of git_connect() and finish_connect() to operate on
a struct child_process. Currently, we just use that object as a placeholder
for the pid that we used to return. A follow-up patch will change the
implementation of git_connect() and finish_connect() to make full use
of the object.
Old code had early-return-on-error checks at the calling sites of
git_connect(), but since git_connect() dies on errors anyway, these checks
were removed.
[sp: Corrected style nit of "conn == NULL" to "!conn"]
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The only way to configure the unpacking limit is currently through
the .git/config (or ~/.gitconfig) mechanism as we have no existing
command line option interface to control this threshold on a per
invocation basis. This was intentional by design as the storage
policy of the repository should be a repository-wide decision and
should not be subject to variations made on individual command
executions.
Earlier builtin-fetch was bypassing the unpacking limit chosen by
the user through the configuration file as it did not reread the
configuration options through fetch_pack_config if we called the
internal fetch_pack() API directly. We now ensure we always run the
config file through fetch_pack_config at least once in this process,
thereby setting our unpackLimit properly.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Copying the arguments from a fetch_pack_args into static globals
within the builtin-fetch-pack module is error-prone and may lead
rise to cases where arguments supplied via the struct from the
new fetch_pack() API may not be honored by the implementation.
Here we reorganize all of the static globals into a single static
struct fetch_pack_args instance and use memcpy() to move the data
from the caller supplied structure into the globals before we
execute our pack fetching implementation. This strategy is more
robust to additions and deletions of properties.
As keep_pack is a single bit we have also introduced lock_pack to
mean not only download and store the packfile via index-pack but
also to lock it against repacking by creating a .keep file when
the packfile itself is stored. The caller must remove the .keep
file when it is safe to do so.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A long time ago Junio added this line to always ensure that the
output array created by remove_duplicates() had a NULL as its
terminating node. Today none of the downstream consumers of this
array care about a NULL terminator; they only pay attention to the
size of the array (as indicated by nr_heads). In (nearly?) all
cases passing a NULL element will cause SIGSEGV failures. So this
NULL terminal is not actually necessary.
Unfortunately we cannot continue to NULL terminate the array at
this point as the array may only have been allocated large enough
to match the input of nr_heads. If there are no duplicates than
we would be trying to store NULL into heads[nr_heads] and that may
be outside of the array.
My recent series to cleanup builtin-fetch changed the allocation of
the heads array from 256 entries to exactly nr_heads thus ensuring
we were always overstepping the array and causing memory corruption.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are using a native packfile to perform a git-fetch invocation
and the received packfile contained more than the configured limits
of fetch.unpackLimit/transfer.unpackLimit then index-pack will output
a single line saying "keep\t$sha1\n" to stdout. This line needs to
be captured and retained so we can delete the corresponding .keep
file ("$GIT_DIR/objects/pack/pack-$sha1.keep") once all refs have
been safely updated.
This trick has long been in use with git-fetch.sh and its lower level
helper git-fetch--tool as a way to allow index-pack to save the new
packfile before the refs have been updated and yet avoid a race with
any concurrently running git-repack process. It was unfortunately
lost when git-fetch.sh was converted to pure C and fetch--tool was
no longer being invoked.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The older git-fetch client did not produce all of this debugging
information to stdout. Most end-users and Porcelain (e.g. StGIT,
git-gui, qgit) do not want to see these low-level details on the
console so they should be removed.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time. There are a few files that need
to have trailing whitespaces (most notably, test vectors). The results
still passes the test, and build result in Documentation/ area is unchanged.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make git notify the user about host resolution/connection attempts.
This is useful both as a progress indicator on slow links, and helps
reassure the user there are no firewall problems.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>