git-fetch-pack(1)
=================

NAME
----
git-fetch-pack - Receive missing objects from another repository


SYNOPSIS
--------
[verse]
'git fetch-pack' [--all] [--quiet|-q] [--keep|-k] [--thin] [--include-tag]
[--upload-pack=<git-upload-pack>]
[--depth=<n>] [--no-progress]
[-v] <repository> [<refs>...]

DESCRIPTION
-----------
Usually you would want to use 'git fetch', which is a
higher-level wrapper around this command, instead.

Invokes 'git-upload-pack' on a possibly remote repository
and asks it to send objects missing from this repository, to
update the named heads. The list of commits available locally
is determined by scanning the local refs/ hierarchy and is sent to
'git-upload-pack' running on the other end.

When the local side has no commit in common with the remote side,
this command falls back to downloading everything needed to
complete the requested refs.

OPTIONS
-------
--all::
Fetch all remote refs.

--stdin::
Take the list of refs from stdin, one per line. If there
are refs specified on the command line in addition to this
option, then the refs from stdin are processed after those
on the command line.
+
If `--stateless-rpc` is specified together with this option then
the list of refs must be in packet format (pkt-line). Each ref must
be in a separate packet, and the list must end with a flush packet.

-q::
--quiet::
Pass the `-q` flag to 'git unpack-objects'; this makes the
cloning process less verbose.

-k::
--keep::
Do not invoke 'git unpack-objects' on received data, but
create a single packfile out of it instead, and store it
in the object database. If provided twice then the pack is
locked against repacking.

--thin::
Fetch a "thin" pack, which records objects in deltified form based
on objects not included in the pack to reduce network traffic.

--include-tag::
If the remote side supports it, annotated tag objects will
be downloaded on the same connection as the other objects if
the object the tag references is downloaded. The caller must
otherwise determine the tags this option made available.

--upload-pack=<git-upload-pack>::
Use this to specify the path to 'git-upload-pack' on the
remote side, if it is not found on your $PATH.
Installations of sshd ignore the user's environment
setup scripts for login shells (e.g. .bash_profile) and
your privately installed git may not be found on the system
default $PATH. Another workaround suggested is to set
up your $PATH in ".bashrc", but this flag is for people
who do not want to pay that overhead in non-interactive
shells and so keep a lean .bashrc file (they set most of
the things up in .bash_profile).

--exec=<git-upload-pack>::
Same as --upload-pack=<git-upload-pack>.

--depth=<n>::
Limit fetching to ancestor-chains not longer than <n>.
'git-upload-pack' treats the special depth 2147483647 as
infinite even if there is an ancestor-chain that long.

--shallow-since=<date>::
Deepen or shorten the history of a shallow repository to
include all reachable commits after <date>.

--shallow-exclude=<revision>::
Deepen or shorten the history of a shallow repository to
exclude commits reachable from a specified remote branch or tag.
This option can be specified multiple times.

--deepen-relative::
The argument to `--depth` specifies the number of commits from
the current shallow boundary instead of from the tip of each
remote branch history.

--no-progress::
Do not show the progress.

--check-self-contained-and-connected::
Output "connectivity-ok" if the received pack is
self-contained and connected.

-v::
Run verbosely.

<repository>::
The URL to the remote repository.

<refs>...::
The remote heads to update from. This is relative to
$GIT_DIR (e.g. "HEAD", "refs/heads/master"). When
unspecified, update from all heads the remote side has.
+
If the remote has enabled the options `uploadpack.allowTipSHA1InWant` or
`uploadpack.allowReachableSHA1InWant`, the refs may alternatively be
given as 40-hex SHA-1 object names present on the remote.

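
The two input formats accepted by `--stdin` can be illustrated with a
short Python sketch. It builds the plain one-ref-per-line form and the
pkt-line form required together with `--stateless-rpc`. The helper names
(`pkt_line`, `refs_as_lines`, `refs_as_pkt_lines`) are hypothetical and
not part of Git. The sketch assumes the usual pkt-line framing: a
four-digit hexadecimal length that counts its own four bytes, followed by
the payload, with `0000` as the flush packet. Ending each ref with a
newline inside its packet and the maximum packet length are likewise
assumptions of this sketch.

```python
FLUSH_PKT = b"0000"   # flush packet: marks the end of the ref list
MAX_PKT_LEN = 65520   # assumed upper bound on one pkt-line, header included


def pkt_line(payload: bytes) -> bytes:
    """Frame payload as one pkt-line: 4 hex digits of total length, then payload."""
    total = len(payload) + 4  # the length field counts its own 4 bytes
    if total > MAX_PKT_LEN:
        raise ValueError("payload too long for a single pkt-line")
    return b"%04x" % total + payload


def refs_as_lines(refs):
    """Plain --stdin input: one ref per line."""
    return b"".join(ref.encode("utf-8") + b"\n" for ref in refs)


def refs_as_pkt_lines(refs):
    """--stdin with --stateless-rpc: one ref per packet, then a flush packet."""
    packets = [pkt_line(ref.encode("utf-8") + b"\n") for ref in refs]
    return b"".join(packets) + FLUSH_PKT
```

The bytes returned by `refs_as_lines` could be fed to the standard input
of `git fetch-pack --stdin <repository>`. The pkt-line form is only
useful to a caller that also speaks the `--stateless-rpc` protocol on the
same stream, since the flush packet is what lets 'git fetch-pack' stop
reading refs at an exact byte boundary.
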

SEE ALSO
--------
linkgit:git-fetch[1]

GIT
---
Part of the linkgit:git[1] suite