This changes the pattern matching code to not store the required final
/ before the *, and then to require each side to be a valid ref (or
empty). In particular, any refspec that looks like it should be a
pattern but doesn't quite meet the requirements will be found to be
invalid as a fallback non-pattern.
This was cherry picked from commit ef00d15 (Tighten refspec processing,
2008-03-17), and two fix-up commits 46220ca (remote.c: Fix overtight
refspec validation, 2008-03-20) and 7d19da4 (refspec: allow colon-less
wildcard "refs/category/*", 2008-03-25) squashed in.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We tightened the refspec validation code in an earlier commit ef00d15
(Tighten refspec processing, 2008-03-17) per my suggestion, but the
suggestion was misguided to begin with and it broke this usage:
$ git push origin HEAD~12:master
The syntax of push refspecs and fetch refspecs are similar in that they
are both colon separated LHS and RHS (possibly prefixed with a + to
force), but the similarity ends there. For example, LHS in a push refspec
can be anything that evaluates to a valid object name at runtime (except
when colon and RHS is missing, or it is a glob), while it must be a
valid-looking refname in a fetch refspec. To validate them correctly, the
caller needs to be able to say which kind of refspecs they are. It is
unreasonable to keep a single interface that cannot tell which kind it is
dealing with, and ask it to behave sensibly.
This commit separates the parsing of the two into different functions, and
clarifies the code to implement the parsing proper (i.e. splitting into
two parts, making sure both sides are wildcard or neither side is).
This happens to also allow pushing a commit named with the esoteric "look
for that string" syntax:
$ git push ../test.git ':/remote.c: Fix overtight refspec:master'
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to commit 8320199 (Rewrite builtin-fetch option parsing to use
parse_options().), we understood '-n' as a short option to mean "don't
fetch tags from the remote". This patch reinstates behaviour similar,
but not identical to the pre commit 8320199 times.
Back then, -n always overrode --tags, so if both --tags and -n was
given on command-line, no tags were fetched regardless of argument
ordering. Now we use a "last entry wins" strategy, so '-n --tags'
means "fetch tags".
Since it's patently absurd to say both --tags and --no-tags, this
shouldn't matter in practice.
Spotted-by: Artem Zolochevskiy <azol@altlinux.org>
Reported-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Tested-by: Andreas Ericsson <ae@op5.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the remote peer upload-pack process supports the include-tag
protocol extension then we can avoid running a second fetch cycle
on the client side by letting the server send us the annotated tags
along with the objects it is packing for us. In the following graph
we can now fetch both "tag1" and "tag2" on the same connection that
we fetched "master" from the remote when we only have L available
on the local side:
T - tag1 S - tag2
/ /
L - o ------ o ------ B
\ \
\ \
origin/master master
The objects for "tag1" are implicitly downloaded without our direct
knowledge. The existing "quickfetch" optimization within git-fetch
discovers that tag1 is complete after the first connection and does
not open a second connection.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the situation is the following on the remote and L is the common
base between both sides:
T - tag1 S - tag2
/ /
L - A - O - O - B
\ \
origin/master master
and we have decided to fetch "master" to acquire the range L..B we
can also nab tag S at the same time during the first connection,
as we can clearly see from the refs advertised by upload-pack that
S^{} = B and master = B.
Unfortunately we still cannot nab T at the same time as we are not
able to see that T^{} will also be in the range implied by L..B.
Such computations must be performed on the remote side (not yet
supported) or on the client side as post-processing (the current
behavior).
This optimization is an extension of the previous one in that it
helps on projects which tend to publish both a new commit and a
new tag, then lay idle for a while before publishing anything else.
Most followers are able to download both the new commit and the new
tag in one connection, rather than two. git.git tends to follow
such patterns with its roughly once-daily updates from Junio.
A protocol extension and additional server side logic would be
necessary to also ensure T is grabbed on the first connection.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If autofollowing of tags is enabled, we see a new tag on the remote
that we don't have, and we already have the SHA-1 object that the
tag is peeled to, then we can fetch the tag while we are fetching
the other objects on the first connection.
This is a slight optimization for projects that have a habit of
tagging a release commit after most users have already seen and
downloaded that commit object through a prior fetch session. In
such cases the users may still find new objects in branch heads,
but the new tag will now also be part of the first pack transfer
and the subsequent connection to autofollow tags is not required.
Currently git.git does not benefit from this optimization as any
release usually gets a new commit at the same time that it gets a
new release tag, however git-gui.git and many other projects are
in the habit of tagging fairly old commits.
Users who did not already have the tagged commit still require
opening a second connection to autofollow the tag, as we are unable
to determine on the client side if $tag^{} will be sent to the
client during the first transfer or not. Such computation must be
performed on the remote side of the connection and is deferred to
another series of changes.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To support calling find_non_local_tags() more than once in a single
git-fetch process we need the existing_refs to be stack-allocated
so it resets on the second call. We also should free the path
lists to avoid unnecessary memory leaking.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By allowing the function to append onto the end of an existing list
we can do more interesting things, like join the list of tags we
want to fetch into the first fetch, rather than the second.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we ever decided to append onto the end of this list the tail
pointer must be looking at the right memory cell at the end of
the HEAD ref_map.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can free this ref_map as soon as the fetch is complete. It is not
used for the automatic tag following, nor is it used to disconnect the
transport. This avoids some confusion about why we are holding onto
these refs while following tags.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apparently fetch_map is passed through, but is not actually used.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This shares the connection between getting the remote ref list and
getting objects in the first batch. (A second connection is still used
to follow tags).
When we do not fetch objects (i.e. either ls-remote disconnects after
getting list of refs, or we decide we are already up-to-date), we
clean up the connection properly; otherwise the connection is left
open in need of cleaning up to avoid getting an error message from
the remote end when ssh is used.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This gets a little tricky because of the way --tags and --no-tags
are handled, and the "tag <name>" syntax needs a little hand-holding too.
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back in e3c6f240fd Junio taught
git-fetch to avoid copying objects when we are fetching from
a repository that is already registered as an alternate object
database. In such a case there is no reason to copy any objects
as we can already obtain them through the alternate.
However we need to ensure the objects are all reachable, so we
run `git rev-list --objects $theirs --not --all` to verify this.
If any object is missing or unreadable then we need to fetch/copy
the objects from the remote. When a missing object is detected
the git-rev-list process will exit with a non-zero exit status,
making this condition quite easy to detect.
Although git-fetch is currently a builtin (and so is rev-list)
we cannot invoke the traverse_objects() API at this point in the
transport code. The object walker within traverse_objects() calls
die() as soon as it finds an object it cannot read. If that happens
we want to resume the fetch process by calling do_fetch_pack().
To get around this we spawn git-rev-list into a background process
to prevent a die() from killing the foreground fetch process,
thus allowing the fetch process to resume into do_fetch_pack()
if copying is necessary.
We aren't interested in the output of rev-list (a list of SHA-1
object names that are reachable) or its errors (a "spurious" error
about an object not being found as we need to copy it) so we redirect
both stdout and stderr to /dev/null.
We run this git-rev-list based check before any fetch as we may
already have the necessary objects local from a prior fetch. If we
don't then its very likely the first $theirs object listed on the
command line won't exist locally and git-rev-list will die very
quickly, allowing us to start the network transfer. This test even
on remote URLs may save bandwidth if someone runs `git pull origin`,
sees a merge conflict, resets out, then redoes the same pull just
a short time later. If the remote hasn't changed between the two
pulls and the local repository hasn't had git-gc run in it then
there is probably no need to perform network transfer as all of
the objects are local.
Documentation for the new quickfetch function was suggested and
written by Junio, based on his original comment in git-fetch.sh.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Previously git-fetch.sh used `git cat-file -t` to determine if an
object referenced by a tag exists, and if so fetch that tag locally.
This was subtly broken during the port to C based builtin-fetch as
lookup_object() only works to locate an object if it was previously
accessed by the transport. Not all transports will access all
objects in this way, so tags were not always being fetched.
The rsync transport never loads objects into the internal object
table so automated tag following didn't work if rsync was used.
Automated tag following also didn't work on the native transport
if the new tag was behind the common point(s) negotiated between
the two ends of the connection as the tag's referrant would not
be loaded into the internal object table. Further the automated
tag following was broken with the HTTP commit walker if the new
tag's referrant was behind an existing ref, as the walker would
stop before loading the tag's referrant into the object table.
Switching to has_sha1_file() restores the original behavior from
the shell script by checking if the object exists in the ODB,
without relying on the state left behind by a transport.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
"-q" is the very first option described in the git-fetch manpage, and it
isn't supported.
Signed-off-by: Steven Grimm <koreth@midwinter.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The list of remote refs in struct transport should be const, because
builtin-fetch will get confused if it changes.
The url in git_connect should be const (and work on a copy) instead of
requiring the caller to copy it.
match_refs doesn't modify the refspecs it gets.
get_fetch_map and get_remote_ref don't change the list they get.
Allow transport get_refs_list methods to modify the struct transport.
Add a function to copy a list of refs, when a function needs a mutable
copy of a const list.
Add a function to check the type of a ref, as per the code in connect.c
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes the fetch output much more terse and prettier on a 80 column
display, based on a consensus reached on the mailing list. Here's an
example output:
Receiving objects: 100% (5439/5439), 1.60 MiB | 636 KiB/s, done.
Resolving deltas: 100% (4604/4604), done.
From git://git.kernel.org/pub/scm/git/git
! [rejected] html -> origin/html (non fast forward)
136e631..f45e867 maint -> origin/maint (fast forward)
9850e2e..44dd7e0 man -> origin/man (fast forward)
3e4bb08..e3d6d56 master -> origin/master (fast forward)
fa3665c..536f64a next -> origin/next (fast forward)
+ 4f6d9d6...768326f pu -> origin/pu (forced update)
* [new branch] todo -> origin/todo
Some portions of this patch have been extracted from earlier proposals
by Jeff King and Shawn Pearce.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the branch named with branch.$name.merge is not covered by
the fetch configuration for the remote repository named with
branch.$name.remote, we automatically add that branch to the set
of branches to be fetched. However, if the remote repository
does not have that branch (e.g. it used to exist, but got
removed), this is not a reason to fail the git-fetch itself.
The situation however will be noticed if git-fetch was called by
git-pull, as the resulting FETCH_HEAD would not have any entry
that is marked for merging.
Acked-By: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the same bug as 42a32174b6.
The warning "Object $X is a tree, not a commit" is bogus and is
not relevant here. If its not a commit we just need to make sure
we don't mark it for merge as we fill out FETCH_HEAD.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When doing "git fetch <remote>" on a remote that does not have the
branch referenced in branch.<current-branch>.merge, git fetch failed.
It failed because it tried to add the "merge" ref to the refs to be
fetched.
Fix that. And add a test case.
Incidentally, this unconvered a bug in our own test suite, where
"git pull <some-path>" was expected to merge the ref given in the
defaults, even if not pulling from the default remote.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If multiple refspecs matched the same ref, the update would be
processed multiple times. Now having the same destination for the same
source has no additional effect, and having the same destination for
different sources is an error.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This adds a verbosity level below 0 for suppressing default messages
with --quiet, and makes the default for http be verbose instead of
quiet. This matches the behavior of the shell script version of git-fetch.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The config item for a refspec side and the ref name that it matches
aren't necessarily character-for-character identical. We actually want
to merge a ref by default if: there is no per-branch config, it is the
found result of looking for the match for the first refspec, and the
first refspec is not a pattern. Beyond that, anything that
get_fetch_map() thinks matches is fine.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Aside from reducing the code by 20 lines this refactoring removes
a level of indirection when trying to access the operations of a
given transport "instance", making the code clearer and easier to
follow.
It also has the nice effect of giving us the benefits of C99 style
struct initialization (namely ".fetch = X") without requiring that
level of language support from our compiler. We don't need to worry
about new operation methods being added as they will now be NULL'd
out automatically by the xcalloc() we use to create the new struct
transport we supply to the caller.
This pattern already exists in struct walker, so we already have
a precedent for it in Git. We also don't really need to worry
about any sort of performance decreases that may occur as a result
of filling out 4-8 op pointers when we make a "struct transport".
The extra few CPU cycles this requires over filling in the "struct
transport_ops" is killed by the time it will take Git to actually
*use* one of those functions, as most transport operations are
going over the wire or will be copying object data locally between
two directories.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Anyplace we talk about the address of a remote repository we always
refer to it as a URL, especially in the configuration file and
.git/remotes where we call it "remote.$n.url" or start the first
line with "URL:". Calling this value a uri within the internal C
code just doesn't jive well with our commonly accepted terms.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a transport doesn't support an option we already are telling
the higher level application (fetch or push) that the option is not
valid by sending back a >0 return value from transport_set_option
so there's not a strong motivation to have the function perform the
output itself. Instead we should let the higher level application
do the output if it is necessary. This avoids always telling the
user that depth isn't supported on HTTP urls even when they did
not pass a --depth option to git-fetch.
If the user passes an option and the option value is invalid we now
properly die in git-fetch instead of just spitting out a message
and running anyway. This mimics prior behavior better where
incorrect/malformed options are not accepted by the process.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
My prior bug fix for git-push titled "Don't configure remote "." to
fetch everything to itself" actually broke t5520 as we were unable
to evaluate a branch configuration of:
[branch "copy"]
remote = .
merge = refs/heads/master
as remote "." did not have a "remote...fetch" configuration entry to
offer up refs/heads/master as a possible candidate available to be
fetched and merged. In shell script git-fetch and prior to the above
mentioned commit this was hardcoded for a url of "." to be the set of
local branches.
Chasing down this bug led me to the conclusion that our prior behavior
with regards to branch.$name.merge was incorrect. In the shell script
based git-fetch implementation we only fetched and merged a branch if
it appeared both in branch.$name.merge *and* in remote.$r.fetch, where
$r = branch.$name.remote. In other words in the following config file:
[remote "origin"]
url = git://git.kernel.org/pub/scm/git/git.git
fetch = refs/heads/master:refs/remotes/origin/master
[branch "master"]
remote = origin
merge = refs/heads/master
[branch "pu"]
remote = origin
merge = refs/heads/pu
Attempting to run `git pull` while on branch "pu" would always give
the user "Already up-to-date" as git-fetch did not fetch pu and thus
did not mark it for merge in .git/FETCH_HEAD. The configured merge
would always be ignored and the user would be left scratching her
confused head wondering why merge did not work on "pu" but worked
fine on "master".
If we are using the "default fetch" specification for the current
branch and the current branch has a branch.$name.merge configured
we now union it with the list of refs in remote.$r.fetch. This
way the above configuration does what the user expects it to do,
which is to fetch only "master" by default but when on "pu" to
fetch both "master" and "pu".
This uncovered some breakage in the test suite where old-style Cogito
branches (.git/branches/$r) did not fetch the branches listed in
.git/config for merging and thus did not actually merge them if the
user tried to use `git pull` on that branch. Junio and I discussed
it on list and felt that the union approach here makes more sense to
DWIM for the end-user than silently ignoring their configured request
so the test vectors for t5515 have been updated to include for-merge
lines in .git/FETCH_HEAD where they have been configured for-merge
in .git/config.
Since we are now performing a union of the fetch specification and
the merge specification and we cannot allow a branch to be listed
twice (otherwise it comes out twice in .git/FETCH_HEAD) we need to
perform a double loop here over all of the branch.$name.merge lines
and try to set their merge flag if we have already schedule that
branch for fetching by remote.$r.fetch. If no match is found then
we must add new specifications to fetch the branch but not store it
as no local tracking branch has been designated.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Older git-fetch.sh doesn't print "ref: X" when invoked as
`git fetch $url X" so we shouldn't do that now in the new
builtin version.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are running fetch in a repository that has a detached HEAD
then there is no current_branch available. In such a case any ref
that the fetch might update by definition cannot also be the current
branch so we should always bypass the "don't update HEAD" test.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't actually need to know at the time of transport_get if the
caller wants to fetch, push, or do both on the returned object.
It is easier to just delay the initialization of the HTTP walker
until we know we will need it by providing a CURL specific fetch
function in the curl_transport that makes sure the walker instance
is initialized before use.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are using a native transport and the transport chose to
save the packfile it may have created a .keep file to protect
the packfile from a concurrently running git-repack process.
In such a case the git-fetch process should make sure it will
unlink the .keep file even if it fails to update any refs as
otherwise the newly downloaded packfile's diskspace will never
be reclaimed if the objects are not actually referenced.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are using a native packfile to perform a git-fetch invocation
and the received packfile contained more than the configured limits
of fetch.unpackLimit/transfer.unpackLimit then index-pack will output
a single line saying "keep\t$sha1\n" to stdout. This line needs to
be captured and retained so we can delete the corresponding .keep
file ("$GIT_DIR/objects/pack/pack-$sha1.keep") once all refs have
been safely updated.
This trick has long been in use with git-fetch.sh and its lower level
helper git-fetch--tool as a way to allow index-pack to save the new
packfile before the refs have been updated and yet avoid a race with
any concurrently running git-repack process. It was unfortunately
lost when git-fetch.sh was converted to pure C and fetch--tool was
no longer being invoked.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit walkers need to know the SHA-1 name of any objects they
have been asked to fetch while the native pack transport only
wants to know the names of the remote refs as the remote side
must do the name->SHA-1 translation.
Since we only have three fetch implementations and one of them
(bundle) doesn't even need the name information we can reduce
the code required to perform a fetch by having just one function
and passing of the filtered list of refs to be fetched. Each
transport can then obtain the information it needs from that ref
array to construct its own internal operation state.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Conflicts:
transport.c
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Never referenced. This should actually be handled down inside
of builtin-fetch-pack, not up here in the generic user frontend.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The older git-fetch client did not produce all of this debugging
information to stdout. Most end-users and Porcelain (e.g. StGIT,
git-gui, qgit) do not want to see these low-level details on the
console so they should be removed.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are adding a space between each argument in the sprintf above
so we must account for this as we update our position within the
reflog message and append in any remaining arguments.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are fetching to a local reference (the so called peer_ref) and
the refspec that created this ref/peer_ref association had started
with '+' we are supposed to allow a non-fast-forward update during
fetch, even if --force was not supplied on the command line. The
builtin-fetch implementation was not honoring this setting as it
was copied from the wrong struct ref instance.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Thanks to Johannes Schindelin for review and fixes, and Julian
Phillips for the original C translation.
This changes a few small bits of behavior:
branch.<name>.merge is parsed as if it were the lhs of a fetch
refspec, and does not have to exactly match the actual lhs of a
refspec, so long as it is a valid abbreviation for the same ref.
branch.<name>.merge is no longer ignored if the remote is configured
with a branches/* file. Neither behavior is useful, because there can
only be one ref that gets fetched, but this is more consistant.
Also, fetch prints different information to standard out.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>