The post-update hooks created in t5550-http-fetch-dumb.sh is missing the
"!#/bin/sh" line which can cause issues with portability. Instead
create the hook using the 'write_script' function which includes the
proper "#!" line.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
http.c supports HTTP redirects of the form
http://foo/info/refs?service=git-upload-pack
-> http://anything
-> http://bar/info/refs?service=git-upload-pack
(that is to say, as long as the Git part of the path and the query
string is preserved in the final redirect destination, the intermediate
steps can have any URL). However, if one of the intermediate steps
results in an HTTP exception, a confusing "unable to update url base
from redirection" message is printed instead of a Curl error message
with the HTTP exception code.
This was introduced by 2 commits. Commit c93c92f ("http: update base
URLs when we see redirects", 2013-09-28) introduced a best-effort
optimization that required checking if only the "base" part of the URL
differed between the initial request and the final redirect destination,
but it performed the check before any HTTP status checking was done. If
something went wrong, the normal code path was still followed, so this
did not cause any confusing error messages until commit 6628eb4 ("http:
always update the base URL for redirects", 2016-12-06), which taught
http to die if the non-"base" part of the URL differed.
Therefore, teach http to check the HTTP status before attempting to
check if only the "base" part of the URL differed. This commit teaches
http_request_reauth to return early without updating options->base_url
upon an error; the only invoker of this function that passes a non-NULL
"options" is remote-curl.c (through "http_get_strbuf"), which only uses
options->base_url for an informational message in the situations that
this commit cares about (that is, when the return value is not HTTP_OK).
The included test checks that the redirect scheme at the beginning of
this commit message works, and that returning a 502 in the middle of the
redirect scheme produces the correct result. Note that this is different
from the test in commit 6628eb4 ("http: always update the base URL for
redirects", 2016-12-06) in that this commit tests that a Git-shaped URL
(http://.../info/refs?service=git-upload-pack) works, whereas commit
6628eb4 tests that a non-Git-shaped URL
(http://.../info/refs/foo?service=git-upload-pack) does not work (even
though Git is processing that URL) and is an error that is fatal, not
silently swallowed.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To push from or fetch to the current repository, remote helpers need
to know what repository that is. Accordingly, Git sets the GIT_DIR
environment variable to the path to the current repository when
invoking remote helpers.
There is a special case it does not handle: "git ls-remote" and "git
archive --remote" can be run to inspect a remote repository without
being run from any local repository. GIT_DIR is not useful in this
scenario:
- if we are not in a repository, we don't need to set GIT_DIR to
override an existing GIT_DIR value from the environment. If GIT_DIR
is present then we would be in a repository if it were valid and
would have called die() if it weren't.
- not setting GIT_DIR may cause a helper to do the usual discovery
walk to find the repository. But we know we're not in one, or we
would have found it ourselves. So in the worst case it may expend
a little extra effort to try to find a repository and fail (for
example, remote-curl would do this to try to find repository-level
configuration).
So leave GIT_DIR unset in this case. This makes GIT_DIR easier to
understand for remote helper authors and makes transport code less of
a special case for repository discovery.
Noticed using b1ef400e (setup_git_env: avoid blind fall-back to
".git", 2016-10-20) from 'next':
$ cd /tmp
$ git ls-remote https://kernel.googlesource.com/pub/scm/git/git
fatal: BUG: setup_git_env called without repository
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The http-walker may fetch the http-alternates (or
alternates) file from a remote in order to find more
objects. This should count as a "not from the user" use of
the protocol. But because we implement the redirection
ourselves and feed the new URL to curl, it will use the
CURLOPT_PROTOCOLS rules, not the more restrictive
CURLOPT_REDIR_PROTOCOLS.
The ideal solution would be for each curl request we make to
know whether or not is directly from the user or part of an
alternates redirect, and then set CURLOPT_PROTOCOLS as
appropriate. However, that would require plumbing that
information through all of the various layers of the http
code.
Instead, let's check the protocol at the source: when we are
parsing the remote http-alternates file. The only downside
is that if there's any mismatch between what protocol we
think it is versus what curl thinks it is, it could violate
the policy.
To address this, we'll make the parsing err on the picky
side, and only allow protocols that it can parse
definitively. So for example, you can't elude the "http"
policy by asking for "HTTP://", even though curl might
handle it; we would reject it as unknown. The only unsafe
case would be if you have a URL that starts with "http://"
but curl interprets as another protocol. That seems like an
unlikely failure mode (and we are still protected by our
base CURLOPT_PROTOCOL setting, so the worst you could do is
trigger one of https, ftp, or ftps).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit made HTTP redirects more obvious and
tightened up the default behavior. However, there's another
way for a server to ask a git client to fetch arbitrary
content: by having an http-alternates file (or a regular
alternates file, which is used as a backup).
Similar to the HTTP redirect case, a malicious server can
claim to have refs pointing at object X, return a 404 when
the client asks for X, but point to some other URL via
http-alternates, which the client will transparently fetch.
The end result is that it looks from the user's perspective
like the objects came from the malicious server, as the
other URL is not mentioned at all.
Worse, because we feed the new URL to curl ourselves, the
usual protocol restrictions do not kick in (neither curl's
default of disallowing file://, nor the protocol
whitelisting in f4113cac0 (http: limit redirection to
protocol-whitelist, 2015-09-22).
Let's apply the same rules here as we do for HTTP redirects.
Namely:
- unless http.followRedirects is set to "always", we will
not follow remote redirects from http-alternates (or
alternates) at all
- set CURLOPT_PROTOCOLS alongside CURLOPT_REDIR_PROTOCOLS
restrict ourselves to a known-safe set and respect any
user-provided whitelist.
- mention alternate object stores on stderr so that the
user is aware another source of objects may be involved
The first item may prove to be too restrictive. The most
common use of alternates is to point to another path on the
same server. While it's possible for a single-server
redirect to be an attack, it takes a fairly obscure setup
(victim and evil repository on the same host, host speaks
dumb http, and evil repository has access to edit its own
http-alternates file).
So we could make the checks more specific, and only cover
cross-server redirects. But that means parsing the URLs
ourselves, rather than letting curl handle them. This patch
goes for the simpler approach. Given that they are only used
with dumb http, http-alternates are probably pretty rare.
And there's an escape hatch: the user can allow redirects on
a specific server by setting http.<url>.followRedirects to
"always".
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We instruct curl to always follow HTTP redirects. This is
convenient, but it creates opportunities for malicious
servers to create confusing situations. For instance,
imagine Alice is a git user with access to a private
repository on Bob's server. Mallory runs her own server and
wants to access objects from Bob's repository.
Mallory may try a few tricks that involve asking Alice to
clone from her, build on top, and then push the result:
1. Mallory may simply redirect all fetch requests to Bob's
server. Git will transparently follow those redirects
and fetch Bob's history, which Alice may believe she
got from Mallory. The subsequent push seems like it is
just feeding Mallory back her own objects, but is
actually leaking Bob's objects. There is nothing in
git's output to indicate that Bob's repository was
involved at all.
The downside (for Mallory) of this attack is that Alice
will have received Bob's entire repository, and is
likely to notice that when building on top of it.
2. If Mallory happens to know the sha1 of some object X in
Bob's repository, she can instead build her own history
that references that object. She then runs a dumb http
server, and Alice's client will fetch each object
individually. When it asks for X, Mallory redirects her
to Bob's server. The end result is that Alice obtains
objects from Bob, but they may be buried deep in
history. Alice is less likely to notice.
Both of these attacks are fairly hard to pull off. There's a
social component in getting Mallory to convince Alice to
work with her. Alice may be prompted for credentials in
accessing Bob's repository (but not always, if she is using
a credential helper that caches). Attack (1) requires a
certain amount of obliviousness on Alice's part while making
a new commit. Attack (2) requires that Mallory knows a sha1
in Bob's repository, that Bob's server supports dumb http,
and that the object in question is loose on Bob's server.
But we can probably make things a bit more obvious without
any loss of functionality. This patch does two things to
that end.
First, when we encounter a whole-repo redirect during the
initial ref discovery, we now inform the user on stderr,
making attack (1) much more obvious.
Second, the decision to follow redirects is now
configurable. The truly paranoid can set the new
http.followRedirects to false to avoid any redirection
entirely. But for a more practical default, we will disallow
redirects only after the initial ref discovery. This is
enough to thwart attacks similar to (2), while still
allowing the common use of redirects at the repository
level. Since c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28) we re-root all further requests from
the redirect destination, which should generally mean that
no further redirection is necessary.
As an escape hatch, in case there really is a server that
needs to redirect individual requests, the user can set
http.followRedirects to "true" (and this can be done on a
per-server basis via http.*.followRedirects config).
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Generally remote-curl would never see a URL that did not
have "proto:" at the beginning, as that is what tells git to
run the "git-remote-proto" helper (and git-remote-http, etc,
are aliases for git-remote-curl).
However, the special syntax "proto::something" will run
git-remote-proto with only "something" as the URL. So a
malformed URL like:
http::/example.com/repo.git
will feed the URL "/example.com/repo.git" to
git-remote-http. The resulting URL has no protocol, but the
code added by 372370f (http: use credential API to handle
proxy authentication, 2016-01-26) does not handle this case
and segfaults.
For the purposes of this code, we don't really care what the
exact protocol; only whether or not it is https. So let's
just assume that a missing protocol is not, and curl will
handle the real error (which is that the URL is nonsense).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the new GIT_TRACE_CURL environment variable instead
of the deprecated GIT_CURL_VERBOSE.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before 14111fc (git: submodule honor -c credential.* from
command line, 2016-02-29), it was sufficient for code which
spawned a process in a submodule to just set the child
process's "env" field to "local_repo_env" to clear the
environment of any repo-specific variables.
That commit introduced a more complicated procedure, in
which we clear most variables but allow through sanitized
config. For C code, we used that procedure only for cloning,
but not for any of the programs spawned by submodule.c. As a
result, things like "git fetch --recurse-submodules" behave
differently than "git clone --recursive"; the former will
not pass through the sanitized config.
We can fix this by using prepare_submodule_repo_env()
everywhere in submodule.c.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 14111fc (git: submodule honor -c credential.* from
command line, 2016-02-29) taught git-submodule.sh to save
the sanitized value of $GIT_CONFIG_PARAMETERS when clearing
the environment for a submodule. However, it failed to
export the result, meaning that it had no effect for any
sub-programs.
We didn't catch this in our initial tests because we checked
only the "clone" case, which does not go through the shell
script at all. Provoking "git submodule update" to do a
fetch demonstrates the bug.
Noticed-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now we test only the cloning case, but there are other
interesting cases (e.g., fetching). Let's pull the setup
bits into their own test, which will make things flow more
logically once we start adding more tests which use the
setup.
Let's also introduce some whitespace to the clone-test to
split the two parts: making sure it fails without our
cmdline config, and that it succeeds with it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 14111fc (git: submodule honor -c credential.* from
command line, 2016-02-29) accidentally wrote $HTTP_URL. It
happened to work because we ended up with "credential..helper",
which we treat the same as "credential.helper", applying it
to all URLs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Due to the way that the git-submodule code works, it clears all local
git environment variables before entering submodules. This is normally
a good thing since we want to clear settings such as GIT_WORKTREE and
other variables which would affect the operation of submodule commands.
However, GIT_CONFIG_PARAMETERS is special, and we actually do want to
preserve these settings. However, we do not want to preserve all
configuration as many things should be left specific to the parent
project.
Add a git submodule--helper function, sanitize-config, which shall be
used to sanitize GIT_CONFIG_PARAMETERS, removing all key/value pairs
except a small subset that are known to be safe and necessary.
Replace all the calls to clear_local_git_env with a wrapped function
that filters GIT_CONFIG_PARAMETERS using the new helper and then
restores it to the filtered subset after clearing the rest of the
environment.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.
The backquoted form is the traditional method for command
substitution, and is supported by POSIX. However, all but the
simplest uses become complicated quickly. In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.
The patch was generated by:
for _f in $(find . -name "*.sh")
do
perl -i -pe 'BEGIN{undef $/;} s/`(.+?)`/\$(\1)/smg' "${_f}"
done
and then carefully proof-read.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we built git without curl, we can't actually test against
an http server. In fact, all of the test scripts which
include lib-httpd.sh already perform this check, with one
exception: t5540. For those scripts, this is a noop, and for
t5540, this is a bugfix (it used to fail when built with
NO_CURL, though it could go unnoticed if you had a stale
git-remote-https in your build directory).
Noticed-by: Junio C Hamano <junio@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests run diff or grep to produce an output, and then
compare the output to an expected value. We know the exit
code we expect these processes to have (e.g., grep yields 0
if it produced output and 1 otherwise), so it would not make
the test wrong to look for it. But the difference between
their output and the expected output (e.g., shown by
test_cmp) is much more useful to somebody debugging the test
than the test just bailing out.
These tests break the &&-chain to skip the exit-code check
of the process. However, we can get the same effect by using
test_might_fail. Note that in some cases the test did use
"|| return 1", which meant the test was not wrong, but it
did fool --chain-lint.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are tests which are missing a link in their &&-chain,
but during a setup phase. We may fail to notice failure in
commands that build the test environment, but these are
typically not expected to fail at all (but it's still good
to double-check that our test environment is what we
expect).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an Accept-Language header which indicates the user's preferred
languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
Examples:
LANGUAGE= -> ""
LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"
This gives git servers a chance to display remote error messages in
the user's preferred language.
Limit the number of languages to 1,000 because q-value must not be
smaller than 0.001, and limit the length of Accept-Language header to
4,000 bytes for some HTTP servers which cannot accept such long header.
Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time, dumb http always fetched .idx files
directly into their final location, and then checked their
validity with parse_pack_index. This was refactored in
commit 750ef42 (http-fetch: Use temporary files for
pack-*.idx until verified, 2010-04-19), which uses the
following logic:
1. If we have the idx already in place, see if it's
valid (using parse_pack_index). If so, use it.
2. Otherwise, fetch the .idx to a tempfile, check
that, and if so move it into place.
3. Either way, fetch the pack itself if necessary.
However, it got step 1 wrong. We pass a NULL path parameter
to parse_pack_index, so an existing .idx file always looks
broken. Worse, we do not treat this broken .idx as an
opportunity to re-fetch, but instead return an error,
ignoring the pack entirely. This can lead to a dumb-http
fetch failing to retrieve the necessary objects.
This doesn't come up much in practice, because it must be a
packfile that we found out about (and whose .idx we stored)
during an earlier dumb-http fetch, but whose packfile we
_didn't_ fetch. I.e., we did a partial clone of a
repository, didn't need some packfiles, and now a followup
fetch needs them.
Discovery and tests by Charles Bailey <charles@hashpling.org>.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
extract_content_type() could not extract a charset parameter if the
parameter is not the first one and there is a whitespace and a following
semicolon just before the parameter. For example:
text/plain; format=fixed ;charset=utf-8
And it also could not handle correctly some other cases, such as:
text/plain; charset=utf-8; format=fixed
text/plain; some-param="a long value with ;semicolons;"; charset=utf-8
Thanks-to: Jeff King <peff@peff.net>
Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently recognize an error message with a content-type
"text/plain; charset=utf-16" as text, but we ignore the
charset parameter entirely. Let's encode it to
log_output_encoding, which is presumably something the
user's terminal can handle.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we get a content-type from curl, we get the whole
header line, including any parameters, and without any
normalization (like downcasing or whitespace) applied.
If we later try to match it with strcmp() or even
strcasecmp(), we may get false negatives.
This could cause two visible behaviors:
1. We might fail to recognize a smart-http server by its
content-type.
2. We might fail to relay text/plain error messages to
users (especially if they contain a charset parameter).
This patch teaches the http code to extract and normalize
just the type/subtype portion of the string. This is
technically passing out less information to the callers, who
can no longer see the parameters. But none of the current
callers cares, and a future patch will add back an
easier-to-use method for accessing those parameters.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since commit 426e70d (remote-curl: show server content on
http errors, 2013-04-05), we relay any text/plain error
messages from the remote server to the user. However, we
never tested it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make clear which one is for dumb protocol, which one is for smart from
their file name.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We set the default apache port for each of the httpd tests
to the 4-digit test number of the test script. We want these
to remain unique so that the tests do not conflict with each
other when run in parallel.
Instead of doing it manually in each test script, let's just
set it from the test name at run time. This is simpler, and
is one less thing to be updated when test scripts are
renamed (e.g., when being re-rolled or when conflicting
after being merged with another topic).
Incidentally, this fixes a case where t5537 and t5538 used
the same port number (5537), and could conflict with each
other when run in parallel.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The httpd server we set up to test git's http client code
knows about a single account, in which both the username and
password are "user@host" (the unusual use of the "@" here is
to verify that we handle the character correctly when URL
escaped).
This means that we may miss a certain class of errors in
which the username and password are mixed up internally by
git. We can make our tests more robust by having distinct
values for the username and password.
In addition to tweaking the server passwd file and the
client URL, we must teach the "askpass" harness to accept
multiple values. As a bonus, this makes the setup of some
tests more obvious; when we are expecting git to ask
only about the password, we can seed the username askpass
response with a bogus value.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests just want a bit-for-bit identical copy; they do not need
even -H (there is no symbolic link involved) nor -p (there is no
funny permission or ownership issues involved).
Just use "cp -R" instead.
Signed-off-by: Ben Walton <bdwalton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The t5550 script sets up a nice askpass helper for
simulating user input and checking what git prompted for.
Let's make it available to other http scripts by migrating
it to lib-httpd.
We can use this immediately in t5540 to make our tests more
robust (previously, we did not check at all that hitting the
password-protected repo actually involved a password).
Unfortunately, we end up failing the test because the
current code erroneously prompts twice (once for
git-remote-http, and then again when the former spawns
git-http-push).
More importantly, though, it will let us easily add
smart-http authentication tests in t5541 and t5551; we
currently do not test smart-http authentication at all.
As part of making it generic, let's always look for and
store auxiliary askpass files at the top-level trash
directory; this makes it compatible with t5540, which runs
some tests from sub-repositories. We can abstract away the
ugliness with a short helper function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most of our tests, we put repos to be accessed by dumb
protocols in /dumb, and repos to be accessed by smart
protocols in /smart. In our test apache setup, the whole
/auth hierarchy requires authentication. However, we don't
bother to split it by smart and dumb here because we are not
currently testing smart-http authentication at all.
That will change in future patches, so let's be explicit
that we are interested in testing dumb access here. This
also happens to match what t5540 does for the push tests.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
HTTP authentication is currently handled by get_refs and fetch_ref, but
not by fetch_object, fetch_pack or fetch_alternates. In the
single-threaded case, this is not an issue, since get_refs is always
called first. It recognigzes the 401 and prompts the user for
credentials, which will then be used subsequently.
If the curl multi interface is used, however, only the multi handle used
by get_refs will have credentials configured. Requests made by other
handles fail with an authentication error.
Fix this by setting CURLOPT_USERPWD whenever a slot is requested.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a repo with multiple loose objects in order to demonstrate http
authentication breakage.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Subsequently we assume that there is only one pack. Currently this is
true only by accident. Pass '-a -d' to repack in order to guarantee that
assumption to hold true.
The prune-packed command is now redundant since repack -d already calls
it.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing a URL into a credential struct, we carefully
record each part of the URL, including the path on the
remote host, and use the result as part of the credential
context.
This had two practical implications:
1. Credential helpers which store a credential for later
access are likely to use the "path" portion as part of
the storage key. That means that a request to
https://example.com/foo.git
would not use the same credential that was stored in an
earlier request for:
https://example.com/bar.git
2. The prompt shown to the user includes all relevant
context, including the path.
In most cases, however, users will have a single password
per host. The behavior in (1) will be inconvenient, and the
prompt in (2) will be overly long.
This patch introduces a config option to toggle the
relevance of http paths. When turned on, we use the path as
before. When turned off, we drop the path component from the
context: helpers don't see it, and it does not appear in the
prompt.
This is nothing you couldn't do with a clever credential
helper at the start of your stack, like:
[credential "http://"]
helper = "!f() { grep -v ^path= ; }; f"
helper = your_real_helper
But doing this:
[credential]
useHttpPath = false
is way easier and more readable. Furthermore, since most
users will want the "off" behavior, that is the new default.
Users who want it "on" can set the variable (either for all
credentials, or just for a subset using
credential.*.useHttpPath).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Credential helpers can help users avoid having to type their
username and password over and over. However, some users may
not want a helper for their password, or they may be running
a helper which caches for a short time. In this case, it is
convenient to provide the non-secret username portion of
their credential via config.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The functionality for credential storage helpers is already
there; we just need to give the users a way to turn it on.
This patch provides a "credential.helper" configuration
variable which allows the user to provide one or more helper
strings.
Rather than simply matching credential.helper, we will also
compare URLs in subsection headings to the current context.
This means you can apply configuration to a subset of
credentials. For example:
[credential "https://example.com"]
helper = foo
would match a request for "https://example.com/foo.git", but
not one for "https://kernel.org/foo.git".
This is overkill for the "helper" variable, since users are
unlikely to want different helpers for different sites (and
since helpers run arbitrary code, they could do the matching
themselves anyway).
However, future patches will add new config variables where
this extra feature will be more useful.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch converts the http code to use the new credential
API, both for http authentication as well as for getting
certificate passwords.
Most of the code change is simply variable naming (the
passwords are now contained inside the credential struct)
or deletion of obsolete code (the credential code handles
URL parsing and prompting for us).
The behavior should be the same, with one exception: the
credential code will prompt with a description based on the
credential components. Therefore, the old prompt of:
Username for 'example.com':
Password for 'example.com':
now looks like:
Username for 'https://example.com/repo.git':
Password for 'https://user@example.com/repo.git':
Note that we include more information in each line,
specifically:
1. We now include the protocol. While more noisy, this is
an important part of knowing what you are accessing
(especially if you care about http vs https).
2. We include the username in the password prompt. This is
not a big deal when you have just been prompted for it,
but the username may also come from the remote's URL
(and after future patches, from configuration or
credential helpers). In that case, it's a nice
reminder of the user for which you're giving the
password.
3. We include the path component of the URL. In many
cases, the user won't care about this and it's simply
noise (i.e., they'll use the same credential for a
whole site). However, that is part of a larger
question, which is whether path components should be
part of credential context, both for prompting and for
lookup by storage helpers. That issue will be addressed
as a whole in a future patch.
Similarly, for unlocking certificates, we used to say:
Certificate Password for 'example.com':
and we now say:
Password for 'cert:///path/to/certificate':
Showing the path to the client certificate makes more sense,
as that is what you are unlocking, not "example.com".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This didn't have an impact, because it was just setting up
an "expect" file that happened to be identical to the one in
the test before it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test 5550 was apparently using the default port number by mistake.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Until now, a request for an http password looked like:
Username:
Password:
Now it will look like:
Username for 'example.com':
Password for 'example.com':
Picked-from: Jeff King <peff@peff.net>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These just checked that we could clone a repository when the
username and password were given in the URL; we should also
check that git will prompt when no or partial credentials
are given.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the authentification initialisation to percent-decode username
and password for HTTP URLs.
Signed-off-by: Gabriel Corona <gabriel.corona@enst-bretagne.fr>
Acked-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test for HTTP authentication and proper percent-decoding of the
userinfo (username and password) part of the URL.
Signed-off-by: Gabriel Corona <gabriel.corona@enst-bretagne.fr>
Acked-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
SKIP messages are now part of the TAP plan. A TAP harness now knows
why a particular test was skipped and can report that information. The
non-TAP harness built into Git's test-lib did nothing special with
these messages, and is unaffected by these changes.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Verify that a downloaded pack-*.idx file is consistent and valid
as an index file before we rename it into its final destination.
This prevents a corrupt index file from later being treated as a
usable file, confusing readers.
Check that we do not have the pack index file before invoking
fetch_pack_index(); that way, we can do without the has_pack_index()
check in fetch_pack_index().
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To ensure we don't leave a corrupt pack file positioned as though
it were a valid pack file, run index-pack on the temporary pack
before we rename it to its final name. If index-pack crashes out
when it discovers file corruption (e.g. GitHub's error HTML at the
end of the file), simply delete the temporary files to cleanup.
By waiting until the pack has been validated before we move it
to its final name, we eliminate a race condition where another
concurrent reader might try to access the pack at the same time
that we are still trying to verify its not corrupt.
Switching from verify-pack to index-pack is a change in behavior,
but it should turn out better for users. The index-pack algorithm
tries to minimize disk seeks, as well as the number of times any
given object is inflated, by organizing its work along delta chains.
The verify-pack logic does not attempt to do this, thrashing the
delta base cache and the filesystem cache.
By recreating the index file locally, we also can automatically
upgrade from a v1 pack table of contents to v2. This makes the
CRC32 data available for use during later repacks, even if the
server didn't have them on hand.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Acked-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change into the server repository's directory using a subshell,
so we can return back to the top of the trash directory before
doing anything more in the test script.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The top level directory "/smart/" of the test Apache server is mapped
through our git-http-backend CGI, but uses the same underlying
repository space as the server's document root. This is the most
simple installation possible.
Server logs are checked to verify the client has accessed only the
smart URLs during the test. During fetch testing the headers are
also logged from libcurl to ensure we are making a reasonably sane
HTTP request, and getting back reasonably sane response headers
from the CGI.
When validating the request headers used during smart fetch we munge
away the actual Content-Length and replace it with the placeholder
"xxx". This avoids unnecessary varability in the test caused by
an unrelated change in the requested capabilities in the first want
line of the request. However, we still want to look for and verify
that Content-Length was used, because smaller payloads should be
using Content-Length and not "Transfer-Encoding: chunked".
When validating the server response headers we must discard both
Content-Length and Transfer-Encoding, as Apache2 can use either
format to return our response.
During development of this test I observed Apache returning both
forms, depending on when the processes got CPU time. If our CGI
returned the pack data quickly, Apache just buffered the whole
thing and returned a Content-Length. If our CGI took just a bit
too long to complete, Apache flushed its buffer and instead used
"Transfer-Encoding: chunked".
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>