The comma operator is a somewhat obscure C feature that is often used by
mistake and can even cause unintentional code flow. Better use a
semicolon instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop using `the_repository` in the "credential" subsystem by passing in
a repository when filling, approving or rejecting credentials.
Adjust callers accordingly by using `the_repository`. While there may be
some callers that have a repository available in their context, this
trivial conversion allows for easier verification and bubbles up the use
of `the_repository` by one level.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After dumb-http downloads the remote info/refs file, it adds an extra
HEAD ref struct to our list by downloading the remote symref and finding
the matching ref within our list. If either of those fails, we throw
away the ref struct. But we do so with free(), when we should use
free_one_ref() to catch any embedded allocations (in particular, if
fetching the remote HEAD succeeded but the branch is unborn, its
ref->symref field will be populated but we'll still throw it all away).
This leak is triggered by t5550 (but we still have a little more work to
mark it leak-free).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Memory ownership rules for the in-core representation of
remote.*.url configuration values have been straightened out, which
resulted in a few leak fixes and code clarification.
* jk/remote-wo-url:
remote: drop checks for zero-url case
remote: always require at least one url in a remote
t5801: test remote.*.vcs config
t5801: make remote-testgit GIT_DIR setup more robust
remote: allow resetting url list
config: document remote.*.url/pushurl interaction
remote: simplify url/pushurl selection
remote: use strvecs to store remote url/pushurl
remote: transfer ownership of memory in add_url(), etc
remote: refactor alias_url() memory ownership
archive: fix check for missing url
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help
transition the codebase to rely less on the availability of the
singleton the_repository instance.
* ps/use-the-repository:
hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE`
t/helper: remove dependency on `the_repository` in "proc-receive"
t/helper: fix segfault in "oid-array" command without repository
t/helper: use correct object hash in partial-clone helper
compat/fsmonitor: fix socket path in networked SHA256 repos
replace-object: use hash algorithm from passed-in repository
protocol-caps: use hash algorithm from passed-in repository
oidset: pass hash algorithm when parsing file
http-fetch: don't crash when parsing packfile without a repo
hash-ll: merge with "hash.h"
refs: avoid include cycle with "repository.h"
global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
hash: require hash algorithm in `empty_tree_oid_hex()`
hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
hash: make `is_null_oid()` independent of `the_repository`
hash: convert `oidcmp()` and `oideq()` to compare whole hash
global: ensure that object IDs are always padded
hash: require hash algorithm in `oidread()` and `oidclr()`
hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
Building with "-Werror -Wwrite-strings" is now supported.
* ps/no-writable-strings: (27 commits)
config.mak.dev: enable `-Wwrite-strings` warning
builtin/merge: always store allocated strings in `pull_twohead`
builtin/rebase: always store allocated string in `options.strategy`
builtin/rebase: do not assign default backend to non-constant field
imap-send: fix leaking memory in `imap_server_conf`
imap-send: drop global `imap_server_conf` variable
mailmap: always store allocated strings in mailmap blob
revision: always store allocated strings in output encoding
remote-curl: avoid assigning string constant to non-const variable
send-pack: always allocate receive status
parse-options: cast long name for OPTION_ALIAS
http: do not assign string constant to non-const field
compat/win32: fix const-correctness with string constants
pretty: add casts for decoration option pointers
object-file: make `buf` parameter of `index_mem()` a constant
object-file: mark cached object buffers as const
ident: add casts for fallback name and GECOS
entry: refactor how we remove items for delayed checkouts
line-log: always allocate the output prefix
line-log: stop assigning string constant to file parent buffer
...
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.
It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.
Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit
For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).
Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the url/pushurl fields of "struct remote" own their strings, we
can switch from bare arrays to strvecs. This has a few advantages:
- push/clear are now one-liners
- likewise the free+assigns in alias_all_urls() can use
strvec_replace()
- we now use size_t for storage, avoiding possible overflow
- this will enable some further cleanups in future patches
There's quite a bit of fallout in the code that reads these fields, as
it tends to access these arrays directly. But it's mostly a mechanical
replacement of "url_nr" with "url.nr", and "url[i]" with "url.v[i]",
with a few variations (e.g. "*url" could become "*url.v", but I used
"url.v[0]" for consistency).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When processing remote options, we split the option line into two by
searching for a space. If there is one, we replace the space with '\0',
otherwise we implicitly assume that the value is "true" and thus assign
a string constant.
As the return value of strchr(3P) weirdly enough is a `char *` even
though it gets a `const char *` as input, the assigned-to variable also
is a non-constant. This is fine though because the argument is in fact
an allocated string, and thus we are allowed to modify it. But this will
break once we enable `-Wwrite-strings`.
Refactor the code stop splitting the fields with '\0' altogether.
Instead, we can pass the length of the option name to `set_option()` and
then use strncmp(3P) instead of strcmp(3P).
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before discovering the repository details, We used to assume SHA-1
as the "default" hash function, which has been corrected. Hopefully
this will smoke out codepaths that rely on such an unwarranted
assumptions.
* ps/undecided-is-not-necessarily-sha1:
repository: stop setting SHA1 as the default object hash
oss-fuzz/commit-graph: set up hash algorithm
builtin/shortlog: don't set up revisions without repo
builtin/diff: explicitly set hash algo when there is no repo
builtin/bundle: abort "verify" early when there is no repository
builtin/blame: don't access potentially unitialized `the_hash_algo`
builtin/rev-parse: allow shortening to more than 40 hex characters
remote-curl: fix parsing of detached SHA256 heads
attr: fix BUG() when parsing attrs outside of repo
attr: don't recompute default attribute source
parse-options-cb: only abbreviate hashes when hash algo is known
path: move `validate_headref()` to its only user
path: harden validation of HEAD with non-standard hashes
The credential helper protocol, together with the HTTP layer, have
been enhanced to support authentication schemes different from
username & password pair, like Bearer and NTLM.
* bc/credential-scheme-enhancement:
credential: add method for querying capabilities
credential-cache: implement authtype capability
t: add credential tests for authtype
credential: add support for multistage credential rounds
t5563: refactor for multi-stage authentication
docs: set a limit on credential line length
credential: enable state capability
credential: add an argument to keep state
http: add support for authtype and credential
docs: indicate new credential protocol fields
credential: add a field called "ephemeral"
credential: gate new fields on capability
credential: add a field for pre-encoded credentials
http: use new headers for each object request
remote-curl: reset headers on new request
credential: add an authtype field
The dumb HTTP transport tries to read the remote HEAD reference by
downloading the "HEAD" file and then parsing it via `http_fetch_ref()`.
This function will either parse the file as an object ID in case it is
exactly `the_hash_algo->hexsz` long, or otherwise it will check whether
the reference starts with "ref :" and parse it as a symbolic ref.
This is broken when parsing detached HEADs of a remote SHA256 repository
because we never update `the_hash_algo` to the discovered remote object
hash. Consequently, `the_hash_algo` will always be the fallback SHA1
hash algorithm, which will cause us to fail parsing HEAD altogteher when
it contains a SHA256 object ID.
Fix this issue by setting up `the_hash_algo` via `repo_set_hash_algo()`.
While at it, let's make the expected SHA1 fallback explicit in our code,
which also addresses an upcoming issue where we are going to remove the
SHA1 fallback for `the_hash_algo`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have the credential helper code set up to handle arbitrary
authentications schemes, let's add support for this in the HTTP code,
where we really want to use it. If we're using this new functionality,
don't set a username and password, and instead set a header wherever
we'd normally do so, including for proxy authentication.
Since we can now handle this case, ask the credential helper to enable
the appropriate capabilities.
Finally, if we're using the authtype value, set "Expect: 100-continue".
Any type of authentication that requires multiple rounds (such as NTLM
or Kerberos) requires a 100 Continue (if we're larger than
http.postBuffer) because otherwise we send the pack data before we're
authenticated, the push gets a 401 response, and we can't rewind the
stream. We don't know for certain what other custom schemes might
require this, the HTTP/1.1 standard has required handling this since
1999, the broken HTTP server for which we disabled this (Google's) is
now fixed and has been for some time, and libcurl has a 1-second
fallback in case the HTTP server is still broken. In addition, it is
not unreasonable to require compliance with a 25-year old standard to
use new Git features. For all of these reasons, do so here.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We support the new credential and authtype fields, but we lack a way to
indicate to a credential helper that we'd like them to be used. Without
some sort of indication, the credential helper doesn't know if it should
try to provide us a username and password, or a pre-encoded credential.
For example, the helper might prefer a more restricted Bearer token if
pre-encoded credentials are possible, but might have to fall back to
more general username and password if not.
Let's provide a simple way to indicate whether Git (or, for that matter,
the helper) is capable of understanding the authtype and credential
fields. We send this capability when we generate a request, and the
other side may reply to indicate to us that it does, too.
For now, don't enable sending capabilities for the HTTP code. In a
future commit, we'll introduce appropriate handling for that code,
which requires more in-depth work.
The logic for determining whether a capability is supported may seem
complex, but it is not. At each stage, we emit the capability to the
following stage if all preceding stages have declared it. Thus, if the
caller to git credential fill didn't declare it, then we won't send it
to the helper, and if fill's caller did send but the helper doesn't
understand it, then we won't send it on in the response. If we're an
internal user, then we know about all capabilities and will request
them.
For "git credential approve" and "git credential reject", we set the
helper capability before calling the helper, since we assume that the
input we're getting from the external program comes from a previous call
to "git credential fill", and thus we'll invoke send a capability to the
helper if and only if we got one from the standard input, which is the
correct behavior.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we retry a post_rpc request, we currently reuse the same headers as
before. In the future, we'd like to be able to modify them based on the
result we get back, so let's reset them on each retry so we can avoid
sending potentially duplicate headers if the values change.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix was added to work around a regression in libcURL 8.7.0 (which has
already been fixed in their tip of the tree).
* jk/libcurl-8.7-regression-workaround:
remote-curl: add Transfer-Encoding header only for older curl
INSTALL: bump libcurl version to 7.21.3
http: reset POSTFIELDSIZE when clearing curl handle
As of curl 7.66.0, we don't need to manually specify a "chunked"
Transfer-Encoding header. Instead, modern curl deduces the need for it
in a POST that has a POSTFIELDSIZE of -1 and uses READFUNCTION rather
than POSTFIELDS.
That version is recent enough that we can't just drop the header; we
need to do so conditionally. Since it's only a single line, it seems
like the simplest thing would just be to keep setting it unconditionally
(after all, the #ifdefs are much longer than the actual code). But
there's another wrinkle: HTTP/2.
Curl may choose to use HTTP/2 under the hood if the server supports it.
And in that protocol, we do not use the chunked encoding for streaming
at all. Most versions of curl handle this just fine by recognizing and
removing the header. But there's a regression in curl 8.7.0 and 8.7.1
where it doesn't, and large requests over HTTP/2 are broken (which t5559
notices). That regression has since been fixed upstream, but not yet
released.
Make the setting of this header conditional, which will let Git work
even with those buggy curl versions. And as a bonus, it serves as a
reminder that we can eventually clean up the code as we bump the
supported curl versions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation in gitremote-helpers.txt claims that helpers should
accept an object-format option from Git whose value is either:
1. "true", in which case the helper is merely told that Git
understands the special ":object-format" response, and will send it
2. an algorithm name that the helper should use
However, Git has never sent the second form, and it's not clear if it
would ever be useful.
When interacting with a remote Git repository, we generally discover
what _their_ object format is, and then decide what to do with a
mismatch (where that is currently just "bail out", but could eventually
be on-the-fly conversion and interop). And that is true for native
protocols, but also for transport helpers like remote-curl that talk to
remote Git repositories. There we send back an ":object-format" line
telling Git what remote-curl detected on the other side.
And this is true even for pushes (since we get it via receive-pack's
advertisement). And it is even true for dumb-http, as we guess at the
algorithm based on the hash size, due to ac093d0790 (remote-curl: detect
algorithm for dumb HTTP by size, 2020-06-19).
The one case where it _isn't_ true is dumb-http talking to an empty
repository. There we have no clue what the remote hash is, so
remote-curl just sends back its default. If we kept the "object-format
<algo>" form then in theory Git could say "object-format sha256" to
change that default. But it doesn't really accomplish anything. We still
may or may not be mis-matched with the other side. For a fetch that's
OK, since it's by definition a noop. For a push into an empty
repository, it might matter (though the dumb http-push DAV code seems
happy to clobber a remote sha256 info/refs and corrupt the repository).
If we want to pursue making this work, I think we'd be better off
improving detection of the object format of empty repositories over
dumb-http (e.g., an "info/object-format" file).
But what about helpers that _aren't_ talking to another Git repo?
Consider something like git-cinnabar, which is converting on the fly
to/from hg. Most of the heavy lifting is done by fast-import/export, but
some oids may still pass between Git and the helper. Could
"object-format <algo>" be useful to tell the helper what oids we expect
to see?
Possibly, but in practice this isn't necessary. Git-cinnabar for example
already peeks at the local-repo .git/config to check its object-format
(and currently just bails if it is sha256).
So I think the "object-format" extension really is only useful for the
helper telling Git what object-format it found, and not the other way
around.
Note that this patch can't break any remote helpers; we're not changing
the code on the Git side at all, but just bringing the documentation in
line with what Git has always done. It does remove the receiving support
in remote-curl.c, but that code was never actually triggered.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git archive --remote=<remote>" learned to talk over the smart
http (aka stateless) transport.
* jx/remote-archive-over-smart-http:
transport-helper: call do_take_over() in process_connect
transport-helper: call do_take_over() in connect_helper
http-backend: new rpc-service for git-upload-archive
transport-helper: protocol v2 supports upload-archive
remote-curl: supports git-upload-archive service
transport-helper: no connection restriction in connect_helper
Add new service (git-upload-archive) support in remote-curl, so we can
support remote archive over HTTP/HTTPS protocols. Differences between
git-upload-archive and other services:
1. The git-archive program does not expect to see protocol version and
capabilities when connecting to remote-helper, so do not send them
in remote-curl for the git-upload-archive service.
2. We need to detect protocol version by calling discover_refs().
Fallback to use the git-upload-pack service (which, like
git-upload-archive, is a read-only operation) to discover protocol
version.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove unused header "#include".
* en/header-cleanup:
treewide: remove unnecessary includes in source files
treewide: add direct includes currently only pulled in transitively
trace2/tr2_tls.h: remove unnecessary include
submodule-config.h: remove unnecessary include
pkt-line.h: remove unnecessary include
line-log.h: remove unnecessary include
http.h: remove unnecessary include
fsmonitor--daemon.h: remove unnecessary includes
blame.h: remove unnecessary includes
archive.h: remove unnecessary include
treewide: remove unnecessary includes in source files
treewide: remove unnecessary includes from header files
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though. Have those
source files explicitly include the headers they need.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these were checked with
gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).
...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file. These cases were:
* builtin/credential-cache.c
* builtin/pull.c
* builtin/send-pack.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable format encodes the hash function used by the repository
inside of its tables. The reftable backend thus needs to be initialized
with the correct hash function right from the start, or otherwise we may
end up writing tables with the wrong hash function. But git-clone(1)
initializes the reference database before learning about the hash
function used by the remote repository, which has never been a problem
with the reffiles backend.
To fix this, we'll have to change git-clone(1) to be more careful and
only create the reference backend once it learned about the remote hash
function. This creates a problem for git-remote-curl(1), which will then
be spawned at a time where the repository is not yet fully-initialized.
Consequentially, git-remote-curl(1) will fail to detect the repository,
which eventually causes it to error out once it is asked to fetch remote
objects.
We can address this issue by trying to re-discover the Git repository in
case none was detected at startup time. With this change, the clone will
look as following:
1. git-clone(1) sets up the initial repository, excluding the
reference database.
2. git-clone(1) spawns git-remote-curl(1), which will be unable to
detect the repository due to a missing "HEAD".
3. git-clone(1) asks git-remote-curl(1) to list remote references.
This works just fine as this step does not require a local
repository
4. git-clone(1) creates the reference database as it has now learned
about the hash function.
5. git-clone(1) asks git-remote-curl(1) to fetch the remote packfile.
The latter notices that it doesn't have a repository available, but
it now knows to try and re-discover it.
If the re-discovery succeeds in the last step we can continue with the
clone.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Further shuffling of declarations across header files to streamline
file dependencies.
* cw/compat-util-header-cleanup:
git-compat-util: move alloc macros to git-compat-util.h
treewide: remove unnecessary includes for wrapper.h
kwset: move translation table from ctype
sane-ctype.h: create header for sane-ctype macros
git-compat-util: move wrapper.c funcs to its header
git-compat-util: move strbuf.c funcs to its header
hex2chr() takes care not to run over the end of a NUL-terminated string.
It's used in packet_length(), but both callers of that function pass a
four-byte buffer, making NUL-checks unnecessary. packet_length() could
accidentally be used with a pointer to a buffer of unknown size at new
call-sites, though, and the compiler wouldn't complain.
Add a size parameter plus check, and remove the NUL-checks by calling
hexval() directly. This trades three NUL checks against one size check
and the ability to report the use of a short buffer at runtime.
If any of the four bytes is NUL or -- more generally -- not a
hexadecimal digit, then packet_length() still returns a negative value.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Header clean-up.
* en/header-split-cache-h: (24 commits)
protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
mailmap, quote: move declarations of global vars to correct unit
treewide: reduce includes of cache.h in other headers
treewide: remove double forward declaration of read_in_full
cache.h: remove unnecessary includes
treewide: remove cache.h inclusion due to pager.h changes
pager.h: move declarations for pager.c functions from cache.h
treewide: remove cache.h inclusion due to editor.h changes
editor: move editor-related functions and declarations into common file
treewide: remove cache.h inclusion due to object.h changes
object.h: move some inline functions and defines from cache.h
treewide: remove cache.h inclusion due to object-file.h changes
object-file.h: move declarations for object-file.c functions from cache.h
treewide: remove cache.h inclusion due to git-zlib changes
git-zlib: move declarations for git-zlib functions from cache.h
treewide: remove cache.h inclusion due to object-name.h changes
object-name.h: move declarations for object-name.c functions from cache.h
treewide: remove unnecessary cache.h inclusion
treewide: be explicit about dependence on mem-pool.h
treewide: be explicit about dependence on oid-array.h
...
Dozens of files made use of trace and trace2 functions, without
explicitly including trace.h or trace2.h. This made it more difficult
to find which files could remove a dependence on cache.h. Make C files
explicitly include trace.h or trace2.h if they are using them.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split key function and data structure definitions out of cache.h to
new header files and adjust the users.
* en/header-split-cleanup:
csum-file.h: remove unnecessary inclusion of cache.h
write-or-die.h: move declarations for write-or-die.c functions from cache.h
treewide: remove cache.h inclusion due to setup.h changes
setup.h: move declarations for setup.c functions from cache.h
treewide: remove cache.h inclusion due to environment.h changes
environment.h: move declarations for environment.c functions from cache.h
treewide: remove unnecessary includes of cache.h
wrapper.h: move declarations for wrapper.c functions from cache.h
path.h: move function declarations for path.c functions from cache.h
cache.h: remove expand_user_path()
abspath.h: move absolute path functions from cache.h
environment: move comment_line_char from cache.h
treewide: remove unnecessary cache.h inclusion from several sources
treewide: remove unnecessary inclusion of gettext.h
treewide: be explicit about dependence on gettext.h
treewide: remove unnecessary cache.h inclusion from a few headers
Transports that do not support protocol v2 did not correctly fall
back to protocol v0 under certain conditions, which has been
corrected.
* jk/fix-proto-downgrade-to-v0:
git_connect(): fix corner cases in downgrading v2 to v0
Dozens of files made use of gettext functions, without explicitly
including gettext.h. This made it more difficult to find which files
could remove a dependence on cache.h. Make C files explicitly include
gettext.h if they are using it.
However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's code in git_connect() that checks whether we are doing a push
with protocol_v2, and if so, drops us to protocol_v0 (since we know
how to do v2 only for fetches). But it misses some corner cases:
1. it checks the "prog" variable, which is actually the path to
receive-pack on the remote side. By default this is just
"git-receive-pack", but it could be an arbitrary string (like
"/path/to/git receive-pack", etc). We'd accidentally stay in v2
mode in this case.
2. besides "receive-pack" and "upload-pack", there's one other value
we'd expect: "upload-archive" for handling "git archive --remote".
Like receive-pack, this doesn't understand v2, and should use the
v0 protocol.
In practice, neither of these causes bugs in the real world so far. We
do send a "we understand v2" probe to the server, but since no server
implements v2 for anything but upload-pack, it's simply ignored. But
this would eventually become a problem if we do implement v2 for those
endpoints, as older clients would falsely claim to understand it,
leading to a server response they can't parse.
We can fix (1) by passing in both the program path and the "name" of the
operation. I treat the name as a string here, because that's the pattern
set in transport_connect(), which is one of our callers (we were simply
throwing away the "name" value there before).
We can fix (2) by allowing only known-v2 protocols ("upload-pack"),
rather than blocking unknown ones ("receive-pack" and "upload-archive").
That will mean whoever eventually implements v2 push will have to adjust
this list, but that's reasonable. We'll do the safe, conservative thing
(sticking to v0) by default, and anybody working on v2 will quickly
realize this spot needs to be updated.
The new tests cover the receive-pack and upload-archive cases above, and
re-confirm that we allow v2 with an arbitrary "--upload-pack" path (that
already worked before this patch, of course, but it would be an easy
thing to break if we flipped the allow/block logic without also handling
"name" separately).
Here are a few miscellaneous implementation notes, since I had to do a
little head-scratching to understand who calls what:
- transport_connect() is called only for git-upload-archive. For
non-http git remotes, that resolves to the virtual connect_git()
function (which then calls git_connect(); confused yet?). So
plumbing through "name" in connect_git() covers that.
- for regular fetches and pushes, callers use higher-level functions
like transport_fetch_refs(). For non-http git remotes, that means
calling git_connect() under the hood via connect_setup(). And that
uses the "for_push" flag to decide which name to use.
- likewise, plumbing like fetch-pack and send-pack may call
git_connect() directly; they each know which name to use.
- for remote helpers (including http), we already have separate
parameters for "name" and "exec" (another name for "prog"). In
process_connect_service(), we feed the "name" to the helper via
"connect" or "stateless-connect" directives.
There's also a "servpath" option, which can be used to tell the
helper about the "exec" path. But no helpers we implement support
it! For http it would be useless anyway (no reasonable server
implementation will allow you to send a shell command to run the
server). In theory it would be useful for more obscure helpers like
remote-ext, but even there it is not implemented.
It's tempting to get rid of it simply to reduce confusion, but we
have publicly documented it since it was added in fa8c097cc9
(Support remote helpers implementing smart transports, 2009-12-09),
so it's possible some helper in the wild is using it.
- So for v2, helpers (again, including http) are mainly used via
stateless-connect, driven by the main program. But they do still
need to decide whether to do a v2 probe. And so there's similar
logic in remote-curl.c's discover_refs() that looks for
"git-receive-pack". But it's not buggy in the same way. Since it
doesn't support servpath, it is always dealing with a "service"
string like "git-receive-pack". And since it doesn't support
straight "connect", it can't be used for "upload-archive".
So we could leave that spot alone. But I've updated it here to match
the logic we're changing in connect_git(). That seems like the least
confusing thing for somebody who has to touch both of these spots
later (say, to add v2 push support). I didn't add a new test to make
sure this doesn't break anything; we already have several tests (in
t5551 and elsewhere) that make sure we are using v2 over http.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows us to replace includes of cache.h with includes of the much
smaller alloc.h in many places. It does mean that we also need to add
includes of alloc.h in a number of C files.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-2.35:
Git 2.35.7
Git 2.34.7
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
Git 2.33.7
Git 2.32.6
Git 2.31.7
Git 2.30.8
apply: fix writing behind newly created symbolic links
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
clone: delay picking a transport until after get_repo_path()
t5619: demonstrate clone_local() with ambiguous transport
* maint-2.34:
Git 2.34.7
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
Git 2.33.7
Git 2.32.6
Git 2.31.7
Git 2.30.8
apply: fix writing behind newly created symbolic links
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
clone: delay picking a transport until after get_repo_path()
t5619: demonstrate clone_local() with ambiguous transport
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.
But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros). But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.
Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).
Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.
Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.
But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros). But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.
Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).
Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.
Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future change will want a way to download a file over HTTP(S) using
the simplest of download mechanisms. We do not want to assume that the
server on the other side understands anything about the Git protocol but
could be a simple static web server.
Create the new 'get' capability for the remote helpers which advertises
that the 'get' command is avalable. A caller can send a line containing
'get <url> <path>' to download the file at <url> into the file at
<path>.
Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git server end's ability to accept Accept-Language header was introduced
in f18604bbf2 (http: add Accept-Language header if possible, 2015-01-28),
but this is only used by very early phase of the transfer, which is HTTP
GET request to discover references. For other phases, like POST request
in the smart HTTP, the server does not know what language the client
speaks.
Teach git client to learn end-user's preferred language and throw
accept-language header to the server side. Once the server gets this header,
it has the ability to talk to end-user with language they understand.
This would be very helpful for many non-English speakers.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch --refetch" learned to fetch everything without telling
the other side what we already have, which is useful when you
cannot trust what you have in the local object store.
* rc/fetch-refetch:
docs: mention --refetch fetch option
fetch: after refetch, encourage auto gc repacking
t5615-partial-clone: add test for fetch --refetch
fetch: add --refetch option
builtin/fetch-pack: add --refetch option
fetch-pack: add refetch
fetch-negotiator: add specific noop initializer
Add a refetch option to fetch-pack to force a full fetch. Use when
applying a new partial clone filter to refetch all matching objects.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plug a trivial memory leak in code added in a2d725b7bd (Use an
external program to implement fetching with curl, 2009-08-05).
To do this have the cmd_main() use a "goto cleanup" pattern, and to
return an error of 1 unless we can fall through to the http_cleanup()
at the end.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>