Fetch over http from a repository that uses alternates to borrow
from neighbouring repositories were quite broken, apparently for
some time now.
We parse input and count bytes to allocate the new buffer, and
when we copy into that buffer we know exactly how many bytes we
want to copy from where. Using strlcpy for it was simply
stupid, and the code forgot to take it into account that strlcpy
terminated the string with NUL.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Fetch over http from a repository that uses alternates to borrow
from neighbouring repositories were quite broken, apparently for
some time now.
We parse input and count bytes to allocate the new buffer, and
when we copy into that buffer we know exactly how many bytes we
want to copy from where. Using strlcpy for it was simply
stupid, and the code forgot to take it into account that strlcpy
terminated the string with NUL.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Like xmalloc and xrealloc xstrdup dies with a useful message if
the native strdup() implementation returns NULL rather than a
valid pointer.
I just tried to use xstrdup in new code and found it to be missing.
However I expected it to be present as xmalloc and xrealloc are
already commonly used throughout the code.
[jc: removed the part that deals with last_XXX, which I am
finding more and more dubious these days.]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This abstracts away the size of the hash values when copying them
from memory location to memory location, much as the introduction
of hashcmp abstracted away hash value comparsion.
A few call sites were using char* rather than unsigned char* so
I added the cast rather than open hashcpy to be void*. This is a
reasonable tradeoff as most call sites already use unsigned char*
and the existing hashcmp is also declared to be unsigned char*.
[jc: Splitted the patch to "master" part, to be followed by a
patch for merge-recursive.c which is not in "master" yet.
Fixed the cast in the latter hunk to combine-diff.c which was
wrong in the original.
Also converted ones left-over in combine-diff.c, diff-lib.c and
upload-pack.c ]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Introduces global inline:
hashcmp(const unsigned char *sha1, const unsigned char *sha2)
Uses memcmp for comparison and returns the result based on the length of
the hash name (a future runtime decision).
Acked-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
[jc: I needed to hand merge the changes to the updated codebase,
so the result needs to be checked.]
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
As Fredrik points out the current interface of has_extension() is
potentially confusing. Its parameters include both a nul-terminated
string and a length-limited string.
This patch drops the length argument, requiring two nul-terminated
strings; all callsites are updated. I checked that all of them indeed
provide nul-terminated strings. Filenames need to be nul-terminated
anyway if they are to be passed to open() etc. The performance penalty
of the additional strlen() is negligible compared to the system calls
which inevitably surround has_extension() calls.
Additionally, change has_extension() to use size_t inside instead of
int, as that is the exact type strlen() returns and memcmp() expects.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The little helper has_extension() documents through its name what we are
trying to do and makes sure we don't forget the underrun check.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The function pull() in fetch.c calls write_ref_sha1(), which may
need committer identity to update the ref-log, so they need to
call setup_ident() before calling git_config() function.
Acked-by: Shawn Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
pull() now takes an array of arguments instead of just one of each kind.
Currently, no users use the new capability, but that'll change.
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Currently it's a bit weird that pull() takes a single argument
describing the commit but takes the write_ref from a global variable.
This makes it take that as a parameter as well, which might be nicer
for the libification in the future, but especially it will make for
nicer code when we implement pull()ing multiple commits at once.
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a really ancient remnant of the short era of delta objects stored
directly in the object database.
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This doesn't make the code uglier or harder to read, yet it makes the
code more portable. This also simplifies checking for other potential
incompatibilities. "gcc -std=c89 -pedantic" can flag many incompatible
constructs as warnings, but C99 comments will cause it to emit an error.
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This cleans up the use of safe_strncpy() even more. Since it has the
same semantics as strlcpy() use this name instead. Also move the
definition from inside path.c to its own file compat/strlcpy.c, and use
it conditionally at compile time, since some platforms already has
strlcpy(). It's included in the same way as compat/setenv.c.
Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
ANSI C99 doesn't allow void-pointer arithmetic. This patch fixes this in
various ways. Usually the strategy that required the least changes was used.
Signed-off-by: Florian Forster <octo@verplant.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Though very nice and readable, the "case 'a'...'z':" construct is not ANSI C99
compliant. This patch unfolds the range in `quote.c' and substitutes the
switch-statement with an if-statement in `http-fetch.c' and `http-push.c'.
Signed-off-by: Florian Forster <octo@verplant.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Initialize an object request's slot to a safe value. A non-NULL value
can cause a segfault if the request is aborted before it starts.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Free the curl string lists after running http_cleanup to
avoid an occasional segfault in the curl library. Seems
to only occur if the website returns a 405 error.
Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If a ref is changed by http-fetch, local-fetch or ssh-fetch
record the change and the remote URL/name in the log for the ref.
This requires loading the config file to check logAllRefUpdates.
Also fixed a bug in the ref lock generation; the log file name was
not being produced right due to a bad prefix length.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If git is not built with NO_EXPAT, this patch changes git-http-fetch to
attempt using DAV to get a list of remote packs and fall back to using
objects/info/packs if the DAV request fails.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When a repository otherwise properly prepared is served by a
dumb HTTP server that sends "No such page" output with 200
status for human consumption to a request for a page that does
not exist, the users will get an alarming "File X corrupt" error
message. Hint that they might be dealing with such a server at
the end and suggest running fsck-objects to check if the result
is OK (the pack-fallback code does the right thing in this case
so unless a loose object file was actually corrupt the result
should check OK).
Signed-off-by: Junio C Hamano <junkio@cox.net>
When fetching alternates, http-fetch may reuse the slot to fetch non-http
alternates if http-alternates does not exist. When doing so, it now needs
to update the slot's finished status so run_active_slot waits for the
non-http alternates request to finish.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
In fetch_object, there's a call to release an object request if the
object mysteriously arrived, say in a pack. Unfortunately, the fetch
attempt for this object might already be in progress, and we'll leak the
descriptor. Instead, try to tidy away the request.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
move_temp_to_file returns 0 or -1. This is not a good thing to pass to
strerror(3). Fortunately, someone already reported the error, so don't
worry too much.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
In fill_active_slots() -- if we find an object which has already arrived,
say as part of a pack, /don't/ remove it from the list. It's already been
prefetched and someone will ask for it later. Just label it as done and
carry blithely on. (As it was, the code would dereference a freed object
to continue through the list anyway.)
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
There's no need for these structures to be static, and it could potentially
cause problems down the road.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add a way to store the results of an HTTP request when a slot finishes
so the results can be processed after the slot has been reused.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Recognize missing files when using http-fetch with file:// URLs
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It failed to register the last pack in the objects/info/packs
file. Also it had an independent overrun error.
Signed-off-by: Junio C Hamano <junkio@cox.net>
These are whole-tree operations and there is not much point
making them operable from within a subdirectory, but it is easy
to do so, and using setup_git_directory() upfront helps git://
proxy specification picked up from the correct place.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Better response handling for pack list requests - a 404 means we do have
the list but it happens to be empty.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Rename object request functions and data to make it more clear which type
of request is being processed - this is a response to the introduction of
slot callbacks and the definition of different types of requests such as
alternates_request.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Move shared HTTP request functionality out of http-fetch and http-push,
and replace the two fwrite_buffer/fwrite_buffer_dynamic functions with
one fwrite_buffer function that does dynamic buffering. Use slot
callbacks to process responses to fetch object transfer requests and
push transfer requests, and put all of http-push into an #ifdef check
for curl multi support.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The current http-fetch is rather careless about fd leakage, causing
problems while fetching large repositories. This patch does not reserve
exhaustiveness, but I covered everything I spotted. I also left some
safeguards in place in case I missed something, so that we get to know,
sooner or later.
Reported by Becky Bruce <becky.bruce@freescale.com>.
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Added a call to finish_request to clean up resources if the server
returned a 404 and there are no alternates left to try.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Stop additional alternates requests from starting if one is already in
progress. This adds an optional callback which is processed after a slot
has finished running.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Otherwise, git-clone silently failed to clone a remote
repository where redirections (ie. a response with a
"Location" header line) are used.
This includes the fixes from Nick Hengeveld.
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Make a bunch of needlessly global functions static, and replace two
K&R-style declarations.
Signed-off-by: Peter Hagervall <hager@cs.umu.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When curl_message is released using curl_multi_remove_handle(), it's
contents are undefined. Therefore, get the information before releasing it.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
After using cg-update to pull, empty files named *.temp are left in
the various subdirectories of .git/objects/. These are created by
git-http-fetch to hold data as it's being fetched from the remote
repository. They are left behind after a transfer error so that the
next time git-http-fetch runs it can pick up where it left off. If
they're empty though, it would make more sense to delete them rather
than leaving them behind for the next attempt.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-http-fetch spits out curl 404 error message when unable to fetch an object,
but that's confusing since no error really happened and the object is usually
found in a pack it tries right after that. And if the object still cannot be
retrieved, it will say another error message anyway. OTOH other HTTP errors
(403 etc) are likely fatal and the user should be still informed about them.
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>