Placing `struct lock_file`s on the stack used to be a bad idea, because
the temp- and lockfile-machinery would keep a pointer into the struct.
But after 076aa2cbd (tempfile: auto-allocate tempfiles on heap,
2017-09-05), we can safely have lockfiles on the stack. (This applies
even if a user returns early, leaving a locked lock behind.)
These `struct lock_file`s are local to their respective functions and we
can drop their staticness.
For good measure, I have inspected these sites and come to believe that
they always release the lock, with the possible exception of bailing out
using `die()` or `exit()` or by returning from a `cmd_foo()`.
As pointed out by Jeff King, it would be bad if someone held on to a
`struct lock_file *` for some reason. After some grepping, I agree with
his findings: no-one appears to be doing that.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d8193743e0 (usage.c: add BUG() function, 2017-05-12), a new macro
was introduced to use for reporting bugs instead of die(). It was then
subsequently used to convert one single caller in 588a538ae5
(setup_git_env: convert die("BUG") to BUG(), 2017-05-12).
The cover letter of the patch series containing this patch
(cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not
terribly clear why only one call site was converted, or what the plan
is for other, similar calls to die() to report bugs.
Let's just convert all remaining ones in one fell swoop.
This trick was performed by this invocation:
sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert one use of EMPTY_TREE_SHA1_HEX to use empty_tree_oid_hex to
avoid a dependency on a given hash algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the GIT_SHA1_RAWSZ and GIT_SHA1_HEXSZ macros instead of hard-coding
the constants 20 and 40. Switch one use of 20 with a format specifier
for a hex value to use the hex constant instead, as the original appears
to have been a typo.
At this point, avoid converting the hard-coded use of SHA-1 to use
the_hash_algo. SHA-1, even if not collision resistant, is secure in the
context in which it is used here, and the hash algorithm of the repo
need not match what is used here. When we adopt a new hash algorithm,
we can simply adopt the new algorithm wholesale here, as the nonce is
opaque and its length and validity are entirely controlled by the
server. Consequently, defer updating this code until that point.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is more consistent with the project style. The majority of Git's
source files use dashes in preference to underscores in their file names.
Signed-off-by: Stefan Beller <sbeller@google.com>
See previous patch for explanation.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce protocol_v2, a new value for 'enum protocol_version'.
Subsequent patches will fill in the implementation of protocol_v2.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert find_unique_abbrev and find_unique_abbrev_r to each take a
pointer to struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the definition and declaration of write_sha1_file to
struct object_id and adjust usage of this function.
This commit also converts static function write_sha1_file_prepare, as it
is closely related.
Rename these functions to write_object_file and
write_object_file_prepare respectively.
Replace sha1_to_hex, hashcpy and hashclr with their oid equivalents
wherever possible.
Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach upload-pack and receive-pack to understand and respond using
protocol version 1, if requested.
Protocol version 1 is simply the original and current protocol (what I'm
calling version 0) with the addition of a single packet line, which
precedes the ref advertisement, indicating the protocol version being
spoken.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert check_connected and the callbacks it takes to use struct
object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the ref transaction code to use struct object_id. Remove one
NULL pointer check which was previously inserted around a dereference;
since we now pass a pointer to struct object_id directly through, the
code we're calling handles this for us.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows us to get rid of several write-only variables.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows us to get rid of some write-only variables, among them seven
SHA1 buffers.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return value of write_in_full() is either "-1", or the
requested number of bytes[1]. If we make a partial write
before seeing an error, we still return -1, not a partial
value. This goes back to f6aa66cb95 (write_in_full: really
write in full or return error on disk full., 2007-01-11).
So checking anything except "was the return value negative"
is pointless. And there are a couple of reasons not to do
so:
1. It can do a funny signed/unsigned comparison. If your
"len" is signed (e.g., a size_t) then the compiler will
promote the "-1" to its unsigned variant.
This works out for "!= len" (unless you really were
trying to write the maximum size_t bytes), but is a
bug if you check "< len" (an example of which was fixed
recently in config.c).
We should avoid promoting the mental model that you
need to check the length at all, so that new sites are
not tempted to copy us.
2. Checking for a negative value is shorter to type,
especially when the length is an expression.
3. Linus says so. In d34cf19b89 (Clean up write_in_full()
users, 2007-01-11), right after the write_in_full()
semantics were changed, he wrote:
I really wish every "write_in_full()" user would just
check against "<0" now, but this fixes the nasty and
stupid ones.
Appeals to authority aside, this makes it clear that
writing it this way does not have an intentional
benefit. It's a historical curiosity that we never
bothered to clean up (and which was undoubtedly
cargo-culted into new sites).
So let's convert these obviously-correct cases (this
includes write_str_in_full(), which is just a wrapper for
write_in_full()).
[1] A careful reader may notice there is one way that
write_in_full() can return a different value. If we ask
write() to write N bytes and get a return value that is
_larger_ than N, we could return a larger total. But
besides the fact that this would imply a totally broken
version of write(), it would already invoke undefined
behavior. Our internal remaining counter is an unsigned
size_t, which means that subtracting too many byte will
wrap it around to a very large number. So we'll instantly
begin reading off the end of the buffer, trying to write
gigabytes (or petabytes) of data.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function close_pack_fd() needs to be temporarily made global. Its
scope will be restored to static in a subsequent commit.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We set old_oid to NULL if we found out that it's a corrupt reference.
In that case don't try to access the hash member and pass NULL to
ref_transaction_delete() instead.
Found with Clang's UBSan.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert code that divides and rounds up to use DIV_ROUND_UP to make the
intent clearer and reduce the number of magic constants.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit f6a4e61 ("push: accept push options", 2016-07-14), send-pack
was taught to include push options both within the signed cert (if the
push is a signed push) and outside the signed cert; however,
receive-pack ignores push options within the cert, only handling push
options outside the cert.
Teach receive-pack, in the case that push options are provided for a
signed push, to verify that the push options both within the cert and
outside the cert are consistent.
This sets in stone the requirement that send-pack redundantly send its
push options in 2 places, but I think that this is better than the
alternatives. Sending push options only within the cert is
backwards-incompatible with existing Git servers (which read push
options only from outside the cert), and sending push options only
outside the cert means that the push options are not signed for.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert register_shallow and unregister_shallow to take struct
object_id. register_shallow is a caller of lookup_commit, which we will
convert later. It doesn't make sense for the registration and
unregistration functions to have incompatible interfaces, so convert
them both.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).
So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.
By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.
As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, Git's source code treats all timestamps as if they were
unsigned longs. Therefore, it is okay to write "%lu" when printing them.
There is a substantial problem with that, though: at least on Windows,
time_t is *larger* than unsigned long, and hence we will want to switch
away from the ill-specified `unsigned long` data type.
So let's introduce the pseudo format "PRItime" (currently simply being
defined to "lu") to make it easier to change the data type used for
timestamps.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, Git's source code represents all timestamps as `unsigned
long`. In preparation for using a more appropriate data type, let's
introduce a symbol `parse_timestamp` (currently being defined to
`strtoul`) where appropriate, so that we can later easily switch to,
say, use `strtoull()` instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the full hostname doesn't fit in the buffer supplied to
gethostname, POSIX does not specify whether the buffer will be
null-terminated, so to be safe, we should do it ourselves. Introduce
new function, xgethostname, which ensures that there is always a \0
at the end of the buffer.
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
POSIX limits the length of host names to HOST_NAME_MAX. Export the
fallback definition from daemon.c and use this constant to make all
buffers used with gethostname(2) big enough for any possible result
and a terminating NUL.
Inspired-by: David Turner <dturner@twosigma.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: David Turner <dturner@twosigma.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 722ff7f87 (receive-pack: quarantine objects until
pre-receive accepts, 2016-10-03), we have to feed the
pre-receive hook the tmp_objdir environment, so that git
programs run from the hook know where to find the objects.
That commit modified run_update_hook() to do the same, but
there it is a noop. By the time we get to the update hooks,
we have already migrated the objects from quarantine, and so
tmp_objdir_env() will always return NULL. We can drop this
useless call.
Note that the ordering here and the lack of support for the
update hook is intentional. The update hook calls are
interspersed with actual ref updates, and we must migrate
the objects before any refs are updated (since otherwise
those refs would appear broken to outside processes). So the
only other options are:
- remain in quarantine for the _first_ ref, but not the
others. This is sufficiently confusing that it can be
rejected outright.
- run all the individual update hooks first, then migrate,
then update all the refs. But this changes the repository
state that the update hooks see (i.e., whether or not
refs from the same push are updated yet or not).
So the functionality is fine and remains unchanged with this
patch; we're just cleaning up a useless and confusing line
of code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since this structure handles an array of object IDs, rename it to struct
oid_array. Also rename the accessor functions and the initialization
constant.
This commit was produced mechanically by providing non-Documentation
files to the following Perl one-liners:
perl -pi -E 's/struct sha1_array/struct oid_array/g'
perl -pi -E 's/\bsha1_array_/oid_array_/g'
perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g'
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make sha1_array_for_each_unique take a callback using struct object_id.
Since one of these callbacks is an argument to for_each_abbrev, convert
those as well. Rename various functions, replacing "sha1" with "oid".
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the callers to pass struct object_id by changing the function
declaration and definition and applying the following semantic patch:
@@
expression E1, E2;
@@
- sha1_array_append(E1, E2.hash)
+ sha1_array_append(E1, &E2)
@@
expression E1, E2;
@@
- sha1_array_append(E1, E2->hash)
+ sha1_array_append(E1, E2)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After receive-pack reads the pack header from the client, it
feeds the already-read part to index-pack and unpack-objects
via their --pack-header command-line options. To do so, we
format it into a fixed buffer, then duplicate it into the
child's argv_array.
Our buffer is long enough to handle any possible input, so
this isn't wrong. But it's more complicated than it needs to
be; we can just argv_array_pushf() the final value and avoid
the intermediate copy. This drops the magic number and is
more efficient, too.
Note that we need to push to the argv_array in order, which
means we can't do the push until we are in the "unpack-objects
versus index-pack" conditional. Rather than duplicate the
slightly complicated format specifier, I pushed it into a
helper function.
Signed-off-by: Jeff King <peff@peff.net>
Make the internal storage for struct sha1_array use an array of struct
object_id internally. Update the users of this struct which inspect its
internals.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert some hardcoded constants into uses of parse_oid_hex.
Additionally, convert all uses of struct command, and miscellaneous
other functions necessary for that. This work is necessary to be able
to convert sha1_array_append later on.
To avoid needing to specify a constant, reject shallow lines with the
wrong length instead of simply ignoring them.
Note that in queue_command we are guaranteed to have a NUL-terminated
buffer or at least one byte of overflow that we can safely read, so the
linelen check can be elided. We would die in such a case, but not read
invalid memory.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we had already processed the last newline in a push certificate, we
would end up subtracting NULL from the end-of-certificate pointer when
computing the length of the line. This would have resulted in an
absurdly large length, and possibly a buffer overflow. Instead,
subtract the beginning-of-certificate pointer from the
end-of-certificate pointer, which is what's expected.
Note that this situation should never occur, since not only do we
require the certificate to be newline terminated, but the signature will
only be read from the beginning of a line. Nevertheless, it seems
prudent to correct it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we will likely be introducing a new hash function at some point,
and that hash function might be longer than 20 bytes, use the constant
GIT_MAX_RAWSZ, which is designed to be suitable for allocations, instead
of GIT_SHA1_RAWSZ. This will ease the transition down the line by
distinguishing between places where we need to allocate memory suitable
for the largest hash from those where we need to handle the current
hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of counting the arguments to see if there are any and then
building the full command use a single loop and add the hook command
just before the first argument. This reduces duplication and overall
code size.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The err_fd descriptor passed to the unpack() function is
intended to be handed off to the child index-pack, and our
async muxer will read until it gets EOF. However, if we
encounter an error before handing off the descriptor, we
must manually close(err_fd). Otherwise we will be waiting
for our muxer to finish, while the muxer is waiting for EOF
on err_fd.
We fixed an identical deadlock already in 49ecfa13f
(receive-pack: close sideband fd on early pack errors,
2013-04-19). But since then, the function grew a new
early-return in 722ff7f87 (receive-pack: quarantine objects
until pre-receive accepts, 2016-10-03), when we fail to
create a temporary directory. This return needs the same
treatment.
Reported-by: Horst Schirmeier <horst@schirmeier.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few leaf functions in various files that call
resolve_refdup. Convert these functions to use struct object_id
internally to prepare for transitioning resolve_refdup itself.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We de-duplicate ".have" refs among themselves, but never
check if they are duplicates of our local refs. It's not
unreasonable that they would be if we are a "--shared" or
"--reference" clone of a similar repository; we'd have all
the same tags.
We can handle this by inserting our local refs into the
oidset, but obviously not suppressing duplicates (since the
refnames are important).
Note that this also switches the order in which we advertise
refs, processing ours first and then any alternates. The
order shouldn't matter (and arguably showing our refs first
makes more sense).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Namely, de-duplicate them. We use the same set as the
alternates, since we call them both ".have" (i.e., there is
no value in showing one versus the other).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comment claims that we handle alternate ".have" lines
through this function, but that hasn't been the case since
85f251045 (write_head_info(): handle "extra refs" locally,
2012-01-06).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you have an alternate object store with a very large
number of refs, the peak memory usage of the sha1_array can
grow high, even if most of them are duplicates that end up
not being printed at all.
The similar for_each_alternate_ref() code-paths in
fetch-pack solve this by using flags in "struct object" to
de-duplicate (and so are relying on obj_hash at the core).
But we don't have a "struct object" at all in this case. We
could call lookup_unknown_object() to get one, but if our
goal is reducing memory footprint, it's not great:
- an unknown object is as large as the largest object type
(a commit), which is bigger than an oidset entry
- we can free the memory after our ref advertisement, but
"struct object" entries persist forever (and the
receive-pack may hang around for a long time, as the
bottleneck is often client upload bandwidth).
So let's use an oidset. Note that unlike a sha1-array it
doesn't sort the output as a side effect. However, our
output is at least stable, because for_each_alternate_ref()
will give us the sha1s in ref-sorted order.
In one particularly pathological case with an alternate that
has 60,000 unique refs out of 80 million total, this reduced
the peak heap usage of "git receive-pack . </dev/null" from
13GB to 14MB.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Breaking down the fields in the interface makes it easier to
change the backend of for_each_alternate_ref to something
that doesn't use "struct ref" internally.
The only field that callers actually look at is the oid,
anyway. The refname is kept in the interface as a plausible
thing for future code to want.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
string_list_clear() handles empty lists just fine, so remove the
redundant check.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some code paths want to format multiple abbreviated sha1s in
the same output line. Because we use a single static buffer
for our return value, they have to either break their output
into several calls or allocate their own arrays and use
find_unique_abbrev_r().
Intead, let's mimic sha1_to_hex() and use a ring of several
buffers, so that the return value stays valid through
multiple calls. This shortens some of the callers, and makes
it harder to for them to make a silly mistake.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>