In load_idx(), we check that the .idx file is sized appropriately for
the number of objects it claims to have. We recently fixed the case
where the number of objects caused our expected size to overflow a
32-bit unsigned int, and we switched to size_t.
On a 64-bit system, this is fine; our size_t covers any expected size.
On a 32-bit system, though, it won't. The file may claim to have 2^31
objects, which will overflow even a size_t.
This doesn't hurt us at all for a well-formed idx file. A 32-bit system
would already have failed to mmap such a file, since it would be too
big. But an .idx file which _claims_ to have 2^31 objects but is
actually much smaller would fool our check.
This is a broken file, and for the most part we don't care that much
what happens. But:
- it's a little friendlier to notice up front "woah, this file is
broken" than it is to get nonsense results
- later access of the data assumes that the loading function
sanity-checked that we have at least enough bytes for the regular
object-id table. A malformed .idx file could lead to an
out-of-bounds read.
So let's use our overflow-checking functions to make sure that we're not
fooled by a malformed file.
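For illustration, a rough sketch of what that check can look like with
git's st_add()/st_mult() helpers from git-compat-util.h (parameter names
mirror load_idx(), but this is not the verbatim patch); the helpers die()
instead of silently wrapping when the expected size overflows a 32-bit
size_t:

  static int check_v2_idx_size(size_t idx_size, uint32_t nr,
                               unsigned int hashsz, const char *path)
  {
          /*
           * header + fan-out table + (oid, crc32, 4-byte offset) per
           * object + trailing pack and idx checksums ...
           */
          size_t min_size = st_add(st_mult(nr, hashsz + 4 + 4),
                                   8 + 4 * 256 + hashsz + hashsz);
          /* ... plus up to (nr - 1) optional 8-byte large-offset entries */
          size_t max_size = nr ? st_add(min_size, st_mult(nr - 1, 8)) : min_size;

          if (idx_size < min_size || idx_size > max_size)
                  return error("wrong index v2 file size in %s", path);
          return 0;
  }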
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The block-sha1 implementation takes an "unsigned long" for the length of
a buffer to hash, but our hash algorithm wrappers take a size_t, as do
other implementations we support like openssl or sha1dc. On many
systems, including Linux, these two are equivalent, but they are not on
Windows (where only a "long long" is 64 bits). As a result, passing
large chunks to a single the_hash_algo->update_fn() would produce wrong
answers there.
Note that we don't need to update any other sizes outside of the
function interface. We store the cumulative size in a "long long" (which
we must do since we hash things bigger than 4GB, like packfiles, even on
32-bit platforms). And internally, we break that size_t len down into
64-byte blocks to feed into the guts of the algorithm.
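To make the failure mode concrete, here is a tiny standalone
illustration (not git code) of what the LLP64 truncation does to a
large length:

  #include <stdio.h>
  #include <stddef.h>

  /* passing a >4GB length through "unsigned long" drops the high bits
   * on LLP64 systems, while size_t keeps them */
  static void update_ulong(unsigned long len) { printf("%lu bytes\n", len); }
  static void update_sizet(size_t len)        { printf("%zu bytes\n", len); }

  int main(void)
  {
          size_t five_gib = (size_t)5 << 30;
          update_ulong(five_gib); /* 1073741824 where "unsigned long" is 32-bit */
          update_sizet(five_gib); /* 5368709120 wherever size_t is 64-bit */
          return 0;
  }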
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When checking the trailing checksum hash of a .idx file, we pass the
whole buffer (minus the trailing hash) into a single call to
the_hash_algo->update_fn(). But we cast it to an "unsigned int". This
comes from c4001d92be (Use off_t when we really mean a file offset.,
2007-03-06). That commit started storing the index_size variable as an
off_t, but our mozilla-sha1 implementation from the time was limited to
a smaller size. Presumably the cast was a way of annotating that we
expected .idx files to be small, and so we didn't need to loop (as we do
for arbitrarily-large .pack files). Though as an aside it was still
wrong, because the mozilla function actually took a signed int.
These days our hash-update functions are defined to take a size_t, so we
can pass the whole buffer in directly; the cast, which would silently
truncate the size of a .idx file larger than 4GB, can simply go away.
While we're here, let's also drop the confusing off_t variable entirely.
We're not getting the size from the filesystem anyway, but from
p->index_size, which is a size_t. In fact, we can make the code a bit
more readable by dropping the local variable that duplicates
p->index_size, and instead keeping one that stores the size of the
actual index data, minus the trailing hash.
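The resulting check looks roughly like this (a sketch assuming git's
hash API from hash.h and a mapped "struct packed_git *p", not the
verbatim patch):

  static int verify_idx_checksum(struct packed_git *p)
  {
          git_hash_ctx ctx;
          unsigned char hash[GIT_MAX_RAWSZ];
          /* size of the index data proper, excluding the trailing hash */
          size_t len = p->index_size - the_hash_algo->rawsz;

          the_hash_algo->init_fn(&ctx);
          the_hash_algo->update_fn(&ctx, p->index_data, len); /* no 32-bit cast */
          the_hash_algo->final_fn(hash, &ctx);

          if (!hasheq(hash, (const unsigned char *)p->index_data + len))
                  return error("index checksum mismatch for %s", p->pack_name);
          return 0;
  }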
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We sometimes store the offset into a pack .idx file as an "unsigned
long", but the mmap'd size of a pack .idx file can exceed 4GB. An
"unsigned long" is sufficient on LP64 systems like Linux, but is too
small on LLP64 systems like Windows, where it is still only 32 bits.
Let's use size_t, which is a better type for an offset into a memory
buffer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A pack and its matching .idx file are limited to 2^32 objects, because
the pack format contains a 32-bit field to store the number of objects.
Hence we use uint32_t in the code.
But the byte count of even a .idx file can be much larger than that,
because it stores at least a hash and an offset for each object. So
using SHA-1, a v2 .idx file will cross the 4GB boundary at 153,391,650
objects. This confuses load_idx(), which computes the minimum size like
this:
unsigned long min_size = 8 + 4*256 + nr*(hashsz + 4 + 4) + hashsz + hashsz;
Even though min_size is big enough on most 64-bit platforms, the actual
arithmetic on the right-hand side is done as a uint32_t, resulting in a
truncation. Our actual file size does exceed that truncated min_size,
but then we do:
unsigned long max_size = min_size;
if (nr)
        max_size += (nr - 1)*8;
to account for the variable-sized table. That computation doesn't
overflow quite so low, but with the truncation for min_size, we end up
with a max_size that is much smaller than our actual size. So we
complain that the idx is invalid, and can't find any of its objects.
We can fix this case by casting "nr" to a size_t, which will do the
multiplication in 64 bits (assuming you're on a 64-bit platform; this
will never work on a 32-bit system since we couldn't map the whole .idx
anyway). Likewise, we don't have to worry about further additions,
because adding a smaller number to a size_t will convert the other side
to a size_t.
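Concretely, the fix amounts to something like this (a sketch of the cast
applied to the two computations quoted above, not the verbatim patch):

  size_t min_size = 8 + 4*256 + (size_t)nr * (hashsz + 4 + 4) + hashsz + hashsz;
  size_t max_size = min_size;
  if (nr)
          max_size += (size_t)(nr - 1) * 8;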
A few notes:
- obviously we could just declare "nr" as a size_t in the first place
(and likewise, packed_git.num_objects). But it's conceptually a
uint32_t because of the on-disk format, and we correctly treat it
that way in other contexts that don't need to compute byte offsets
(e.g., iterating over the set of objects should and generally does
use a uint32_t). Switching to size_t would make all of those other
cases look wrong.
- it could be argued that the proper type is off_t to represent the
file offset. But in practice the .idx file must fit within memory,
because we mmap the whole thing. And the rest of the code (including
the idx_size variable we're comparing against) uses size_t.
- we'll add the same cast to the max_size arithmetic line. Even though
we're adding to a larger type, which will convert our result, the
multiplication is still done as a 32-bit value and can itself
overflow. I didn't check this with my test case, since it would need
an even larger pack (~530M objects), but looking at compiler output
shows that it works this way. The standard should agree, but I
couldn't find anything explicit in 6.3.1.8 ("usual arithmetic
conversions").
The case in load_idx() was the most immediate one that I was able to
trigger. After fixing it, looking up actual objects (including the very
last one in sha1 order) works in a test repo with 153,725,110 objects.
That's because bsearch_hash() works with uint32_t entry indices, and the
actual byte access:
int cmp = hashcmp(table + mi * stride, sha1);
is done with "stride" as a size_t, causing the uint32_t "mi" to be
promoted to a size_t. This is the way most code will access the index
data.
However, I audited all of the other byte-wise accesses of
packed_git.index_data, and many of the others are suspect (they are
similar to the max_size one, where we are adding to a properly sized
offset or directly to a pointer, but the multiplication in the
sub-expression can overflow). I didn't trigger any of these in practice,
but I believe they're potential problems, and certainly adding in the
cast is not going to hurt anything here.
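For illustration, the suspect pattern and the cast look roughly like
this (a hypothetical access, not taken from the actual patch):

  const unsigned char *base = p->index_data;
  /* suspect: the multiplication happens in 32 bits before the pointer add */
  const unsigned char *bad  = base + 4 * 256 + nr * (hashsz + 4);
  /* safe: the cast makes the whole sub-expression a size_t */
  const unsigned char *good = base + 4 * 256 + (size_t)nr * (hashsz + 4);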
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2.29, the "--committer-date-is-author-date" option of the "rebase"
and "am" subcommands lost the e-mail address by mistake, which has been
corrected.
* jk/committer-date-is-author-date-fix:
rebase: fix broken email with --committer-date-is-author-date
am: fix broken email with --committer-date-is-author-date
t3436: check --committer-date-is-author-date result more carefully
Commit 7573cec52c (rebase -i: support --committer-date-is-author-date,
2020-08-17) copied the committer ident-parsing code from builtin/am.c.
And in doing so, it copied a bug in which we always set the email to an
empty string. We fixed the version in git-am in the previous commit;
this commit fixes the copied code.
Reported-by: VenomVendor <info@venomvendor.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit e8cbe2118a (am: stop exporting GIT_COMMITTER_DATE, 2020-08-17)
rewrote the code for setting the committer date to use fmt_ident(),
rather than setting an environment variable and letting commit_tree()
handle it. But it introduced two bugs:
- we use the author email string instead of the committer email
- when parsing the committer ident, we used the wrong variable to
compute the length of the email, resulting in it always being a
zero-length string
This commit fixes both, which causes our test of this option via the
rebase "apply" backend to now succeed.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After running "rebase --committer-date-is-author-date", we confirm that
the committer date is the same as the author date. However, we don't
look at any other parts of the committer ident line to make sure we
didn't screw them up. And indeed, there are a few bugs here. Depending
on the rebase backend in use, we may accidentally use the author email
instead of the committer's, or even an empty string.
Let's teach our test_ctime_is_atime helper to check the committer name
and email, which reveals several failing tests.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The idea of the `SKIP_DASHED_BUILT_INS` option is to stop hard-linking
the built-in commands as separate executables. The patches to do that
specifically excluded the three commands `receive-pack`,
`upload-archive` and `upload-pack`, though: these commands are expected
to be present in the `PATH` in their dashed form on the server side of
any fetch/push.
However, due to an oversight on my part, even though those commands
were still hard-linked, they were not installed into `bin/`.
Noticed-by: Michael Forney <mforney@mforney.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name of the tool is 'git-filter-repo', not 'git-repo-filter'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* https://github.com/prati0100/git-gui:
git-gui: blame: prevent tool tips from sticking around after Command-Tab
git-gui: improve dark mode support
git-gui: fix mixed tabs and spaces; prefer tabs
Make sure `git gui blame` tooltips are destroyed once the window loses
focus on macOS.
* sh/blame-tooltip:
git-gui: blame: prevent tool tips from sticking around after Command-Tab
On Mac, tooltips are not automatically removed when a window loses
focus. Furthermore, mouse-move events are only dispatched to the active
window, which means that if we Command-tab to another application while
a tool tip is showing, the tool tip will stay there forever (in front of
other applications). So we must hide it manually when we lose focus.
Do this unconditionally here (i.e. without if {[is_MacOSX]}); it
shouldn't hurt on other platforms, even though they don't seem to have
this problem.
Signed-off-by: Stefan Haller <stefan@haller-berlin.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
I am excited, because I love languages, and because I believe this is a
way to contribute to a large number of Portuguese-speaking people.
Jiang Xin and the previous Portuguese team gave me the lead. Thank you
very much. I am honored to be part of such a project.
Signed-off-by: Daniel Santos <hello@brighterdan.com>
Merge tag 'v2.29.0-rc1' of github.com:git/git
Git 2.29-rc1
* tag 'v2.29.0-rc1' of github.com:git/git:
Git 2.29-rc1
doc: fix the bnf like style of some commands
doc: git-remote fix ups
doc: use linkgit macro where needed.
git-bisect-lk2009: make continuation of list indented
ci: do not skip tagged revisions in GitHub workflows
ci: skip GitHub workflow runs for already-tested commits/trees
tests: avoid using the branch name `main`
t1415: avoid using `main` as ref name
Makefile: ASCII-sort += lists
help: do not expect built-in commands to be hardlinked
index-pack: make get_base_data() comment clearer
index-pack: drop type_cas mutex
index-pack: restore "resolving deltas" progress meter
compat/mingw.h: drop extern from function declaration
GitHub workflow: automatically follow minor updates of setup-msbuild
t5534: split stdout and stderr redirection
Test preparation for the switch of default branch name continues.
* js/default-branch-name-part-3:
tests: avoid using the branch name `main`
t1415: avoid using `main` as ref name
The logic to skip testing on the tagged commit and the tag itself
was not quite consistent, which led to failure of the Windows test
tasks. It has been revamped to consistently skip revisions that
have already been tested, based on the tree object of the revision.
* js/ci-ghwf-dedup-tests:
ci: do not skip tagged revisions in GitHub workflows
ci: skip GitHub workflow runs for already-tested commits/trees
Doc fixes.
* ja/misc-doc-fixes:
doc: fix the bnf like style of some commands
doc: git-remote fix ups
doc: use linkgit macro where needed.
git-bisect-lk2009: make continuation of list indented
Hotfix and clean-up for the jt/threaded-index-pack topic that has
graduated to v2.29-rc0.
* jk/index-pack-hotfixes:
index-pack: make get_base_data() comment clearer
index-pack: drop type_cas mutex
index-pack: restore "resolving deltas" progress meter