Commit Graph

75503 Commits (e03d2a9ccb88c7ff42237f5890a05e071497f8ae)

Author SHA1 Message Date
Jeff King 63aca3f7f1 dumb-http: store downloaded pack idx as tempfile
This patch fixes a regression in b1b8dfde69 (finalize_object_file():
implement collision check, 2024-09-26) where fetching a v1 pack idx file
over the dumb-http protocol would cause the fetch to fail.

The core of the issue is that dumb-http stores the idx we fetch from the
remote at the same path that will eventually hold the idx we generate
from "index-pack --stdin". The sequence is something like this:

  0. We realize we need some object X, which we don't have locally, and
     nor does the other side have it as a loose object.

  1. We download the list of remote packs from objects/info/packs.

  2. For each entry in that file, we download each pack index and store
     it locally in .git/objects/pack/pack-$hash.idx (the $hash is not
     something we can verify yet and is given to us by the remote).

  3. We check each pack index we got to see if it has object X. When we
     find a match, we download the matching .pack file from the remote
     to a tempfile. We feed that to "index-pack --stdin", which
     reindexes the pack, rather than trusting that it has what the other
     side claims it does. In most cases, this will end up generating the
     exact same (byte-for-byte) pack index which we'll store at the same
     pack-$hash.idx path, because the index generation and $hash id are
     computed based on what's in the packfile. But:

       a. The other side might have used other options to generate the
          index. For instance we use index v2 by default, but long ago
          it was v1 (and you can still ask for v1 explicitly).

       b. The other side might even use a different mechanism to
          determine $hash. E.g., long ago it was based on the sorted
          list of objects in the packfile, but we switched to using the
          pack checksum in 1190a1acf8 (pack-objects: name pack files
          after trailer hash, 2013-12-05).

The regression we saw in the real world was (3a). A recent client
fetching from a server with a v1 index downloaded that index, then
complained about trying to overwrite it with its own v2 index. This
collision is otherwise harmless; we know we want to replace the remote
version with our local one, but the collision check doesn't realize
that.

There are a few options to fix it:

  - we could teach index-pack a command-line option to ignore only pack
    idx collisions, and use it when the dumb-http code invokes
    index-pack. This would be an awkward thing to expose users to and
    would involve a lot of boilerplate to get the option down to the
    collision code.

  - we could delete the remote .idx file right before running
    index-pack. It should be redundant at that point (since we've just
    downloaded the matching pack). But it feels risky to delete
    something from our own .git/objects based on what the other side has
    said. I'm not entirely positive that a malicious server couldn't lie
    about which pack-$hash.idx it has and get us to delete something
    precious.

  - we can stop co-mingling the downloaded idx files in our local
    objects directory. This is a slightly bigger change but I think
    fixes the root of the problem more directly.

This patch implements the third option. The big design questions are:
where do we store the downloaded files, and how do we manage their
lifetimes?

There are some additional quirks to the dumb-http system we should
consider. Remember that in step 2 we downloaded every pack index, but in
step 3 we may only download some of the matching packs. What happens to
those other idx files now? They sit in the .git/objects/pack directory,
possibly waiting to be used at a later date. That may save bandwidth for
a subsequent fetch, but it also creates a lot of weird corner cases:

  - our local object directory now has semi-untrusted .idx files sitting
    around, without their matching .pack

  - in case 3b, we noted that we might not generate the same hash as the
    other side. In that case even if we download the matching pack,
    our index-pack invocation will store it in a different
    pack-$hash.idx file. And the unmatched .idx will sit there forever.

  - if the server repacks, it may delete the old packs. Now we have
    these orphaned .idx files sitting around locally that will never be
    used (nor deleted).

  - if we repack locally we may delete our local version of the server's
    pack index and not realize we have it. So we'll download it again,
    even though we have all of the objects it mentions.

I think the right solution here is probably some more complex cache
management system: download the remote .idx files to their own storage
directory, mark them as "seen" when we get their matching pack (to avoid
re-downloading even if we repack), and then delete them when the
server's objects/info/refs no longer mentions them.

But since the dumb http protocol is so ancient and so inferior to the
smart http protocol, I don't think it's worth spending a lot of time
creating such a system. For this patch I'm just downloading the idx
files to .git/objects/tmp_pack_*, and marking them as tempfiles to be
deleted when we exit (and due to the name, any we miss due to a crash,
etc, should eventually be removed by "git gc" runs based on timestamps).

That is slightly worse for one case: if we download an idx but not the
matching pack, we won't retain that idx for subsequent runs. But the
flip side is that we're making other cases better (we never hold on to
useless idx files forever). I suspect that worse case does not even come
up often, since it implies that the packs are generated to match
distinct parts of history (i.e., in practice even in a repo with many
packs you're going to end up grabbing all of those packs to do a clone).
If somebody really cares about that, I think the right path forward is a
managed cache directory as above, and this patch is providing the first
step in that direction anyway (by moving things out of the objects/pack/
directory).

There are two test changes. One demonstrates the broken v1 index case
(it double-checks the resulting clone with fsck to be careful, but prior
to this patch it actually fails at the clone step). The other tweaks the
expectation for a test that covers the "slightly worse" case to
accommodate the extra index download.

The code changes are fairly simple. We stop using finalize_object_file()
to copy the remote's index file into place, and leave it as a tempfile.
We give the tempfile a real ".idx" name, since the packfile code expects
that, and thus we make sure it is out of the usual packs/ directory (so
we'd never mistake it for a real local .idx).

We also have to change parse_pack_index(), which creates a temporary
packed_git to access our index (we need this because all of the pack idx
code assumes we have that struct). It reads the index data from the
tempfile, but prior to this patch would speculatively write the
finalized name into the packed_git struct using the pack-$hash we expect
to use.

I was mildly surprised that this worked at all, since we call
verify_pack_index() on the packed_git which mentions the final name
before moving the file into place! But it works because
parse_pack_index() leaves the mmap-ed data in the struct, so the
lazy-open in verify_pack_index() never triggers, and we read from the
tempfile, ignoring the filename in the struct completely. Hacky, but it
works.

After this patch, parse_pack_index() now uses the index filename we pass
in to derive a matching .pack name. This is OK to change because there
are only two callers, both in the dumb http code (and the other passes
in an existing pack-$hash.idx name, so the derived name is going to be
pack-$hash.pack, which is what we were using anyway).

I'll follow up with some more cleanups in that area, but this patch is
sufficient to fix the regression.

Reported-by: fox <fox.gbr@townlong-yak.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
Jeff King 019b21d402 t5550: count fetches in "previously-fetched .idx" test
We have a test in t5550 that looks at index fetching over dumb http. It
creates two branches, each of which is completely stored in its own
pack, then fetches the branches independently. What should (and does)
happen is that the first fetch grabs both .idx files and one .pack file,
and then the fetch of the second branch re-uses the previously
downloaded .idx files (fetching none) and grabs the now-required .pack
file.

Since the next few patches will be touching this area of the code, let's
beef up the test a little by checking that we're downloading the
expected items at each step.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
Jeff King 8b5763e8fa midx: avoid duplicate packed_git entries
When we scan a pack directory to load the idx entries we find into the
packed_git list, we skip any of them that are contained in a midx. We
then load them later lazily if we actually need to access the
corresponding pack, referencing them both from the midx struct and the
packed_git list.

The lazy-load in the midx code checks to see if the midx already
mentions the pack, but doesn't otherwise check the packed_git list. This
makes sense, since we should have added any pack to both lists.

But there's a loophole! If we call close_object_store(), that frees the
midx entirely, but _not_ the packed_git structs, which we must keep
around for Reasons[1]. If we then try to look up more objects, we'll
auto-load the midx again, which won't realize that we've already loaded
those packs, and will create duplicate entries in the packed_git list.

This is possibly inefficient, because it means we may open and map the
pack redundantly. But it can also lead to weird user-visible behavior.
The case I found is in "git repack", which closes and reopens the midx
after repacking and then calls update_server_info(). We end up writing
the duplicate entries into objects/info/packs.

We could obviously de-dup them while writing that file, but it seems
like a violation of more core assumptions that we end up with these
duplicate structs at all.

We can avoid the duplicates reasonably efficiently by checking their
names in the pack_map hash. This annoyingly does require a little more
than a straight hash lookup due to the naming conventions, but it should
only happen when we are about to actually open a pack. I don't think one
extra malloc will be noticeable there.

[1] I'm not entirely sure of all the details, except that we generally
    assume the packed_git structs never go away. We noted this
    restriction in the comment added by 6f1e9394e2 (object: fix leaking
    packfiles when closing object store, 2024-08-08), but it's somewhat
    vague. At any rate, if you try freeing the structs in
    close_object_store(), you can observe segfaults all over the test
    suite. So it might be fixable, but it's not trivial.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00
Taylor Blau 6a11438f43 The fifth batch 2024-10-25 14:11:13 -04:00
Taylor Blau 55d12c24d7 Merge branch 'wm/shortlog-hash'
Teaches 'shortlog' to explicitly use SHA-1 when operating outside of
a repository.

* wm/shortlog-hash:
  builtin/shortlog: explicitly set hash algo when there is no repo
2024-10-25 14:02:49 -04:00
Taylor Blau fcaac14abf Merge branch 'sk/msvc-warnings'
Fixes compile time warnings with 64-bit MSVC.

* sk/msvc-warnings:
  mingw.c: Fix complier warnings for a 64 bit msvc
2024-10-25 14:02:44 -04:00
Taylor Blau 0ab43ed95c Merge branch 'jc/a-commands-without-the-repo'
Commands that can also work outside Git have learned to take the
repository instance "repo" when we know we are in a repository, and
NULL when we are not, in a parameter.  The uses of the_repository
variable in a few of them have been removed using the new calling
convention.

* jc/a-commands-without-the-repo:
  archive: remove the_repository global variable
  annotate: remove usage of the_repository global
  git: pass in repo to builtin based on setup_git_directory_gently
2024-10-25 14:02:36 -04:00
Taylor Blau dca32b8288 Merge branch 'pb/clar-build-fix'
Build fix.

* pb/clar-build-fix:
  Makefile: fix dependency for $(UNIT_TEST_DIR)/clar/clar.o
2024-10-25 14:02:25 -04:00
Taylor Blau 448022a7fb Merge branch 'bf/t-readme-mention-reftable'
Doc update.

* bf/t-readme-mention-reftable:
  t/README: add missing value for GIT_TEST_DEFAULT_REF_FORMAT
2024-10-25 14:02:21 -04:00
Taylor Blau f25bb60393 Merge branch 'ak/typofix'
More typofixes.

* ak/typofix:
  t: fix typos
2024-10-25 14:02:08 -04:00
Taylor Blau 4d334e5205 Merge branch 'ak/typofixes'
Typofixes.

* ak/typofixes:
  t: fix typos
  t/helper: fix a typo
  t/perf: fix typos
  t/unit-tests: fix typos
  contrib: fix typos
  compat: fix typos
2024-10-25 14:02:04 -04:00
Taylor Blau 55bc7d54ab Merge branch 'ps/ci-gitlab-windows'
Enable Windows-based CI in GitLab.

* ps/ci-gitlab-windows:
  gitlab-ci: exercise Git on Windows
  gitlab-ci: introduce stages and dependencies
  ci: handle Windows-based CI jobs in GitLab CI
  ci: create script to set up Git for Windows SDK
  t7300: work around platform-specific behaviour with long paths on MinGW
2024-10-25 14:01:21 -04:00
Taylor Blau 6cbcc68ea7 Merge branch 'db/submodule-fetch-with-remote-name-fix'
A "git fetch" from the superproject going down to a submodule used
a wrong remote when the default remote names are set differently
between them.

* db/submodule-fetch-with-remote-name-fix:
  submodule: correct remote name with fetch
2024-10-25 14:01:09 -04:00
Usman Akinyemi e226ba81a2 imap: replace atoi() with strtol_i() for UIDVALIDITY and UIDNEXT parsing
Replace unsafe uses of atoi() with strtol_i() to improve error handling
when parsing UIDVALIDITY, UIDNEXT, and APPENDUID in IMAP commands.
Invalid values, such as those with letters, now trigger error messages and
prevent malformed status responses.
I did not add any test for this commit as we do not have any test
for git-imap-send(1) at this point.

Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24 14:03:44 -04:00
Usman Akinyemi e36f009e69 merge: replace atoi() with strtol_i() for marker size validation
Replace atoi() with strtol_i() for parsing conflict-marker-size to
improve error handling. Invalid values, such as those containing letters
now trigger a clear error message.
Update the test to verify invalid input handling.

Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24 14:03:44 -04:00
Usman Akinyemi cc4023477f daemon: replace atoi() with strtoul_ui() and strtol_i()
Replace atoi() with strtoul_ui() for --timeout and --init-timeout
(non-negative integers) and with strtol_i() for --max-connections
(signed integers). This improves error handling and input validation
by detecting invalid values and providing clear error messages.
Update tests to ensure these arguments are properly validated.

Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24 14:03:43 -04:00
Karthik Nayak be75cec1b6 CodingGuidelines: discourage arbitrary suffixes in function names
We often name functions with arbitrary suffixes like `_1` as an
extension of another existing function. This creates confusion and
doesn't provide good clarity into the functions purpose. Let's document
good function naming etiquette in our CodingGuidelines.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24 12:51:30 -04:00
Andrew Kreimer f56f9d6c0b t: fix typos
Fix typos and grammar in documentation, comments, etc.

Via codespell.

Reported-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24 12:45:53 -04:00
Kristoffer Haugsbakk 0fcd473fdd t7001: add failure test which triggers assertion
`git mv a/a.txt a b/` is a nonsense instruction.  Instead of failing
gracefully the command trips over itself,[1] leaving behind unfinished
work:

1. first it moves `a/a.txt` to `b/a.txt`; then
2. tries to move `a/`, including `a/a.txt`; then
3. figures out that it’s in a bad state (assertion); and finally
4. aborts.

Now you’re left with a partially-updated index.

The command should instead fail gracefully and make no changes to the
index until it knows that it can complete a sensible action.

For now just add a failing test since this has been known about for
a while.[2]

† 1: Caused by a `pos >= 0` assertion
[2]: https://lore.kernel.org/git/d1f739fe-b28e-451f-9e01-3d2e24a0fe0d@app.fastmail.com/

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:36:15 -04:00
brian m. carlson 5f139a194f gitweb: make use of s///r
In Perl 5.14, released in May 2011, the r modifier was added to the s///
operator to allow it to return the modified string instead of modifying
the string in place. This allows to write nicer, more succinct code in
several cases, so let's do that here.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:36 -04:00
brian m. carlson 702d8c1f3b Require Perl 5.26.0
Our platform support policy states that we require "versions of
dependencies which are generally accepted as stable and supportable,
e.g., in line with the version used by other long-term-support
distributions".  Of Debian, Ubuntu, RHEL, and SLES, the four most common
distributions that provide LTS versions, the version with mainstream
long-term security support with the oldest Perl is 5.26.0 in SLES 15.6.

This is a major upgrade, since Perl 5.8.1, according to the Perl
documentation, was released in September of 2003.  It brings a lot of
new features that we can choose to use, such as s///r to return the
modified string, the postderef functionality, and subroutine signatures,
although the latter was still considered experimental until 5.36.

This change was made with the following one-liner, which intentionally
excludes modifying the vendored modules we include to avoid conflicts:

    git grep -l 'use 5.008001' | grep -v 'LoadCPAN/' | xargs perl -pi -e 's/use 5.008001/require v5.26/'

Use require instead of use to avoid changing the behavior as the latter
enables features and the former does not.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:36 -04:00
brian m. carlson 7bae4e7f58 INSTALL: document requirement for libcurl 7.61.0
Our platform support policy states that we require "versions of
dependencies which are generally accepted as stable and supportable,
e.g., in line with the version used by other long-term-support
distributions".  Of Debian, Ubuntu, and RHEL, the three most common
distributions that provide LTS versions, the version with mainstream
long-term security support with the oldest libcurl is 7.61.0 in RHEL 8.

Update the documentation to state that this is the new base version for
libcurl.  Remove text that is no longer applicable to older versions.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
brian m. carlson 603cf3e942 git-curl-compat: remove check for curl 7.56.0
libcurl 7.56.0 was released in September 2017, which is over seven years
ago, and no major operating system vendor is still providing security
support for it.  Debian 10, which is out of mainstream security support,
has supported a newer version, and Ubuntu 20.04 and RHEL 8, which are
still in support, also have a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
brian m. carlson d2f078c341 git-curl-compat: remove check for curl 7.53.0
libcurl 7.53.0 was released in February 2017, which is over seven years
ago, and no major operating system vendor is still providing security
support for it.  Debian 10 and Ubuntu 18.04, both of which are out of
mainstream security support, have supported a newer version, and RHEL 8,
which is still in support, also has a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
brian m. carlson 17de6fd83b git-curl-compat: remove check for curl 7.52.0
libcurl 7.52.0 was released in August 2017, which is over seven years
ago, and no major operating system vendor is still providing security
support for it.  Debian 9 and Ubuntu 18.04, both of which are out of
mainstream security support, have supported a newer version, and RHEL 8,
which is still in support, also has a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
brian m. carlson 5c91da6d5b git-curl-compat: remove check for curl 7.44.0
libcurl 7.44.0 was released in August 2015, which is over nine years
ago, and no major operating system vendor is still providing security
support for it.  Debian 9 and Ubuntu 16.04, both of which are out of
mainstream security support, have supported a newer version, and RHEL 8,
which is still in support, also has a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
brian m. carlson f47a1faa9b git-curl-compat: remove check for curl 7.43.0
libcurl 7.43.0 was released in June 2015, which is over nine years
ago, and no major operating system vendor is still providing security
support for it.  Debian 9 and Ubuntu 16.04, both of which are out of
mainstream security support, have supported a newer version, and RHEL 8,
which is still in support, also has a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
brian m. carlson 05dd4ec507 git-curl-compat: remove check for curl 7.39.0
libcurl 7.39.0 was released in November 2014, which is almost ten years
ago, and no major operating system vendor is still providing security
support for it.  Debian 9 and Ubuntu 16.04, both of which are out of
mainstream security support, have supported a newer version, and RHEL 8,
which is still in support, also has a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
brian m. carlson 6545b26eeb git-curl-compat: remove check for curl 7.34.0
libcurl 7.34.0 was released in December 2013, which is well over ten
years ago, and no major operating system vendor is still providing
security support for it.  Debian 8 and Ubuntu 14.04, both of which are
out of mainstream security support, have supported a newer version, and
RHEL 8, which is still in support, also has a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
brian m. carlson f7c094060c git-curl-compat: remove check for curl 7.25.0
libcurl 7.25.0 was released in March 2012, which is well over ten years
ago, and no major operating system vendor is still providing security
support for it.  Debian 8, RHEL 7, and Ubuntu 12.10, all of which are
out of mainstream security support, have all supported a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
brian m. carlson 8bf7f9e1ff git-curl-compat: remove check for curl 7.21.5
libcurl 7.21.5 was released in April 2011, which is well over ten years
ago, and no major operating system vendor is still providing security
support for it.  Debian 7, RHEL 7, and Ubuntu 12.04, all of which are
out of mainstream security support, have all supported a newer version.

Remove the check for this version and use this functionality
unconditionally.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 16:16:35 -04:00
Seyi Kuforiji 09bf122507 t9101: ensure no whitespace after redirect
This change updates the script to conform to the coding
standards outlined in the Git project's documentation. According to the
guidelines in Documentation/CodingGuidelines under "Redirection
operators", there should be no whitespace after redirection operators.

Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23 14:57:32 -04:00
Seyi Kuforiji 91687cd13f t7011: ensure no whitespace after redirect
This change updates the script to conform to the coding standards
outlined in the Git project's documentation. According to the guidelines
in Documentation/CodingGuidelines under "Redirection operators", there
should be no whitespace after redirection operators.

Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22 15:37:59 -04:00
Taylor Blau fd3785337b The third batch
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22 14:43:46 -04:00
Taylor Blau 8e08668322 Merge branch 'cw/worktree-relative'
An extra worktree attached to a repository points at each other to
allow finding the repository from the worktree and vice versa
possible.  Turn this linkage to relative paths.

* cw/worktree-relative:
  worktree: add test for path handling in linked worktrees
  worktree: link worktrees with relative paths
  worktree: refactor infer_backlink() to use *strbuf
  worktree: repair copied repository and linked worktrees
2024-10-22 14:40:39 -04:00
Taylor Blau 6ca9a05e63 Merge branch 'ps/cache-tree-w-broken-index-entry'
Fail gracefully instead of crashing when attempting to write the
contents of a corrupt in-core index as a tree object.

* ps/cache-tree-w-broken-index-entry:
  unpack-trees: detect mismatching number of cache-tree/index entries
  cache-tree: detect mismatching number of index entries
  cache-tree: refactor verification to return error codes
2024-10-22 14:40:38 -04:00
Caleb White 19f5ce0bc2 doc: consolidate extensions in git-config documentation
The `technical/repository-version.txt` document originally served as the
master list for extensions, requiring that any new extensions be defined
there. However, the `config/extensions.txt` file was introduced later
and has since become the de facto location for describing extensions,
with several extensions listed there but missing from
`repository-version.txt`.

This consolidates all extension definitions into `config/extensions.txt`,
making it the authoritative source for extensions. The references in
`repository-version.txt` are updated to point to `config/extensions.txt`,
and cross-references to related documentation such as
`gitrepository-layout[5]` and `git-config[1]` are added.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Caleb White <cdwhite3@pm.me>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22 12:49:32 -04:00
Chizoba ODINAKA 9e362dd060 t6050: avoid pipes with upstream Git commands
In pipes, the exit code of a chain of commands is determined by
the final command. In order not to miss the exit code of a failed
Git command, avoid pipes instead write output of Git commands
into a file.
For better debugging experience, instances of "grep" were changed
to "test_grep". "test_grep" provides more context in case of a
failed "grep".

Signed-off-by: Chizoba ODINAKA <chizobajames21@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22 12:47:27 -04:00
René Scharfe ce025ae4f6 grep: disable lookahead on error
regexec(3) can fail.  E.g. on macOS it fails if it is used with an UTF-8
locale to match a valid regex against a buffer containing invalid UTF-8
characters.

git grep has two ways to search for matches in a file: Either it splits
its contents into lines and matches them separately, or it matches the
whole content and figures out line boundaries later.  The latter is done
by look_ahead() and it's quicker in the common case where most files
don't contain a match.

Fall back to line-by-line matching if look_ahead() encounters an
regexec(3) error by propagating errors out of patmatch() and bailing out
of look_ahead() if there is one.  This way we at least can find matches
in lines that contain only valid characters.  That matches the behavior
of grep(1) on macOS.

pcre2match() dies if pcre2_jit_match() or pcre2_match() fail, but since
we use the flag PCRE2_MATCH_INVALID_UTF it handles invalid UTF-8
characters gracefully.  So implement the fall-back only for regexec(3)
and leave the PCRE2 matching unchanged.

Reported-by: David Gstir <david@sigma-star.at>
Signed-off-by: René Scharfe <l.s.r@web.de>
Tested-by: David Gstir <david@sigma-star.at>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22 12:45:49 -04:00
Andrew Kreimer c348192afe t1016: clean up style
Use `test_config`.

Remove whitespace after redirect operator.

Reported-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22 12:35:05 -04:00
Kousik Sanagavarapu a73070fbd4 t4205: fix typo in 'NUL termination with --stat'
Correct "expected" to rightly terminate with NUL ie '\0' instead of '0'
which may have been typoed.

We didn't notice this before because the test is run with
"test_expect_failure", meaning the test would have been marked broken
anyways.

Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 17:37:11 -04:00
Kristoffer Haugsbakk 52acf6771b SubmittingPatches: tags -> trailers
“Trailer” is the preferred nomenclature in this project.  Also add a
definite article where I think it makes sense.

As we can see the rest of the document already prefers this term.  This
just gets rid of the last stragglers.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 17:23:00 -04:00
Patrick Steinhardt 30bf9f0aaa cmake: set up proper dependencies for generated clar headers
The auto-generated headers used by clar are written at configure time
and thus do not get regenerated automatically. Refactor the build
recipes such that we use custom commands instead, which also has the
benefit that we can reuse the same infrastructure as our Makefile.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:53:07 -04:00
Patrick Steinhardt a4f8a59ddc cmake: fix compilation of clar-based unit tests
The compilation of clar-based unit tests is broken because we do not
add the binary directory into which we generate the "clar-decls.h" and
"clar.suite" files as include directories. Instead, we accidentally set
up the source directory as include directory.

Fix this by including the binary directory instead of the source
directory. Furthermore, set up the include directories as PUBLIC instead
of PRIVATE such that they propagate from "unit-tests.lib" to the
"unit-tests" executable, which needs to include the same directory.

Reported-by: Ed Reel <edreel@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:53:07 -04:00
Patrick Steinhardt 67f75dfe1b Makefile: extract script to generate clar declarations
Extract the script to generate function declarations for the clar unit
testing framework into a standalone script. This is done such that we
can reuse it in other build systems.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:53:07 -04:00
Alejandro R. Sedeño a779c8e8d5 Makefile: adjust sed command for generating "clar-decls.h"
This moves the end-of-line marker out of the captured group, matching
the start-of-line marker and for some reason fixing generation of
"clar-decls.h" on some older, more esoteric platforms.

Signed-off-by: Alejandro R. Sedeño <asedeno@mit.edu>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:53:07 -04:00
Patrick Steinhardt 7d5f18a901 t/unit-tests: update clar to 206accb
Update clar from:

    - 1516124 (Merge pull request #97 from pks-t/pks-whitespace-fixes, 2024-08-15).

To:

    - 206accb (Merge pull request #108 from pks-t/pks-uclibc-without-wchar, 2024-10-21)

This update includes a bunch of fixes and improvements that we have
discussed in Git when initial support for clar was merged:

  - There is a ".editorconfig" file now.

  - Compatibility with Windows has been improved so that the clar
    compiles on this platform without an issue. This has been tested
    with Cygwin, MinGW and Microsoft Visual Studio.

  - clar now uses CMake. This does not impact us at all as we wire up
    the clar into our own build infrastructure anyway. This conversion
    was done such that we can easily run CI jobs against Windows.

  - Allocation failures are now checked for consistently.

  - We now define feature test macros in "clar.c", which fixes
    compilation on some platforms that didn't previously pull in
    non-standard functions like lstat(3p) or strdup(3p). This was
    reported by a user of OpenSUSE Leap.

  - We stop using `struct timezone`, which is undefined behaviour
    nowadays and results in a compilation error on some platforms.

  - We now use the combination of mktemp(3) and mkdir(3) on SunOS, same
    as we do on NonStop.

  - We now support uClibc without support for <wchar.h>.

The most important bits here are the improved platform compatibility
with Windows, OpenSUSE, SunOS and uClibc.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:53:07 -04:00
Kristoffer Haugsbakk 18b0b6c690 Documentation: mutually link update-ref and symbolic-ref
These two commands are similar enough to acknowledge each other on their
documentation pages.

See the previous commit where we discussed that option-less update-ref
does not support updating symbolic refs but symbolic-ref does.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:49:31 -04:00
Kristoffer Haugsbakk 74522b6b12 Documentation/git-update-ref.txt: discuss symbolic refs
Add a paragraph which just emphasizes that the command without any
options does not support refs in the final arguments.  This is clear
already from the names `<new-oid>` and `<old-oid>` but the right balance
of redundancy makes documentation robust against stray interpretation.

This is also a good place to mention why `--stdin` has those `symref-*`
commands.

Suggested-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:49:31 -04:00
Kristoffer Haugsbakk 793e308f1e Documentation/git-update-ref.txt: remove confusing paragraph
This paragraph interrupts the flow of the section by going into detail
about what a symbolic ref file is and how it is implemented.  It is not
clear what the purpose is since symbolic refs were already mentioned
prior (“possibly dereferencing the symbolic refs”).  Worse, it can
confuse the reader about what argument can be a symbolic ref since it
just says “it” and not which of the parameters; in turn the reader can
be lead to try `<new-oid>` and then get a confusing error since
update-ref will just say that it is not a valid SHA1.

gitglossary(7) already documents what a symref is, concretely, and quite
well at that.

Reported-by: Bence Ferdinandy <bence@ferdinandy.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21 16:49:31 -04:00