Commit Graph

71889 Commits (v2.43.6)

Author SHA1 Message Date
Michael Strawbridge 0d8647034e send-email: move validation code below process_address_list
Move validation logic below processing of email address lists so that
email validation gets the proper email addresses.  As a side effect,
some initialization needed to be moved down.  In order for validation
and the actual email sending to have the same initial state, the
initialized variables that get modified by pre_process_file are
encapsulated in a new function.

This fixes email address validation errors when the optional
perl module Email::Valid is installed and multiple addresses are passed
in on a single to/cc argument like --to=foo@example.com,bar@example.com.
A new test was added to t9001 to expose failures with this case in the
future.

Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-26 21:46:10 +09:00
Andrei Rybak d15b85391a SubmittingPatches: call gitk's command "Copy commit reference"
Documentation/SubmittingPatches informs the contributor that gitk's
context menu command "Copy commit summary" can be used to obtain the
conventional format of referencing existing commits.  This command in
gitk was renamed to "Copy commit reference" in commit [1], following
implementation of Git's "reference" pretty format in [2].

Update mention of this gitk command in Documentation/SubmittingPatches
to its new name.

[1] b8b60957ce (gitk: rename "commit summary" to "commit reference",
    2019-12-12)
[2] commit 1f0fc1d (pretty: implement 'reference' format, 2019-11-20)

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-24 15:27:23 -07:00
Junio C Hamano 2e8e77cbac The twenty-first batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-23 13:56:38 -07:00
Junio C Hamano d12166d3c8 Merge branch 'en/docfixes'
Documentation typo and grammo fixes.

* en/docfixes: (25 commits)
  documentation: add missing parenthesis
  documentation: add missing quotes
  documentation: add missing fullstops
  documentation: add some commas where they are helpful
  documentation: fix whitespace issues
  documentation: fix capitalization
  documentation: fix punctuation
  documentation: use clearer prepositions
  documentation: add missing hyphens
  documentation: remove unnecessary hyphens
  documentation: add missing article
  documentation: fix choice of article
  documentation: whitespace is already generally plural
  documentation: fix singular vs. plural
  documentation: fix verb vs. noun
  documentation: fix adjective vs. noun
  documentation: fix verb tense
  documentation: employ consistent verb tense for a list
  documentation: fix subject/verb agreement
  documentation: remove extraneous words
  ...
2023-10-23 13:56:37 -07:00
Junio C Hamano 5edbcead42 Merge branch 'bc/racy-4gb-files'
The index file has room only for lower 32-bit of the file size in
the cached stat information, which means cached stat information
will have 0 in its sd_size member for a file whose size is multiple
of 4GiB.  This is mistaken for a racily clean path.  Avoid it by
storing a bogus sd_size value instead for such files.

* bc/racy-4gb-files:
  Prevent git from rehashing 4GiB files
  t: add a test helper to truncate files
2023-10-23 13:56:37 -07:00
Junio C Hamano 626f689f79 Merge branch 'jc/fail-stash-to-store-non-stash'
Feeding "git stash store" with a random commit that was not created
by "git stash create" now errors out.

* jc/fail-stash-to-store-non-stash:
  stash: be careful what we store
2023-10-23 13:56:37 -07:00
Junio C Hamano 755fb09163 Merge branch 'so/diff-merges-dd'
"git log" and friends learned "--dd" that is a short-hand for
"--diff-merges=first-parent -p".

* so/diff-merges-dd:
  completion: complete '--dd'
  diff-merges: introduce '--dd' option
  diff-merges: improve --diff-merges documentation
2023-10-23 13:56:37 -07:00
Junio C Hamano f32af12cee Merge branch 'jk/chunk-bounds'
The codepaths that read "chunk" formatted files have been corrected
to pay attention to the chunk size and notice broken files.

* jk/chunk-bounds: (21 commits)
  t5319: make corrupted large-offset test more robust
  chunk-format: drop pair_chunk_unsafe()
  commit-graph: detect out-of-order BIDX offsets
  commit-graph: check bounds when accessing BIDX chunk
  commit-graph: check bounds when accessing BDAT chunk
  commit-graph: bounds-check generation overflow chunk
  commit-graph: check size of generations chunk
  commit-graph: bounds-check base graphs chunk
  commit-graph: detect out-of-bounds extra-edges pointers
  commit-graph: check size of commit data chunk
  midx: check size of revindex chunk
  midx: bounds-check large offset chunk
  midx: check size of object offset chunk
  midx: enforce chunk alignment on reading
  midx: check size of pack names chunk
  commit-graph: check consistency of fanout table
  midx: check size of oid lookup chunk
  commit-graph: check size of oid fanout chunk
  midx: stop ignoring malformed oid fanout chunk
  t: add library for munging chunk-format files
  ...
2023-10-23 13:56:36 -07:00
Javier Mora 3f02785de9 doc/git-bisect: clarify `git bisect run` syntax
The description of the `git bisect run` command syntax at the beginning
of the manpage is `git bisect run <cmd>...`, which isn't quite clear
about what `<cmd>` is or what the `...` mean; one could think that it is
the whole (quoted) command line with all arguments in a single string,
or that it supports multiple commands, or that it doesn't accept
commands with arguments at all.

Change to `git bisect run <cmd> [<arg>...]` to clarify the syntax,
in both the manpage and the `git bisect -h` command output.

Additionally, change `--term-{new,bad}` et al to `--term-(new|bad)`
for consistency with the synopsis syntax conventions.

Signed-off-by: Javier Mora <cousteaulecommandant@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-23 13:04:47 -07:00
Isoken June Ibizugbe 12b99928c8 builtin/branch.c: adjust error messages to coding guidelines
As per the CodingGuidelines document, it is recommended that error messages
such as die(), error() and warning(), should start with a lowercase letter
and should not end with a period.

This patch adjusts tests to match updated messages.

Signed-off-by: Isoken June Ibizugbe <isokenjune@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-23 12:22:57 -07:00
王常新 243c79fdc7 merge-ort.c: fix typo 'neeed' to 'needed'
Signed-off-by: 王常新 <wchangxin824@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-21 23:13:49 -07:00
Junio C Hamano ceadf0f3cf The twentieth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 16:23:11 -07:00
Junio C Hamano 4835409be1 Merge branch 'ps/rewritten-is-per-worktree-doc'
Doc update.

* ps/rewritten-is-per-worktree-doc:
  doc/git-worktree: mention "refs/rewritten" as per-worktree refs
2023-10-20 16:23:11 -07:00
Junio C Hamano 92741d83c0 Merge branch 'ak/pretty-decorate-more-fix'
Unlike "git log --pretty=%D", "git log --pretty="%(decorate)" did
not auto-initialize the decoration subsystem, which has been
corrected.

* ak/pretty-decorate-more-fix:
  pretty: fix ref filtering for %(decorate) formats
2023-10-20 16:23:11 -07:00
Junio C Hamano 6b1e2254d6 Merge branch 'vd/loose-ref-iteration-optimization'
The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.

* vd/loose-ref-iteration-optimization:
  files-backend.c: avoid stat in 'loose_fill_ref_dir'
  dir.[ch]: add 'follow_symlink' arg to 'get_dtype'
  dir.[ch]: expose 'get_dtype'
  ref-cache.c: fix prefix matching in ref iteration
2023-10-20 16:23:11 -07:00
Junio C Hamano c662038629 Merge branch 'ty/merge-tree-strategy-options'
"git merge-tree" learned to take strategy backend specific options
via the "-X" option, like "git merge" does.

* ty/merge-tree-strategy-options:
  merge: introduce {copy|clear}_merge_options()
  merge-tree: add -X strategy option
2023-10-20 16:23:11 -07:00
Michal Suchanek f6d83e2115 git-push doc: more visibility for -q option
The "-v" option is shown in the SYNOPSIS section near the top, but
"-q" is not shown anywhere there.

List "-q" alongside "-v".

Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 15:13:38 -07:00
Oswald Buddenhagen 96db17352d rebase: move parse_opt_keep_empty() down
This moves it right next to parse_opt_empty(), which is a much more
logical place. As a side effect, this removes the need for a forward
declaration of imply_merge().

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:47:44 -07:00
Oswald Buddenhagen 37e80a2471 rebase: handle --strategy via imply_merge() as well
At least after the successive trimming of enum rebase_type mentioned in
the previous commit, this code did exactly what imply_merge() does, so
just call it instead.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:47:44 -07:00
Oswald Buddenhagen a5b5740bf6 rebase: simplify code related to imply_merge()
The code's evolution left in some bits surrounding enum rebase_type that
don't really make sense any more. In particular, it makes no sense to
invoke imply_merge() if the type is already known not to be
REBASE_APPLY, and it makes no sense to assign the type after calling
imply_merge().

enum rebase_type had more values until commit a74b35081c ("rebase: drop
support for `--preserve-merges`") and commit 10cdb9f38a ("rebase: rename
the two primary rebase backends"). The latter commit also renamed
imply_interactive() to imply_merge().

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:47:43 -07:00
Jeff King 3ec6167567 send-email: handle to/cc/bcc from --compose message
If the user writes a message via --compose, send-email will pick up
various headers like "From", "Subject", etc and use them for other
patches as if they were specified on the command-line. But we don't
handle "To", "Cc", or "Bcc" this way; we just tell the user "those
aren't interpeted yet" and ignore them.

But it seems like an obvious thing to want, especially as the same
feature exists when the cover letter is generated separately by
format-patch. There it is gated behind the --to-cover option, but I
don't think we'd need the same control here; since we generate the
--compose template ourselves based on the existing input, if the user
leaves the lines unchanged then the behavior remains the same.

So let's fill in the implementation; like those other headers we already
handle, we just need to assign to the initial_* variables. The only
difference in this case is that they are arrays, so we'll feed them
through parse_address_line() to split them (just like we would when
reading a single string via prompting).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:31:39 -07:00
Jeff King 637e8944a1 Revert "send-email: extract email-parsing code into a subroutine"
This reverts commit b6049542b9.

Prior to that commit, we read the results of the user editing the
"--compose" message in a loop, picking out parts we cared about, and
streaming the result out to a ".final" file. That commit split the
reading/interpreting into two phases; we'd now read into a hash, and
then pick things out of the hash.

The goal was making the code more readable. And in some ways it did,
because the ugly regexes are confined to the reading phase. But it also
introduced several bugs, because now the two phases need to match each
other. In particular:

  - we pick out headers like "Subject: foo" with a case-insensitive
    regex, and then use the user-provided header name as the key in a
    case-sensitive hash. So if the user wrote "subject: foo", we'd no
    longer recognize it as a subject.

  - the namespace for the hash keys conflates header names with meta
    information like "body". If you put "body: foo" in your message, it
    would be misinterpreted as the actual message body (nobody is likely
    to do that in practice, but it seems like an unnecessary danger).

  - the handling for to/cc/bcc is totally broken. The behavior before
    that commit is to recognize and skip those headers, with a note to
    the user that they are not yet handled. Not great, but OK. But
    after the patch, the reading side now splits the addresses into a
    perl array-ref. But the interpreting side doesn't handle this at
    all, and blindly prints the stringified array-ref value. This leads
    to garbage like:

      (mbox) Adding to: ARRAY (0x555b4345c428) from line 'To: ARRAY(0x555b4345c428)'
      error: unable to extract a valid address from: ARRAY (0x555b4345c428)
      What to do with this address? ([q]uit|[d]rop|[e]dit):

    Probably not a huge deal, since nobody should even try to use those
    headers in the first place (since they were not implemented). But
    the new behavior is worse, and indicative of the sorts of problems
    that come from having the two layers.

The revert had a few conflicts, due to later work in this area from
15dc3b9161 (send-email: rename variable for clarity, 2018-03-04) and
d11c943c78 (send-email: support separate Reply-To address, 2018-03-04).
I've ported the changes from those commits over as part of the conflict
resolution.

The new tests show the bugs. Note the use of GIT_SEND_EMAIL_NOTTY in the
second one. Without it, the test is happy to reach outside the test
harness to the developer's actual terminal (when run with the buggy
state before this patch).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:31:32 -07:00
Jeff King e0c7e2c326 doc/send-email: mention handling of "reply-to" with --compose
The documentation for git-send-email lists the headers handled specially
by --compose in a way that implies that this is the complete set of
headers that are special. But one more was added by d11c943c78
(send-email: support separate Reply-To address, 2018-03-04) and never
documented.

Let's add it, and reword the documentation slightly to avoid having to
specify the list of headers twice (as it is growing and will continue to
do so as we add new features).

If you read the code, you may notice that we also handle MIME-Version
specially, in that we'll avoid over-writing user-provided MIME headers.
I don't think this is worth mentioning, as it's what you'd expect to
happen (as opposed to the other headers, which are picked up to be used
in later emails). And certainly this feature existed when the
documentation was expanded in 01d3861217 (git-send-email.txt: describe
--compose better, 2009-03-16), and we chose not to mention it then.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:31:30 -07:00
Linus Arver 7cb26a1722 commit: ignore_non_trailer computes number of bytes to ignore
ignore_non_trailer() returns the _number of bytes_ that should be
ignored from the end of the log message. It does not by itself "ignore"
anything.

Rename this function to remove the leading "ignore" verb, to sound more
like a quantity than an action.

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 14:25:12 -07:00
Kristoffer Haugsbakk b1688ea02d grep: die gracefully when outside repository
Die gracefully when `git grep --no-index` is run outside of a Git
repository and the path is outside the directory tree.

If you are not in a Git repository and say:

    git grep --no-index search ..

You trigger a `BUG`:

    BUG: environment.c:213: git environment hasn't been setup
    Aborted (core dumped)

Because `..` is a valid path which is treated as a pathspec. Then
`pathspec` figures out that it is not in the current directory tree. The
`BUG` is triggered when `pathspec` tries to advise the user about how the
path is not in the current (non-existing) repository.

Reported-by: ks1322 ks1322 <ks1322@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20 11:06:45 -07:00
Matthew McClain 10c89a02b0 git-p4 shouldn't attempt to store symlinks in LFS
git-p4.py would attempt to put a symlink in LFS if its file extension
matched git-p4.largeFileExtensions.

Git LFS doesn't store symlinks because smudge/clean filters don't handle
symlinks. They never get passed to the filter process nor the
smudge/clean filters, nor could that occur without a change to the
protocol or command-line interface. Unless Git learned how to send them
to the filters, Git LFS would have a hard time using them in any useful
way.

Git LFS's goal is to move large files out of the repository history, and
symlinks are functionally limited to 4 KiB or a similar size on most
systems.

Signed-off-by: Matthew McClain <mmcclain@noprivs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-19 10:57:44 -07:00
Dorcas AnonoLitunya 5abb758118 t7601: use "test_path_is_file" etc. instead of "test -f"
Some tests in t7601 use "test -f" and "test ! -f" to see if a path
exists or is missing.

Use test_path_is_file and test_path_is_missing helper functions to
clarify these tests a bit better. This especially matters for the
"missing" case because "test ! -f F" will be happy if "F" exists as a
directory, but the intent of the test is that "F" should not exist, even
as a directory. The updated code expresses this better.

Signed-off-by: Dorcas AnonoLitunya <anonolitunya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-18 16:57:49 -07:00
Junio C Hamano 14d569b1a7 am: align placeholder for --whitespace option with apply
`git am` passes the value given to its `--whitespace` option through
to the underlying `git apply`, and the value is called <action> over
there.  Fix the documentation for the command that calls the value
<option> to say <action> instead.

Note that the option help given by `git am -h` already calls the
value <action>, so there is no need to make a matching change there.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-18 16:35:44 -07:00
Junio C Hamano 813d9a9188 The nineteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-18 13:25:42 -07:00
Junio C Hamano 7906b5c957 Merge branch 'jc/merge-ort-attr-index-fix'
Fix "git merge-tree" to stop segfaulting when the --attr-source
option is used.

* jc/merge-ort-attr-index-fix:
  merge-ort: initialize repo in index state
2023-10-18 13:25:42 -07:00
Junio C Hamano cc7d7183f0 Merge branch 'sn/cat-file-doc-update'
"git cat-file" documentation updates.

* sn/cat-file-doc-update:
  doc/cat-file: make synopsis and description less confusing
2023-10-18 13:25:41 -07:00
Junio C Hamano 0bc6bff9d5 Merge branch 'xz/commit-title-soft-limit-doc'
Doc update.

* xz/commit-title-soft-limit-doc:
  doc: correct the 50 characters soft limit (+)
2023-10-18 13:25:41 -07:00
Junio C Hamano 79861babe2 Merge branch 'tb/repack-max-cruft-size'
"git repack" learned "--max-cruft-size" to prevent cruft packs from
growing without bounds.

* tb/repack-max-cruft-size:
  repack: free existing_cruft array after use
  builtin/repack.c: avoid making cruft packs preferred
  builtin/repack.c: implement support for `--max-cruft-size`
  builtin/repack.c: parse `--max-pack-size` with OPT_MAGNITUDE
  t7700: split cruft-related tests to t7704
2023-10-18 13:25:41 -07:00
Junio C Hamano a060705d94 commit: do not use cryptic "new_index" in end-user facing messages
These error messages say "new_index" as if that spelling has some
significance to the end users (e.g. the file "$GIT_DIR/new_index"
has some issues), but that is not the case at all.  The i18n folks
were made to include the word literally in the translated messages,
which was not a good idea at all.  Spell it "new index", as we are
just telling the users that we failed to create a new index file.
The term is expected to be translated to the end-users' languages,
not left as if it were a literal file name.

This dates all the way back to the first re-implemenation of "git
commit" command in C (the scripted version did not have such wording
in its error messages), in f5bbc322 (Port git commit to C.,
2007-11-08).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-17 22:09:54 -07:00
Naomi Ibe 48399e9cf0 builtin/add.c: clean up die() messages
As described in the CodingGuidelines document, a single line message
given to die() and its friends should not capitalize its first word,
and should not add full-stop at the end.

Signed-off-by: Naomi Ibe <naomi.ibeh69@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-17 12:41:55 -07:00
Rubén Justo 990adccbdf status: fix branch shown when not only bisecting
In 83c750acde (wt-status.*: better advice for git status added,
2012-06-05), git-status received new informative messages to describe
the ongoing work in a worktree.

These messages were enhanced in 0722c805d6 (status: show the branch name
if possible in in-progress info, 2013-02-03), to show, if possible, the
branch where the operation was initiated.

Since then, we show incorrect information when several operations are in
progress and one of them is bisect:

   $ git checkout -b foo
   $ GIT_SEQUENCE_EDITOR='echo break >' git rebase -i HEAD~
   $ git checkout -b bar
   $ git bisect start
   $ git status
   ...

   You are currently editing a commit while rebasing branch 'bar' on '...'.

   You are currently bisecting, started from branch 'bar'.

   ...

Note that we erroneously say "while rebasing branch 'bar'" when we
should be referring to "foo".

This must have gone unnoticed for so long because it must be unusual to
start a bisection while another operation is in progress.  And even less
usual to involve different branches.

It caught my attention reviewing a leak introduced in 8b87cfd000
(wt-status: move strbuf into read_and_strip_branch(), 2013-03-16).

A simple change to deal with this situation can be to record in struct
wt_status_state, the branch where the bisect starts separately from the
branch related to other operations.

Let's do it and so we'll be able to display correct information and
we'll avoid the leak as well.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-16 15:05:27 -07:00
Patrick Steinhardt ca3285dd69 doc/git-repack: don't mention nonexistent "--unpacked" option
The documentation for geometric repacking mentions a "--unpacked" option
that supposedly changes how loose objects are rolled up. This option has
never existed, and the implied behaviour, namely to include all unpacked
objects into the resulting packfile, is in fact the default behaviour.

Correct the documentation to not mention this option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-16 14:21:59 -07:00
Patrick Steinhardt e9cc3a027b doc/git-repack: fix syntax for `-g` shorthand option
The `-g` switch is a shorthand for `--geometric=` and allows the user to
specify the geometric. The documentation is wrong though and indicates
that the syntax for the shorthand is `-g=<factor>`. In fact though, the
option must be specified without the equals sign via `-g<factor>`.

Fix the syntax accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-16 14:21:59 -07:00
Jeff King 7538f9d89b t5319: make corrupted large-offset test more robust
The test t5319.88 ("reader bounds-checks large offset table") can fail
intermittently. The failure mode looks like this:

  1. An earlier test sets up "objects64", a directory that can be used
     to produce a midx with a corrupted large-offsets table. To get the
     large offsets, it corrupts the normal ".idx" file to have a fake
     large offset, and then builds a midx from that.

     That midx now has a large offset table, which is what we want. But
     we also have a .idx on disk that has a corrupted entry. We'll call
     the object with the corrupted large-offset "X".

  2. In t5319.88, we further corrupt the midx by reducing the size of
     the large-offset chunk (because our goal is to make sure we do not
     do an out-of-bounds read on it).

  3. We then enumerate all of the objects with "cat-file --batch-check
     --batch-all-objects", expecting to see a complaint when we try to
     show object X. We use --batch-all-objects because our objects64
     repo doesn't actually have any refs (but if we check them all, one
     of them will be the failing one). The default batch-check format
     includes %(objecttype) and %(objectsize), both of which require us
     to access the actual pack data (and thus requires looking at the
     offset).

  4a. Usually, this succeeds. We try to output object X, do a lookup via
      the midx for the type/size lookup, and run into the corrupt
      large-offset table.

  4b. But sometimes we hit a different error. If another object points
      to X as a delta base, then trying to find the type of that object
      requires walking the delta chain to the base entry (since only the
      base has the concrete type; deltas themselves are either OFS_DELTA
      or REF_DELTA).

      Normally this would not require separate offset lookups at all, as
      deltas are usually stored as OFS_DELTA, specifying the relative
      offset to the base. But the corrupt idx created in step 1 is done
      directly with "git pack-objects" and does not pass the
      --delta-base-offset option, meaning we have REF_DELTA entries!
      Those do have to consult an index to find the location of the base
      object, and they use the pack .idx to do this. The same pack .idx
      that we know is corrupted from step 1!

      Git does notice the error, but it does so by seeing the corrupt
      .idx file, not the corrupt midx file, and the error it reports is
      different, causing the test to fail.

The set of objects created in the test is deterministic. But the delta
selection seems not to be (which is not too surprising, as it is
multi-threaded). I have seen the failure in Windows CI but haven't
reproduced it locally (not even with --stress). Re-running a failed
Windows CI job tends to work. But when I download and examine the trash
directory from a failed run, it shows a different set of deltas than I
get locally. But the exact source of non-determinism isn't that
important; our test should be robust against any order.

There are a few options to fix this:

  a. It would be OK for the "objects64" setup to "unbreak" the .idx file
     after generating the midx. But then it would be hard for subsequent
     tests to reuse it, since it is the corrupted idx that forces the
     midx to have a large offset table.

  b. The "objects64" setup could use --delta-base-offset. This would fix
     our problem, but earlier tests have many hard-coded offsets. Using
     OFS_DELTA would change the locations of objects in the pack (this
     might even be OK because I think most of the offsets are within the
     .idx file, but it seems brittle and I'm afraid to touch it).

  c. Our cat-file output is in oid order by default. Since we store
     bases before deltas, if we went in pack order (using the
     "--unordered" flag), we'd always see our corrupt X before any delta
     which depends on it. But using "--unordered" means we skip the midx
     entirely. That makes sense, since it is just enumerating all of
     the packs, using the offsets found in their .idx files directly.
     So it doesn't work for our test.

  d. We could ask directly about object X, rather than enumerating all
     of them. But that requires further hard-coding of the oid (both
     sha1 and sha256) of object X. I'd prefer not to introduce more
     brittleness.

  e. We can use a --batch-check format that looks at the pack data, but
     doesn't have to chase deltas. The problem in this case is
     %(objecttype), which has to walk to the base. But %(objectsize)
     does not; we can get the value directly from the delta itself.
     Another option would be %(deltabase), where we report the REF_DELTA
     name but don't look at its data.

I've gone with option (e) here. It's kind of subtle, but it's simple and
has no side effects.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-14 10:17:25 -07:00
Junio C Hamano a9ecda2788 The eighteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-13 14:18:29 -07:00
Junio C Hamano 2920971a7f Merge branch 'jk/decoration-and-other-leak-fixes'
Leakfix.

* jk/decoration-and-other-leak-fixes:
  daemon: free listen_addr before returning
  revision: clear decoration structs during release_revisions()
  decorate: add clear_decoration() function
2023-10-13 14:18:28 -07:00
Junio C Hamano 09dcbb486d Merge branch 'ar/diff-index-merge-base-fix'
"git diff --merge-base X other args..." insisted that X must be a
commit and errored out when given an annotated tag that peels to a
commit, but we only need it to be a committish.  This has been
corrected.

* ar/diff-index-merge-base-fix:
  diff: fix --merge-base with annotated tags
2023-10-13 14:18:28 -07:00
Junio C Hamano b32f5b6b34 Merge branch 'js/submodule-fix-misuse-of-path-and-name'
In .gitmodules files, submodules are keyed by their names, and the
path to the submodule whose name is $name is specified by the
submodule.$name.path variable.  There were a few codepaths that
mixed the name and path up when consulting the submodule database,
which have been corrected.  It took long for these bugs to be found
as the name of a submodule initially is the same as its path, and
the problem does not surface until it is moved to a different path,
which apparently happens very rarely.

* js/submodule-fix-misuse-of-path-and-name:
  t7420: test that we correctly handle renamed submodules
  t7419: test that we correctly handle renamed submodules
  t7419, t7420: use test_cmp_config instead of grepping .gitmodules
  t7419: actually test the branch switching
  submodule--helper: return error from set-url when modifying failed
  submodule--helper: use submodule_from_path in set-{url,branch}
2023-10-13 14:18:28 -07:00
Junio C Hamano a45eddec40 Merge branch 'jk/commit-graph-leak-fixes'
Leakfix.

* jk/commit-graph-leak-fixes:
  commit-graph: clear oidset after finishing write
  commit-graph: free write-context base_graph_name during cleanup
  commit-graph: free write-context entries before overwriting
  commit-graph: free graph struct that was not added to chain
  commit-graph: delay base_graph assignment in add_graph_to_chain()
  commit-graph: free all elements of graph chain
  commit-graph: move slab-clearing to close_commit_graph()
  merge: free result of repo_get_merge_bases()
  commit-reach: free temporary list in get_octopus_merge_bases()
  t6700: mark test as leak-free
2023-10-13 14:18:28 -07:00
Junio C Hamano c75e91499b Merge branch 'la/trailer-test-and-doc-updates'
Test coverage for trailers has been improved.

* la/trailer-test-and-doc-updates:
  trailer doc: <token> is a <key> or <keyAlias>, not both
  trailer doc: separator within key suppresses default separator
  trailer doc: emphasize the effect of configuration variables
  trailer --unfold help: prefer "reformat" over "join"
  trailer --parse docs: add explanation for its usefulness
  trailer --only-input: prefer "configuration variables" over "rules"
  trailer --parse help: expose aliased options
  trailer --no-divider help: describe usual "---" meaning
  trailer: trailer location is a place, not an action
  trailer doc: narrow down scope of --where and related flags
  trailer: add tests to check defaulting behavior with --no-* flags
  trailer test description: this tests --where=after, not --where=before
  trailer tests: make test cases self-contained
2023-10-13 14:18:27 -07:00
Junio C Hamano e56b9edf22 Merge branch 'ds/mailmap-entry-update'
Update mailmap entry for Derrick.

* ds/mailmap-entry-update:
  mailmap: change primary address for Derrick Stolee
2023-10-13 14:18:27 -07:00
Jason Hatton 5143ac07b1 Prevent git from rehashing 4GiB files
The index stores file sizes using a uint32_t. This causes any file
that is a multiple of 2^32 to have a cached file size of zero.
Zero is a special value used by racily clean. This causes git to
rehash every file that is a multiple of 2^32 every time git status
or git commit is run.

This patch mitigates the problem by making all files that are a
multiple of 2^32 appear to have a size of 1<<31 instead of zero.

The value of 1<<31 is chosen to keep it as far away from zero
as possible to help prevent things getting mixed up with unpatched
versions of git.

An example would be to have a 2^32 sized file in the index of
patched git. Patched git would save the file as 2^31 in the cache.
An unpatched git would very much see the file has changed in size
and force it to rehash the file, which is safe. The file would
have to grow or shrink by exactly 2^31 and retain all of its
ctime, mtime, and other attributes for old git to not notice
the change.

This patch does not change the behavior of any file that is not
an exact multiple of 2^32.

Signed-off-by: Jason D. Hatton <jhatton@globalfinishing.com>
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-13 13:33:35 -07:00
brian m. carlson 678eb55f5d t: add a test helper to truncate files
In a future commit, we're going to work with some large files which will
be at least 4 GiB in size.  To take advantage of the sparseness
functionality on most Unix systems and avoid running the system out of
disk, it would be convenient to use truncate(2) to simply create a
sparse file of sufficient size.

However, the GNU truncate(1) utility isn't portable, so let's write a
tiny test helper that does the work for us.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-13 13:33:35 -07:00
John Cai 9f9c40cf34 attr: add attr.tree for setting the treeish to read attributes from
44451a2 (attr: teach "--attr-source=<tree>" global option to "git",
2023-05-06) provided the ability to pass in a treeish as the attr
source. In the context of serving Git repositories as bare repos like we
do at GitLab however, it would be easier to point --attr-source to HEAD
for all commands by setting it once.

Add a new config attr.tree that allows this.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-13 11:43:29 -07:00
John Cai 2386535511 attr: read attributes from HEAD when bare repo
The motivation for 44451a2e5e (attr: teach "--attr-source=<tree>" global
option to "git" , 2023-05-06), was to make it possible to use
gitattributes with bare repositories.

To make it easier to read gitattributes in bare repositories however,
let's just make HEAD:.gitattributes the default. This is in line with
how mailmap works, 8c473cecfd (mailmap: default mailmap.blob in bare
repositories, 2012-12-13).

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-13 11:43:29 -07:00