Commit Graph

74087 Commits (f037e607a875790dcadbba3f7a8fc185e2877792)

Author SHA1 Message Date
Rubén Justo 03930f93c4 t0612: mark as leak-free
A quick test tells us that t0612 does not trigger any leak:

    $ make SANITIZE=leak test GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_SANITIZE_LEAK_LOG=true GIT_TEST_OPTS=-i T=t0612-reftable-jgit-compatibility.sh
    [...]
    *** t0612-reftable-jgit-compatibility.sh ***
    in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
    ok 1 - CGit repository can be read by JGit
    ok 2 - JGit repository can be read by CGit
    ok 3 - mixed writes from JGit and CGit
    ok 4 - JGit can read multi-level index
    # passed all 4 test(s)
    1..4
    # faking up non-zero exit with --invert-exit-code
    make[2]: *** [Makefile:75: t0612-reftable-jgit-compatibility.sh] Error 1

Let's mark it as leak-free to silence the machinery activated by
`GIT_TEST_PASSING_SANITIZE_LEAK=check`.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 15:11:05 -07:00
Rubén Justo 47c6d4dad2 test-lib: fix GIT_TEST_SANITIZE_LEAK_LOG
When a test that leaks runs with GIT_TEST_SANITIZE_LEAK_LOG=true,
the test returns zero, which is not what we want.

In the if-else's chain we have in "check_test_results_san_file_", we
consider three variables: $passes_sanitize_leak, $sanitize_leak_check
and, implicitly, GIT_TEST_SANITIZE_LEAK_LOG (always set to "true" at
that point).

For the first two variables we have different considerations depending
on the value of $test_failure, which makes sense.  However, for the
third, GIT_TEST_SANITIZE_LEAK_LOG, we don't;  regardless of
$test_failure, we use "invert_exit_code=t" to produce a non-zero
return value.

That assumes "$test_failure" is always zero at that point.  But it
may not be:

   $ git checkout v2.40.1
   $ make test SANITIZE=leak T=t3200-branch.sh # this fails
   $ make test SANITIZE=leak GIT_TEST_SANITIZE_LEAK_LOG=true T=t3200-branch.sh # this succeeds
   [...]
   With GIT_TEST_SANITIZE_LEAK_LOG=true, our logs revealed a memory leak, exiting with a non-zero status!
   # faked up failures as TODO & now exiting with 0 due to --invert-exit-code

We need to use "invert_exit_code=t" only when "$test_failure" is zero.

Let's add the missing conditions in the if-else's chain to make it work
as expected.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 15:09:07 -07:00
Rubén Justo d0b38a27c6 t0613: mark as leak-free
We can mark t0613 as leak-free:

    $ make test SANITIZE=leak GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_SANITIZE_LEAK_LOG=true T=t0613-reftable-write-options.sh
    [...]
    *** t0613-reftable-write-options.sh ***
    in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
    ok 1 - default write options
    ok 2 - disabled reflog writes no log blocks
    ok 3 - many refs results in multiple blocks
    ok 4 - tiny block size leads to error
    ok 5 - small block size leads to multiple ref blocks
    ok 6 - small block size fails with large reflog message
    ok 7 - block size exceeding maximum supported size
    ok 8 - restart interval at every single record
    ok 9 - restart interval exceeding maximum supported interval
    ok 10 - object index gets written by default with ref index
    ok 11 - object index can be disabled
    # passed all 11 test(s)
    1..11
    # faking up non-zero exit with --invert-exit-code
    make[2]: *** [Makefile:75: t0613-reftable-write-options.sh] Error 1

Do it.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 12:29:01 -07:00
Abhijeet Sonar 231cf7370e pathspec: fix typo "glossary-context.txt" -> "glossary-content.txt"
The pathspec syntax is explained in the file "glossary-content.txt".
Moreover, no file named "glossary-context.txt" exists in the repository.

Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 12:19:26 -07:00
René Scharfe 4b837f821e submodule--helper: use strvec_pushf() for --super-prefix
Use the strvec_pushf() call that already appends a slash to also produce
the stuck form of the option --super-prefix instead of adding the option
name in a separate call of strvec_push() or strvec_pushl().  This way we
can more easily see that these parts make up a single option with its
argument and save a function call.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 12:18:22 -07:00
Csókás, Bence c852531f45 git-send-email: use sanitized address when reading mbox body
Addresses that are mentioned on the trailers in the commit log
messages (e.g., "Reviewed-by") are added to the "Cc:" list by "git
send-email".  These hand-written addresses, however, may be
malformed (e.g., having unquoted "." and other punctutation marks in
the display-name part) and can upset MTA.

The code does use the sanitize_address() helper on these
address-looking strings to turn them into valid addresses, but it is
used only to see if the address should be suppressed.  The original
string taken from the message is added to the @cc list if the code
decides the address is not suppressed.

Because the addresses on trailer lines are hand-written and more
likely to contain malformed addresses, when adding to the @cc list,
use the result from sanitize_address, not the original.  Note that
we do not modify the behaviour for addresses taken from the e-mail
headers, as they are more likely to be machine generated and
well-formed.

Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-01 11:38:29 -07:00
Orgad Shaneh f402c7941f git-gui: fix inability to quit after closing another instance
If you open 2 git gui instances in the same directory, then close one
of them and try to close the other, an error message pops up, saying:
'error renaming ".git/GITGUI_BCK": no such file or directory', and it
is no longer possible to close the window ever.

Fix by catching this error, and proceeding even if the file no longer
exists.

Signed-off-by: Orgad Shaneh <orgads@gmail.com>
2024-06-30 09:15:04 +03:00
Junio C Hamano 790a17fb19 Sync with 'maint' 2024-06-28 16:03:59 -07:00
Junio C Hamano 09e5e7f718 More post 2.45.2 updates from the 'master' front
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-28 15:53:19 -07:00
Junio C Hamano 5d5675515e Merge branch 'ds/ahead-behind-fix' into maint-2.45
Fix for a progress bar.

* ds/ahead-behind-fix:
  commit-graph: increment progress indicator
2024-06-28 15:53:19 -07:00
Junio C Hamano 112bd6a67c Merge branch 'ds/doc-add-interactive-singlekey' into maint-2.45
Doc update.

* ds/doc-add-interactive-singlekey:
  doc: interactive.singleKey is disabled by default
2024-06-28 15:53:18 -07:00
Junio C Hamano ce359a4dcc Merge branch 'jc/varargs-attributes' into maint-2.45
Varargs functions that are unannotated as printf-like or execl-like
have been annotated as such.

* jc/varargs-attributes:
  __attribute__: add a few missing format attributes
  __attribute__: mark some functions with LAST_ARG_MUST_BE_NULL
  __attribute__: remove redundant attribute declaration for git_die_config()
  __attribute__: trace2_region_enter_printf() is like "printf"
2024-06-28 15:53:18 -07:00
Junio C Hamano b2a62b6a42 Merge branch 'ps/ci-fix-detection-of-ubuntu-20' into maint-2.45
Fix for an embarrassing typo that prevented Python2 tests from running
anywhere.

* ps/ci-fix-detection-of-ubuntu-20:
  ci: fix check for Ubuntu 20.04
2024-06-28 15:53:17 -07:00
Junio C Hamano f30e5332e4 Merge branch 'jk/cap-exclude-file-size' into maint-2.45
An overly large ".gitignore" files are now rejected silently.

* jk/cap-exclude-file-size:
  dir.c: reduce max pattern file size to 100MB
  dir.c: skip .gitignore, etc larger than INT_MAX
2024-06-28 15:53:17 -07:00
Junio C Hamano ce75d32b99 Merge branch 'jc/safe-directory-leading-path' into maint-2.45
The safe.directory configuration knob has been updated to
optionally allow leading path matches.

* jc/safe-directory-leading-path:
  safe.directory: allow "lead/ing/path/*" match
2024-06-28 15:53:17 -07:00
Junio C Hamano 7b7db54b83 Merge branch 'rs/difftool-env-simplify' into maint-2.45
Code simplification.

* rs/difftool-env-simplify:
  difftool: add env vars directly in run_file_diff()
2024-06-28 15:53:16 -07:00
Junio C Hamano 6e3eb346ed Merge branch 'ps/fix-reinit-includeif-onbranch' into maint-2.45
"git init" in an already created directory, when the user
configuration has includeif.onbranch, started to fail recently,
which has been corrected.

* ps/fix-reinit-includeif-onbranch:
  setup: fix bug with "includeIf.onbranch" when initializing dir
2024-06-28 15:53:16 -07:00
Junio C Hamano 903b4da27f Merge branch 'es/chainlint-ncores-fix' into maint-2.45
The chainlint script (invoked during "make test") did nothing when
it failed to detect the number of available CPUs.  It now falls
back to 1 CPU to avoid the problem.

* es/chainlint-ncores-fix:
  chainlint.pl: latch CPU count directly reported by /proc/cpuinfo
  chainlint.pl: fix incorrect CPU count on Linux SPARC
  chainlint.pl: make CPU count computation more robust
2024-06-28 15:53:15 -07:00
Junio C Hamano 2988b82b87 Merge branch 'jc/rev-parse-fatal-doc' into maint-2.45
Doc update.

* jc/rev-parse-fatal-doc:
  rev-parse: document how --is-* options work outside a repository
2024-06-28 15:53:14 -07:00
Junio C Hamano 0d56a5946a Merge branch 'jc/doc-diff-name-only' into maint-2.45
The documentation for "git diff --name-only" has been clarified
that it is about showing the names in the post-image tree.

* jc/doc-diff-name-only:
  diff: document what --name-only shows
2024-06-28 15:53:14 -07:00
Junio C Hamano db9d38d9bb Merge branch 'mt/t0211-typofix' into maint-2.45
Test fix.

* mt/t0211-typofix:
  t/t0211-trace2-perf.sh: fix typo patern -> pattern
2024-06-28 15:53:13 -07:00
Junio C Hamano db15f4d794 Merge branch 'dg/fetch-pack-code-cleanup' into maint-2.45
Code clean-up to remove an unused struct definition.

* dg/fetch-pack-code-cleanup:
  fetch-pack: remove unused 'struct loose_object_iter'
2024-06-28 15:53:13 -07:00
Junio C Hamano 0c6c514c50 Merge branch 'dm/update-index-doc-fix' into maint-2.45
Doc fix.

* dm/update-index-doc-fix:
  documentation: git-update-index: add --show-index-version to synopsis
2024-06-28 15:53:12 -07:00
Junio C Hamano b608b33f3d Merge branch 'ds/scalar-reconfigure-all-fix' into maint-2.45
Scalar fix.

* ds/scalar-reconfigure-all-fix:
  scalar: avoid segfault in reconfigure --all
2024-06-28 15:53:12 -07:00
Junio C Hamano abfdc596d8 Merge branch 'vd/doc-merge-tree-x-option' into maint-2.45
Doc update.

* vd/doc-merge-tree-x-option:
  Documentation/git-merge-tree.txt: document -X
2024-06-28 15:53:11 -07:00
Junio C Hamano 0d23421e2a Merge branch 'fa/p4-error' into maint-2.45
P4 update.

* fa/p4-error:
  git-p4: show Perforce error to the user
2024-06-28 15:53:11 -07:00
Junio C Hamano 6840423c6f Merge branch 'tb/attr-limits' into maint-2.45
The maximum size of attribute files is enforced more consistently.

* tb/attr-limits:
  attr.c: move ATTR_MAX_FILE_SIZE check into read_attr_from_buf()
2024-06-28 15:53:10 -07:00
Junio C Hamano a5adab9b16 Merge branch 'rs/diff-parseopts-cleanup' into maint-2.45
Code clean-up to remove code that is now a noop.

* rs/diff-parseopts-cleanup:
  diff-lib: stop calling diff_setup_done() in do_diff_cache()
2024-06-28 15:53:10 -07:00
Junio C Hamano fc636b413b Merge branch 'dk/zsh-git-repo-path-fix' into maint-2.45
Command line completion support for zsh (in contrib/) has been
updated to stop exposing internal state to end-user shell
interaction.

* dk/zsh-git-repo-path-fix:
  completion: zsh: stop leaking local cache variable
2024-06-28 15:53:09 -07:00
Junio C Hamano 079323dc6d Merge branch 'bc/zsh-compatibility' into maint-2.45
zsh can pretend to be a normal shell pretty well except for some
glitches that we tickle in some of our scripts. Work them around
so that "vimdiff" and our test suite works well enough with it.

* bc/zsh-compatibility:
  vimdiff: make script and tests work with zsh
  t4046: avoid continue in &&-chain for zsh
2024-06-28 15:53:09 -07:00
Junio C Hamano 1b1b4d490d Merge branch 'js/for-each-repo-keep-going' into maint-2.45
A scheduled "git maintenance" job is expected to work on all
repositories it knows about, but it stopped at the first one that
errored out.  Now it keeps going.

* js/for-each-repo-keep-going:
  maintenance: running maintenance should not stop on errors
  for-each-repo: optionally keep going on an error
2024-06-28 15:53:08 -07:00
Junio C Hamano 2a78de0d9f Merge branch 'aj/stash-staged-fix' into maint-2.45
"git stash -S" did not handle binary files correctly, which has
been corrected.

* aj/stash-staged-fix:
  stash: fix "--staged" with binary files
2024-06-28 15:53:07 -07:00
Junio C Hamano a41463e437 Merge branch 'xx/disable-replace-when-building-midx' into maint-2.45
The procedure to build multi-pack-index got confused by the
replace-refs mechanism, which has been corrected by disabling the
latter.

* xx/disable-replace-when-building-midx:
  midx: disable replace objects
2024-06-28 15:53:07 -07:00
Junio C Hamano 332bcf74ea Merge branch 'pw/rebase-m-signoff-fix' into maint-2.45
"git rebase --signoff" used to forget that it needs to add a
sign-off to the resulting commit when told to continue after a
conflict stops its operation.

* pw/rebase-m-signoff-fix:
  rebase -m: fix --signoff with conflicts
  sequencer: store commit message in private context
  sequencer: move current fixups to private context
  sequencer: start removing private fields from public API
  sequencer: always free "struct replay_opts"
2024-06-28 15:53:06 -07:00
Derrick Stolee 114bff72ac sparse-index: improve lstat caching of sparse paths
The clear_skip_worktree_from_present_files() method was first introduced
in af6a51875a (repo_read_index: clear SKIP_WORKTREE bit from files
present in worktree, 2022-01-14) to allow better interaction with the
working directory in the presence of paths outside of the
sparse-checkout. The initial implementation would lstat() every single
SKIP_WORKTREE path to see if it existed; if it ran across a sparse
directory that existed (when a sparse index was in use), then it would
expand the index and then check every SKIP_WORKTREE path.

Since these lstat() calls were very expensive, this was improved in
d79d299352 (Accelerate clear_skip_worktree_from_present_files() by
caching, 2022-01-14) by caching directories that do not exist so it
could avoid lstat()ing any files under such directories. However, there
are some inefficiencies in that caching mechanism.

The caching mechanism stored only the parent directory as not existing,
even if a higher parent directory also does not exist. This means that
wasted lstat() calls would occur when the paths passed to path_found()
change immediate parent directories but within the same parent directory
that does not exist.

To create an example repository that demonstrates this problem, it helps
to have a directory outside of the sparse-checkout that contains many
deep paths. In particular, the first paths (in lexicographic order)
underneath the sparse directory should have deep directory structures,
maximizing the difference between the old caching algorithm that looks
to a single parent and the new caching algorithm that looks to the
top-most missing directory.

The performance test script p2000-sparse-operations.sh takes the sample
repository and copies its HEAD to several copies nested in directories
of the form f<i>/f<j>/f<k> where i, j, and k are numbers from 1 to 4.
The sparse-checkout cone is then selected as "f2/f4/". Creating "f1/f1/"
will trigger the behavior and also lead to some interesting cases for
the caching algorithm since "f1/f1/" exists but "f1/f2/" and "f3/" do
not.

This is difficult to notice when running performance tests using the Git
repository (or a blow-up of the Git repository, as in
p2000-sparse-operations.sh) because Git has a very shallow directory
structure.

This change reorganizes the caching algorithm to focus on storing the
highest level leading directory that does not exist; specifically this
means that that directory's parent _does_ exist. By doing a little extra
work on a path passed to path_found(), we can short-circuit all of the
paths passed to path_found() afterwards that match a prefix with that
non-existing directory. When in a repository where the first sparse file
is likely to have a much deeper path than the first non-existing
directory, this can realize significant gains.

The details of this algorithm require careful attention, so the new
implementation of path_found() has detailed comments, including the use
of a new max_common_dir_prefix() method that may be of independent
interest.

It's worth noting that this is not universally positive, since we are
doing extra lstat() calls to establish the exact path to cache. In the
blow-up of the Git repository, we can see that the lstat count
_increases_ from 28 to 31. However, these numbers were already
artificially low.

Contributor Elijah Newren created a publicly-available test repository
that demonstrates the difference in these caching algorithms in the most
extreme way. To test, follow these steps:

  git clone --sparse https://github.com/newren/gvfs-like-git-bomb
  cd gvfs-like-git-bomb
  ./runme.sh                   # NOTE: check scripts before running!

At this point, assuming you do not have index.sparse=true set globally,
the index has one million paths with the SKIP_WORKTREE bit and they will
all be sent to path_found() in the sparse loop. You can measure this by
running 'git status' with GIT_TRACE2_PERF=1:

    Sparse files in the index: 1,000,000
  sparse_lstat_count (before):   200,000
   sparse_lstat_count (after):         2

And here are the performance numbers:

  Benchmark 1: old
    Time (mean ± σ):     397.5 ms ±   4.1 ms
    Range (min … max):   391.2 ms … 404.8 ms    10 runs

  Benchmark 2: new
    Time (mean ± σ):     252.7 ms ±   3.1 ms
    Range (min … max):   249.4 ms … 259.5 ms    11 runs

  Summary
    'new' ran
      1.57 ± 0.02 times faster than 'old'

By modifying this example further, we can demonstrate a more realistic
example and include the sparse index expansion. Continue by creating
this directory, confusing both caching algorithms somewhat:

  mkdir -p bomb/d/e/f/a/a

Then re-run the 'git status' tests to see these statistics:

    Sparse files in the index: 1,000,000
  sparse_lstat_count (before):   724,010
   sparse_lstat_count (after):       106

  Benchmark 1: old
    Time (mean ± σ):     753.0 ms ±   3.5 ms
    Range (min … max):   749.7 ms … 760.9 ms    10 runs

  Benchmark 2: new
    Time (mean ± σ):     201.4 ms ±   3.2 ms
    Range (min … max):   196.0 ms … 207.9 ms    14 runs

  Summary
    'new' ran
      3.74 ± 0.06 times faster than 'old'

Note that if this repository had a sparse index enabled, the additional
cost of expanding the sparse index affects the total time of these
commands by over four seconds, significantly diminishing the benefit of
the caching algorithm. Having existing paths outside of the
sparse-checkout is a known performance issue for the sparse index and is
a known trade-off for the performance benefits given when no such paths
exist.

Using an internal monorepo with over two million paths at HEAD and a
typical sparse-checkout cone such that the sparse index contains
~190,000 entries (including over two thousand sparse trees), I was able
to measure these lstat counts when one sparse directory actually exists
on disk:

  Sparse files in expanded index: 1,841,997
       full_lstat_count (before): 1,188,161
       full_lstat_count  (after):     4,404

This resulted in this absolute time change, on a warm disk:

      Time in full loop (before): 13.481 s
      Time in full loop  (after):  0.081 s

(These times were calculated on a Windows machine, where lstat() is
slower than a similar Linux machine.)

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-28 12:32:12 -07:00
Derrick Stolee c4e8c42c19 sparse-index: count lstat() calls
The clear_skip_worktree.. methods already report some statistics about
how many cache entries are checked against path_found() due to having
the skip-worktree bit set. However, due to path_found() performing some
caching, this isn't the only information that would be helpful to
report.

Add a new lstat_count member to the path_found_data struct to count the
number of times path_found() calls lstat(). This will be helpful to help
explain performance problems in this method as well as to demonstrate
future changes to the caching algorithm in a more concrete way than
end-to-end timings.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-28 12:32:12 -07:00
Derrick Stolee 23dd6f8bcc sparse-index: use strbuf in path_found()
The path_found() method previously reused strings from the cache entries
the calling methods were using. This prevents string manipulation in
place and causes some odd reallocation before the final lstat() call in
the method.

Refactor the method to use strbufs and copy the path into the strbuf,
but also only the parent directory and not the whole path. This looks
like extra copying when assigning the path to the strbuf, but we save an
allocation by dropping the 'tmp' string, and we are "reusing" the copy
from 'tmp' to put the data in the strbuf.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-28 12:32:11 -07:00
Derrick Stolee b746a85d9a sparse-index: refactor path_found()
In advance of changing the behavior of path_found(), take all of the
intermediate data values and group them into a single struct. This
simplifies the method prototype as well as the initialization. Future
changes can be made directly to the struct and method without changing
the callers with this approach.

Note that the clear_path_found_data() method is currently empty, as
there is nothing to free. This method is a placeholder for future
changes that require a non-trivial implementation. Its stub is created
now so consumers could call it now and not change in future changes.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-28 12:32:11 -07:00
Derrick Stolee 532e216986 sparse-checkout: refactor skip worktree retry logic
The clear_skip_worktree_from_present_files() method was introduced in
af6a51875a (repo_read_index: clear SKIP_WORKTREE bit from files present
in worktree, 2022-01-14) to help cases where sparse-checkout is enabled
but some paths outside of the sparse-checkout also exist on disk.  This
operation can be slow as it needs to check path existence in a way not
stored in the index, so caching was introduced in d79d299352 (Accelerate
clear_skip_worktree_from_present_files() by caching, 2022-01-14).

This check is particularly confusing in the presence of a sparse index,
as a sparse tree entry corresponding to an existing directory must first
be expanded to a full index before examining the paths within. This is
currently implemented using a 'goto' and a boolean variable to ensure we
restart only once.

Even with that caching, it was noticed that this could take a long time
to execute. 89aaab11a3 (index: add trace2 region for clear skip
worktree, 2022-11-03) introduced trace2 regions to measure this time.
Further, the way the loop repeats itself was slightly confusing and
prone to breakage, so a BUG() statement was added in 8c7abdc596 (index:
raise a bug if the index is materialised more than once, 2022-11-03) to
be sure that the second run of the loop does not hit any sparse trees.

One thing that can be confusing about the current setup is that the
trace2 regions nest and it is not clear that a second loop is running
after a sparse index is expanded. Here is an example of what the regions
look like in a typical case:

| region_enter | ... | label:clear_skip_worktree_from_present_files
| region_enter | ... | ..label:update
| region_leave | ... | ..label:update
| region_enter | ... | ..label:ensure_full_index
| region_enter | ... | ....label:update
| region_leave | ... | ....label:update
| region_leave | ... | ..label:ensure_full_index
| data         | ... | ..sparse_path_count:1
| data         | ... | ..sparse_path_count_full:269538
| region_leave | ... | label:clear_skip_worktree_from_present_files

One thing that is particularly difficult to understand about these
regions is that most of the time is spent between the close of the
ensure_full_index region and the reporting of the end data. This is
because of the restart of the loop being within the same region as the
first iteration of the loop.

This change refactors the method into two separate methods that are
traced separately. This will be more important later when we change
other features of the methods, but for now the only functional change is
the difference in the structure of the trace regions.

After this change, the same telemetry section is split into three
distinct chunks:

| region_enter | ... | label:clear_skip_worktree_from_present_files_sparse
| data         | ... | ..sparse_path_count:1
| region_leave | ... | label:clear_skip_worktree_from_present_files_sparse
| region_enter | ... | label:update
| region_leave | ... | label:update
| region_enter | ... | label:ensure_full_index
| region_enter | ... | ..label:update
| region_leave | ... | ..label:update
| region_leave | ... | label:ensure_full_index
| region_enter | ... | label:clear_skip_worktree_from_present_files_full
| data         | ... | ..full_path_count:269538
| region_leave | ... | label:clear_skip_worktree_from_present_files_full

Here, we see the sparse loop terminating early with its first sparse
path being a sparse directory containing a file. Then, that loop's
region terminates before ensure_full_index begins (in this case, the
cache-tree must also be computed). Then, _after_ the index is expanded,
the full loop begins with its own region.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-28 12:32:10 -07:00
Junio C Hamano daed0c68e9 The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-27 09:20:00 -07:00
Junio C Hamano b781a3e08e Merge branch 'jk/fetch-pack-fsck-wo-lock-pack'
"git fetch-pack -k -k" without passing "--lock-pack" (which we
never do ourselves) did not work at all, which has been corrected.

* jk/fetch-pack-fsck-wo-lock-pack:
  fetch-pack: fix segfault when fscking without --lock-pack
2024-06-27 09:19:59 -07:00
Junio C Hamano 5dce36e04f Merge branch 'rs/remove-unused-find-header-mem'
Code clean-up.

* rs/remove-unused-find-header-mem:
  commit: remove find_header_mem()
2024-06-27 09:19:59 -07:00
Junio C Hamano b8d1a1b06c Merge branch 'jk/t5500-typofix'
A helper function shared between two tests had a copy-paste bug,
which has been corrected.

* jk/t5500-typofix:
  t5500: fix mistaken $SERVER reference in helper function
2024-06-27 09:19:59 -07:00
Junio C Hamano 424a13db64 Merge branch 'js/mingw-remove-unused-extern-decl'
An unused extern declaration for mingw has been removed to prevent
it from causing build failure.

* js/mingw-remove-unused-extern-decl:
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
2024-06-27 09:19:58 -07:00
Junio C Hamano 6c0bfce914 Merge branch 'kz/merge-fail-early-upon-refresh-failure'
When "git merge" sees that the index cannot be refreshed (e.g. due
to another process doing the same in the background), it died but
after writing MERGE_HEAD etc. files, which was useless for the
purpose to recover from the failure.

* kz/merge-fail-early-upon-refresh-failure:
  merge: avoid write merge state when unable to write index
2024-06-27 09:19:58 -07:00
Jeff King 407cdbd271 t/lib-bundle-uri: use local fake bundle URLs
A few of the bundle URI tests point config at a fake bundle; they care
only that the client has been configured with _some_ bundle, but it
doesn't have to actually contain objects.

For the file:// tests, we use "$BUNDLE_URI_REPO_URI/fake.bdl", a
non-existent file inside the actual remote repo. But for git:// and
http:// tests, we use "https://example.com/fake.bdl". This works OK in
practice, but it means we actually make a request to example.com (which
returns a placeholder HTML response). That can be annoying when running
the test suite on a spotty network (it doesn't produce a wrong result,
since we expect it to fail, but it may introduce delays).

We can reduce our dependency on the outside world by using a local URL.
It would work to just do "file://$PWD/fake.bdl" here, since the bundle
code does not care about the actual location. But in the long run I
suspect we may have more restrictions on which protocols can be passed
around as bundle URIs. So instead, let's stick with the file:// repo's
pattern and just point to a bogus name based on the remote repo's URL.

For http this makes perfect sense; we'll make a request to the local
http server and find that there's nothing there. For git:// it's a
little weird, as you wouldn't normally access a bundle file over git://
at all. But it's probably the most reasonable guess we can make for now,
and anybody who tightens protocol selection later will know better
what's the best path forward.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-26 14:31:18 -07:00
Jeff King e6653ec3c6 t5551: do not confirm that bogus url cannot be used
t5551 tries to access a URL with a bogus hostname and confirms that
http.curloptResolve lets us use this otherwise unresolvable name.

Before doing so, though, we confirm that trying to access the bogus
hostname without http.curloptResolve fails as expected. This isn't
testing Git at all, but is confirming the test's assumptions. That's
often a good thing to do, but in this case it means that we'll actually
try to resolve the external name. Even though it's unlikely that
"gitbogusexamplehost.invalid" would ever resolve, the DNS lookup itself
may take time.

It's probably reasonable to just assume that this obviously-bogus name
would not actually resolve in practice, which lets us reduce our test
suite's dependency on the outside world.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-26 14:31:18 -07:00
Jeff King 63ec97faf7 t5553: use local url for invalid fetch
We test how "fetch --set-upstream" behaves when given an invalid URL,
using the bogus URL "http://nosuchdomain.example.com". But finding out
that it is invalid requires an actual DNS lookup.

Reduce our dependency on external factors by using an invalid local
filesystem URL, which works just as well for our purposes.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-26 14:31:17 -07:00
Abhijeet Sonar b8ae42e292 describe: refresh the index when 'broken' flag is used
When describe is run with 'dirty' flag, we refresh the index
to make sure it is in sync with the filesystem before
determining if the working tree is dirty.  However, this is
not done for the codepath where the 'broken' flag is used.

This causes `git describe --broken --dirty` to false
positively report the worktree being dirty if a file has
different stat info than what is recorded in the index.
Running `git update-index -q --refresh` to refresh the index
before running diff-index fixes the problem.

Also add tests to deliberately update stat info of a
file before running describe to verify it behaves correctly.

Reported-by: Paul Millar <paul.millar@desy.de>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-26 13:04:08 -07:00
Junio C Hamano 72c282098d archive: document that --add-virtual-file takes full path
Tom Scogland noticed that `--add-virtual-file` option uses the path
specified as its value as-is, without prepending any value given to
the `--prefix` option like `--add-file` does.

The behaviour has always been that way since the option was
introduced, but the documentation has always been wrong and said
that it would use the value of `--prefix` just like `--add-file`
does.

We could modify the behaviour to make it literally work like the
documentation said, but it would break existing scripts the users
use.

Noticed-by: Tom Scogland <scogland1@llnl.gov>
Acked-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-26 12:56:45 -07:00