While not immediately obvious, git-show-ref(1) actually implements three
different subcommands:
- `git show-ref <patterns>` can be used to list references that
match a specific pattern.
- `git show-ref --verify <refs>` can be used to list references.
These are _not_ patterns.
- `git show-ref --exclude-existing` can be used as a filter that
reads references from standard input, performing some conversions
on each of them.
Let's make this more explicit in the code by splitting up the three
subcommands into separate functions. This also allows us to address the
confusingly named `patterns` variable, which may hold either patterns or
reference names.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `pattern` variable is a global variable that tracks either the
reference names (not patterns!) for the `--verify` mode or the patterns
for the non-verify mode. This is a bit confusing due to the slightly
different meanings.
Convert the variable to be local. While this does not yet fix the double
meaning of the variable, this change allows us to address it in a
subsequent patch more easily by explicitly splitting up the different
subcommands of git-show-ref(1).
Note that we introduce a `struct show_ref_data` to pass the patterns to
`show_ref()`. While this is overengineered now, we will extend this
structure in a subsequent patch.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--missing` object option in rev-list currently works only with
missing blobs/trees. For missing commits the revision walker fails with
a fatal error.
Let's extend the functionality of `--missing` option to also support
commit objects. This is done by adding a `missing_objects` field to
`rev_info`. This field is an `oidset` to which we'll add the missing
commits as we encounter them. The revision walker will now continue the
traversal and call `show_commit()` even for missing commits. In rev-list
we can then check if the commit is a missing commit and call the
existing code for parsing `--missing` objects.
A scenario where this option would be used is to find the boundary
objects between different object directories. Consider a repository with
a main object directory (GIT_OBJECT_DIRECTORY) and one or more alternate
object directories (GIT_ALTERNATE_OBJECT_DIRECTORIES). In such a
repository, using the `--missing=print` option while disabling the
alternate object directory allows us to find the boundary objects
between the main and alternate object directory.
Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `show_commit()` function already depends on `finish_commit()`, and
in the upcoming commit, we'll also add a dependency on
`finish_object__ma()`. Since in C symbols must be declared before
they're used, let's move `show_commit()` below both `finish_commit()`
and `finish_object__ma()`, so the code is cleaner as a whole without the
need for declarations.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bit `do_not_die_on_missing_tree` is used in revision.h to ensure the
revision walker does not die when encountering a missing tree. This is
currently exclusively set within `builtin/rev-list.c` to ensure the
`--missing` option works with missing trees.
In the upcoming commits, we will extend `--missing` to also support
missing commits. So let's rename the bit to
`do_not_die_on_missing_objects`, which is object type agnostic and can
be used for both trees/commits.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ps/do-not-trust-commit-graph-blindly-for-existence:
commit: detect commits that exist in commit-graph but not in the ODB
commit-graph: introduce envvar to disable commit existence checks
Commit graphs can become stale and contain references to commits that do
not exist in the object database anymore. Theoretically, this can lead
to a scenario where we are able to successfully look up any such commit
via the commit graph even though such a lookup would fail if done via
the object database directly.
As the commit graph is mostly intended as a sort of cache to speed up
parsing of commits we do not want to have diverging behaviour in a
repository with and a repository without commit graphs, no matter
whether they are stale or not. As commits are otherwise immutable, the
only thing that we really need to care about is thus the presence or
absence of a commit.
To address potentially stale commit data that may exist in the graph,
our `lookup_commit_in_graph()` function will check for the commit's
existence in both the commit graph, but also in the object database. So
even if we were able to look up the commit's data in the graph, we would
still pretend as if the commit didn't exist if it is missing in the
object database.
We don't have the same safety net in `parse_commit_in_graph_one()`
though. This function is mostly used internally in "commit-graph.c"
itself to validate the commit graph, and this usage is fine. We do
expose its functionality via `parse_commit_in_graph()` though, which
gets called by `repo_parse_commit_internal()`, and that function is in
turn used in many places in our codebase.
For all I can see this function is never used to directly turn an object
ID into a commit object without additional safety checks before or after
this lookup. What it is being used for though is to walk history via the
parent chain of commits. So when commits in the parent chain of a graph
walk are missing it is possible that we wouldn't notice if that missing
commit was part of the commit graph. Thus, a query like `git rev-parse
HEAD~2` can succeed even if the intermittent commit is missing.
It's unclear whether there are additional ways in which such stale
commit graphs can lead to problems. In any case, it feels like this is a
bigger bug waiting to happen when we gain additional direct or indirect
callers of `repo_parse_commit_internal()`. So let's fix the inconsistent
behaviour by checking for object existence via the object database, as
well.
This check of course comes with a performance penalty. The following
benchmarks have been executed in a clone of linux.git with stable tags
added:
Benchmark 1: git -c core.commitGraph=true rev-list --topo-order --all (git = master)
Time (mean ± σ): 2.913 s ± 0.018 s [User: 2.363 s, System: 0.548 s]
Range (min … max): 2.894 s … 2.950 s 10 runs
Benchmark 2: git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
Time (mean ± σ): 3.834 s ± 0.052 s [User: 3.276 s, System: 0.556 s]
Range (min … max): 3.780 s … 3.961 s 10 runs
Benchmark 3: git -c core.commitGraph=false rev-list --topo-order --all (git = master)
Time (mean ± σ): 13.841 s ± 0.084 s [User: 13.152 s, System: 0.687 s]
Range (min … max): 13.714 s … 13.995 s 10 runs
Benchmark 4: git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
Time (mean ± σ): 13.762 s ± 0.116 s [User: 13.094 s, System: 0.667 s]
Range (min … max): 13.645 s … 14.038 s 10 runs
Summary
git -c core.commitGraph=true rev-list --topo-order --all (git = master) ran
1.32 ± 0.02 times faster than git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
4.72 ± 0.05 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
4.75 ± 0.04 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = master)
We look at a ~30% regression in general, but in general we're still a
whole lot faster than without the commit graph. To counteract this, the
new check can be turned off with the `GIT_COMMIT_GRAPH_PARANOIA` envvar.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our `lookup_commit_in_graph()` helper tries to look up commits from the
commit graph and, if it doesn't exist there, falls back to parsing it
from the object database instead. This is intended to speed up the
lookup of any such commit that exists in the database. There is an edge
case though where the commit exists in the graph, but not in the object
database. To avoid returning such stale commits the helper function thus
double checks that any such commit parsed from the graph also exists in
the object database. This makes the function safe to use even when
commit graphs aren't updated regularly.
We're about to introduce the same pattern into other parts of our code
base though, namely `repo_parse_commit_internal()`. Here the extra
sanity check is a bit of a tougher sell: `lookup_commit_in_graph()` was
a newly introduced helper, and as such there was no performance hit by
adding this sanity check. If we added `repo_parse_commit_internal()`
with that sanity check right from the beginning as well, this would
probably never have been an issue to begin with. But by retrofitting it
with this sanity check now we do add a performance regression to
preexisting code, and thus there is a desire to avoid this or at least
give an escape hatch.
In practice, there is no inherent reason why either of those functions
should have the sanity check whereas the other one does not: either both
of them are able to detect this issue or none of them should be. This
also means that the default of whether we do the check should likely be
the same for both. To err on the side of caution, we thus rather want to
make `repo_parse_commit_internal()` stricter than to loosen the checks
that we already have in `lookup_commit_in_graph()`.
The escape hatch is added in the form of a new GIT_COMMIT_GRAPH_PARANOIA
environment variable that mirrors GIT_REF_PARANOIA. If enabled, which is
the default, we will double check that commits looked up in the commit
graph via `lookup_commit_in_graph()` also exist in the object database.
This same check will also be added in `repo_parse_commit_internal()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As an attempt to come up with a useful mechanism to ensure that
certain messages are left untranslated [*], we earlier wrote
GIT_TEST_GETTEXT_POISON off as a failed experiment.
But the output from the test helper was easier to use while
debugging failed tests, compared to the same test writtein with the
plain-vanilla "grep". Especially when a test that expects a certain
string to appear in the output (e.g. "this test must fail with this
message") fails, "grep message output" would just silently fail and
in a &&-chained sequence of commands, it is hard to tell which step
failed. test_i18ngrep explicitly said "we wanted to see a line that
match this pattern but did not see a hit in this file".
What we have as test_i18ngrep in our tree still retains this verbose
output (even though we got rid of the "poison" support). Let's
rename it to test_grep (because it is no longer about i18n at all)
and then make test_i18ngrep a thin wrapper around it. Existing
callers of test_i18ngrep can be mechanically rewritten to instead
use test_grep over time, but it does not have to be done in this
commit.
[Footnote]
* The idea was that human-facing messages are often translated, but
there are messages that should never be translated. We use
"grep" only for the latter kind of messages, and then run tests
in "poison" mode that spew garbage for translatable messages. If
such a test run fails, it means these messages tested with "grep"
were marked for translation by mistake. test_i18ngrep was to be
used for other messages that are to be translated, and was to
always "succeed" when runing under the "poison" mode.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Information on how users are accessing hosted repositories can be
helpful to server operators. For example, being able to broadly
differentiate between fetches and initial clones; the use of shallow
repository features; or partial clone filters.
a29263c (fetch-pack: add tracing for negotiation rounds, 2022-08-02)
added some information on have counts to fetch-pack itself to help
diagnose negotiation; but from a git-upload-pack (server) perspective,
there's no means of accessing such information without using
GIT_TRACE_PACKET to examine the protocol packets.
Improve this by emitting a Trace2 JSON event from upload-pack with
summary information on the contents of a fetch request.
* haves, wants, and want-ref counts can help determine (broadly) between
fetches and clones, and the use of single-branch, etc.
* shallow clone depth, tip counts, and deepening options.
* any partial clone filter type.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The codepath to handle recipient addresses `git send-email
--compose` learns from the user was completely broken, which has
been corrected.
* jk/send-email-fix-addresses-from-composed-messages:
send-email: handle to/cc/bcc from --compose message
Revert "send-email: extract email-parsing code into a subroutine"
doc/send-email: mention handling of "reply-to" with --compose
Code clean-up.
* ob/rebase-cleanup:
rebase: move parse_opt_keep_empty() down
rebase: handle --strategy via imply_merge() as well
rebase: simplify code related to imply_merge()
The pathspec code carelessly dereferenced NULL while emitting an
error message, which has been corrected.
* kh/pathspec-error-wo-repository-fix:
grep: die gracefully when outside repository
"git p4" tried to store symlinks to LFS when told, but has been
fixed not to do so, because it does not make sense.
* mm/p4-symlink-with-lfs:
git-p4 shouldn't attempt to store symlinks in LFS
The attribute subsystem learned to honor `attr.tree` configuration
that specifies which tree to read the .gitattributes files from.
* jc/attr-tree-config:
attr: add attr.tree for setting the treeish to read attributes from
attr: read attributes from HEAD when bare repo
Many typos, ungrammatical sentences and wrong phrasing have been
fixed.
* sn/typo-grammo-phraso-fixes:
t/README: fix multi-prerequisite example
doc/gitk: s/sticked/stuck/
git-jump: admit to passing merge mode args to ls-files
doc/diff-options: improve wording of the log.diffMerges mention
doc: fix some typos, grammar and wording issues
33d7bdd645 (builtin/reflog.c: use parse-options api for expire, delete
subcommands, 2022-01-06) broke the option --single-worktree of git
reflog expire and added a non-printable short flag for it, presumably by
accident. While before it set the variable "all_worktrees" to 0, now it
sets it to 1, its default value. --no-single-worktree is required now
to set it to 0.
Fix it by replacing the variable with one that has the opposite meaning,
to avoid the negation and its potential for confusion. The new variable
"single_worktree" directly captures whether --single-worktree was given.
Also remove the unprintable short flag SOH (start of heading) because it
is undocumented, hard to use and is likely to have been added by mistake
in connection with the negation bug above.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use parentheses and pipes to present alternatives in the argument help
for the --empty options of git am and git rebase, like in the rest of
the documentation.
While at it remove a stray use of the enum empty_action value
STOP_ON_EMPTY_COMMIT to indicate that no short option is present.
While it has a value of 0 and thus there is no user-visible change,
that enum is not meant to hold short option characters. Hard-code 0,
like we do for other options without a short option.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let the parse-options code detect and handle the use of options that are
incompatible with --show-current-patch. This requires exposing the
distinction between the "raw" and "diff" sub-modes. Do that by
splitting the mode RESUME_SHOW_PATCH into RESUME_SHOW_PATCH_RAW and
RESUME_SHOW_PATCH_DIFF and stop tracking sub-modes in a separate struct.
The result is a simpler callback function and more precise error
messages. The original reports a spurious argument or a NULL pointer:
$ git am --show-current-patch --show-current-patch=diff
error: options '--show-current-patch=diff' and '--show-current-patch=raw' cannot be used together
$ git am --show-current-patch=diff --show-current-patch
error: options '--show-current-patch=(null)' and '--show-current-patch=diff' cannot be used together
With this patch we get the more precise:
$ git am --show-current-patch --show-current-patch=diff
error: --show-current-patch=diff is incompatible with --show-current-patch
$ git am --show-current-patch=diff --show-current-patch
error: --show-current-patch is incompatible with --show-current-patch=diff
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only a single PARSE_OPT_CMDMODE option can be specified for the same
variable at the same time. This is enforced by get_value(), but the
error messages are imprecise in three ways:
1. If a non-PARSE_OPT_CMDMODE option changes the value variable of a
PARSE_OPT_CMDMODE option then an ominously vague message is shown:
$ t/helper/test-tool parse-options --set23 --mode1
error: option `mode1' : incompatible with something else
Worse: If the order of options is reversed then no error is reported at
all:
$ t/helper/test-tool parse-options --mode1 --set23
boolean: 0
integer: 23
magnitude: 0
timestamp: 0
string: (not set)
abbrev: 7
verbose: -1
quiet: 0
dry run: no
file: (not set)
Fortunately this can currently only happen in the test helper; actual
Git commands don't share the same variable for the value of options with
and without the flag PARSE_OPT_CMDMODE.
2. If there are multiple options with the same value (synonyms), then
the one that is defined first is shown rather than the one actually
given on the command line, which is confusing:
$ git am --resolved --quit
error: option `quit' is incompatible with --continue
3. Arguments of PARSE_OPT_CMDMODE options are not handled by the
parse-option machinery. This is left to the callback function. We
currently only have a single affected option, --show-current-patch of
git am. Errors for it can show an argument that was not actually given
on the command line:
$ git am --show-current-patch --show-current-patch=diff
error: options '--show-current-patch=diff' and '--show-current-patch=raw' cannot be used together
The options --show-current-patch and --show-current-patch=raw are
synonyms, but the error accuses the user of input they did not actually
made. Or it can awkwardly print a NULL pointer:
$ git am --show-current-patch=diff --show-current-patch
error: options '--show-current-patch=(null)' and '--show-current-patch=diff' cannot be used together
The reasons for these shortcomings is that the current code checks
incompatibility only when encountering a PARSE_OPT_CMDMODE option at the
command line, and that it searches the previous incompatible option by
value.
Fix the first two points by checking all PARSE_OPT_CMDMODE variables
after parsing each option and by storing all relevant details if their
value changed. Do that whether or not the changing options has the flag
PARSE_OPT_CMDMODE set. Report an incompatibility only if two options
change the variable to different values and at least one of them is a
PARSE_OPT_CMDMODE option. This changes the output of the first three
examples above to:
$ t/helper/test-tool parse-options --set23 --mode1
error: --mode1 is incompatible with --set23
$ t/helper/test-tool parse-options --mode1 --set23
error: --set23 is incompatible with --mode1
$ git am --resolved --quit
error: --quit is incompatible with --resolved
Store the argument of PARSE_OPT_CMDMODE options of type OPTION_CALLBACK
as well to allow taking over the responsibility for compatibility
checking from the callback function. The next patch will use this
capability to fix the messages for git am --show-current-patch.
Use a linked list for storing the PARSE_OPT_CMDMODE variables. This
somewhat outdated data structure is simple and suffices, as the number
of elements per command is currently only zero or one. We do support
multiple different command modes variables per command, but I don't
expect that we'd ever use a significant number of them. Once we do we
can switch to a hashmap.
Since we no longer need to search the conflicting option, the all_opts
parameter of get_value() is no longer used. Remove it.
Extend the tests to check for both conflicting option names, but don't
insist on a particular order.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-bugreport already rejected unrecognized flag arguments, like
`--diaggnose`, but this doesn't help if the user's mistake was to forget
the `--` in front of the argument. This can result in a user's intended
argument not being parsed with no indication to the user that something
went wrong. Since git-bugreport presently doesn't take any positionals
at all, let's reject all positionals and give the user a usage hint.
Signed-off-by: Emily Shaffer <nasamuffin@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since e6545201ad (Merge branch 'ab/detox-config-gettext', 2021-04-13),
test_i18ngrep is no longer required. Quit using it in the bugreport
tests, since it's setting a bad example for tests added later.
Signed-off-by: Emily Shaffer <nasamuffin@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tutorial in Documentation/MyFirstContribution.txt has steps to print
some text using the "_" function. However, this leads to compiler errors
when running "make" since "gettext.h" is not #included.
Update docs with a note to #include "gettext.h" in "builtin/psuh.c".
Signed-off-by: Jacob Stopak <jacob@initialcommit.io>
Reviewed-by: Emily Shaffer <nasamuffin@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move validation logic below processing of email address lists so that
email validation gets the proper email addresses. As a side effect,
some initialization needed to be moved down. In order for validation
and the actual email sending to have the same initial state, the
initialized variables that get modified by pre_process_file are
encapsulated in a new function.
This fixes email address validation errors when the optional
perl module Email::Valid is installed and multiple addresses are passed
in on a single to/cc argument like --to=foo@example.com,bar@example.com.
A new test was added to t9001 to expose failures with this case in the
future.
Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/SubmittingPatches informs the contributor that gitk's
context menu command "Copy commit summary" can be used to obtain the
conventional format of referencing existing commits. This command in
gitk was renamed to "Copy commit reference" in commit [1], following
implementation of Git's "reference" pretty format in [2].
Update mention of this gitk command in Documentation/SubmittingPatches
to its new name.
[1] b8b60957ce (gitk: rename "commit summary" to "commit reference",
2019-12-12)
[2] commit 1f0fc1d (pretty: implement 'reference' format, 2019-11-20)
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index file has room only for lower 32-bit of the file size in
the cached stat information, which means cached stat information
will have 0 in its sd_size member for a file whose size is multiple
of 4GiB. This is mistaken for a racily clean path. Avoid it by
storing a bogus sd_size value instead for such files.
* bc/racy-4gb-files:
Prevent git from rehashing 4GiB files
t: add a test helper to truncate files
Feeding "git stash store" with a random commit that was not created
by "git stash create" now errors out.
* jc/fail-stash-to-store-non-stash:
stash: be careful what we store
The codepaths that read "chunk" formatted files have been corrected
to pay attention to the chunk size and notice broken files.
* jk/chunk-bounds: (21 commits)
t5319: make corrupted large-offset test more robust
chunk-format: drop pair_chunk_unsafe()
commit-graph: detect out-of-order BIDX offsets
commit-graph: check bounds when accessing BIDX chunk
commit-graph: check bounds when accessing BDAT chunk
commit-graph: bounds-check generation overflow chunk
commit-graph: check size of generations chunk
commit-graph: bounds-check base graphs chunk
commit-graph: detect out-of-bounds extra-edges pointers
commit-graph: check size of commit data chunk
midx: check size of revindex chunk
midx: bounds-check large offset chunk
midx: check size of object offset chunk
midx: enforce chunk alignment on reading
midx: check size of pack names chunk
commit-graph: check consistency of fanout table
midx: check size of oid lookup chunk
commit-graph: check size of oid fanout chunk
midx: stop ignoring malformed oid fanout chunk
t: add library for munging chunk-format files
...
The description of the `git bisect run` command syntax at the beginning
of the manpage is `git bisect run <cmd>...`, which isn't quite clear
about what `<cmd>` is or what the `...` mean; one could think that it is
the whole (quoted) command line with all arguments in a single string,
or that it supports multiple commands, or that it doesn't accept
commands with arguments at all.
Change to `git bisect run <cmd> [<arg>...]` to clarify the syntax,
in both the manpage and the `git bisect -h` command output.
Additionally, change `--term-{new,bad}` et al to `--term-(new|bad)`
for consistency with the synopsis syntax conventions.
Signed-off-by: Javier Mora <cousteaulecommandant@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As per the CodingGuidelines document, it is recommended that error messages
such as die(), error() and warning(), should start with a lowercase letter
and should not end with a period.
This patch adjusts tests to match updated messages.
Signed-off-by: Isoken June Ibizugbe <isokenjune@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unlike "git log --pretty=%D", "git log --pretty="%(decorate)" did
not auto-initialize the decoration subsystem, which has been
corrected.
* ak/pretty-decorate-more-fix:
pretty: fix ref filtering for %(decorate) formats
The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.
* vd/loose-ref-iteration-optimization:
files-backend.c: avoid stat in 'loose_fill_ref_dir'
dir.[ch]: add 'follow_symlink' arg to 'get_dtype'
dir.[ch]: expose 'get_dtype'
ref-cache.c: fix prefix matching in ref iteration
"git merge-tree" learned to take strategy backend specific options
via the "-X" option, like "git merge" does.
* ty/merge-tree-strategy-options:
merge: introduce {copy|clear}_merge_options()
merge-tree: add -X strategy option