gitk uses ${NS} to select between the original Tk widgets and the newer
themed widgets in ttk. As gitk uses only themed widgets from ttk::,
this indirection now serves no purpose, so let's switch to explicit use
of ttk:: via global search/replace. More simplification, including
removal of the NS variable, is kept for a later patch to keep this one
smaller.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
gitk added the option to used themed Tk (ttk) in 0cc08ff7dd ("gitk: Add
a user preference to enable/disable use of themed widgets", 2009-09-05).
Using ttk had to be optional as Tk 8.4, then in common use, does not
have ttk. ttk is the default when available, so the ttk code paths are
by now very well tested. gitk also has code paths for the older default
widgets, increasing the maintenance burden. Let's make ttk non-optional
to reduce code complexity in later commits.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
gitk includes many user defined configuration variables, has all of
these are listed in $config_variables. But this list is not used to
define the variables to be loaded, saved, or restored when cancelling
the configuration dialog, and developers must maintain separate lists of
variables for these purposes. This leads to unnecessary errors and merge
conflicts. Let's replace those separate lists with $config_variables to
make maintenance easier.
While we are on topic, sort the list of names in $config_variables.
This makes it simpler to scan and has fewer chances of conflicts
when new names are introduced.
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
eab5dbab (ci: wire up Meson builds, 2024-12-13) added two instances
of a very similar construct
FAILED_TEST_ARTIFACTS=${TEST_OUTPUT_DIRECTORY:-t}/failed-test-artifacts
one to ci/lib.sh and the other to ci/print-test-failures.sh
Unfortunately, the latter had a typo causing shell to emit "Bad
substitution". Fix it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds the basic support of SHA256 Git repositories.
Most of changes are idiomatic replacement of the hard-coded hash ID
length, but there are subtle things:
* The hash length is determined on startup, and stored in $hashlength
global variable (either 40 or 64).
* The hard-coded "40" are replaced with $hashlength;
for regexp patterns, the ugly string map is used.
* Some code have the fixed numbers like 39 and 45, and those are
replaced with the $hashlength and the offset correction.
* $nullid and $nullid2 are generated for the hash length.
A caveat is that repository picker dialog is performed before
evaluating the repo type, hence $hashlength isn't set there yet.
So the code dealing with the hard-coded "40" are handled differently;
namely, the regexp range is expanded, and the null id is generated
from the HEAD id length locally.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Documentation updates for "git send-email".
* ag/doc-send-email:
docs: mention possible options for Proton Mail users
docs: add a paragraph explaining the `sendmailCmd` option of sendemail
docs: add an OAuth2.0 credential helper for AOL accounts
docs: add outlookidfix config option to sendemail documentation
docs: link OpenSSL's verify(1) manual page to know about -CAfile and -CApath options
Define .precision to more canned parse-options type to avoid bugs
coming from using a variable with a wrong type to capture the
parsed values.
* rs/parse-options-precision:
parse-options: add precision handling for OPTION_COUNTUP
parse-options: add precision handling for OPTION_BITOP
parse-options: add precision handling for OPTION_NEGBIT
parse-options: add precision handling for OPTION_BIT
parse-options: add precision handling for OPTION_SET_INT
parse-options: add precision handling for PARSE_OPT_CMDMODE
parse-options: require PARSE_OPT_NOARG for OPTION_BITOP
When a ref creation at refs/heads/foo/bar fails, the files backend
now removes refs/heads/foo/ if the directory is otherwise not used.
* ps/refs-files-remove-empty-parent:
refs/files: remove empty parent dirs when ref creation fails
"git fetch --prune" used to be O(n^2) expensive when there are many
refs, which has been corrected.
* ph/fetch-prune-optim:
clean up interface for refs_warn_dangling_symrefs
refs: remove old refs_warn_dangling_symref
fetch-prune: optimize dangling-ref reporting
Both $nullid and $null_sha1 point to the same content.
Use only $nullid consistently.
This is a preliminary cleanup for adding the support of SHA256 repo.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
gitk includes code specifically for Tcl 8.4 and 8.5, but the requirement
is now for at least 8.6. Remove the now unusable code targeting earlier
Tcl.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
gitk runs under wish so naturally has Tcl and Tk available and of the
same version. gitk sets a requirement on Tk version >= 8.4: this is very
outdated, and the earliest Tcl currently shipping on any supported OS is
8.6. As 8.7 is in alpha test and is generally compatible with 8.6, we
should allow 8.7. Tcl 9.0 has planned compatibility breaking changes so
is not yet supported.
Let's change the requirements to 8.6-8.7, but not 9.0. Place this at the
top of file so the requirements are obvious.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
gitk has a few code fragments that are used only for git versions <=
1.7.2 that do not support submodules, notes, word differences, or
textconv filters. We just set the minimum git version higher than 1.7.2
so these code fragments have no effect. Delete them.
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
gitk has alternate code paths for early git up to 1.72, and has no
defined minimum version. Setting any version > 1.72 as minimum will
allow removing those code paths.
The recent set of advisories published for git, gitk, and git-gui add
updates for v2.43 and later, but Debian has buster withgit 2.20 still
available. While Debian would be responsible for backporting any fixes
to such an early version, we have no good reason preclude it.
So, make 2.20 the minimum required git version.
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
If conflict comments already use a comment character that isn't "#", and
core.commentChar is set "auto", Git will ignore these lines during the
scan using ignored_log_message_bytes() and pick a new comment character
based on the rest of the message. The newly chosen character may be
different from the one used in the conflict comments and therefore,
these are no longer treated as comments and end up in the final commit
message.
For example, during a rebase if the user previously set
core.commentChar=% and then encounters a conflict, conflict comments
like "% Conflicts:" are generated. If the user subsequently sets
core.commentChar=auto before running `rebase --continue`, Git parses the
"auto" setting and begins scanning. It first uses the existing
'comment_line_str' (which is '%') to detect and ignore conflict comments
via ignored_log_message_bytes().
Then, Git scans the rest of the message (excluding conflict comments),
sees that none of the remaining lines start with '#' and decides to set
comment_line_str to '#'. Since the final commit character differs from
the one used in the conflict comments, those lines are no longer
considered comments and get included in the final commit message.
Set 'comment_line_str' to '#' when core.commentChar is set to 'auto' to
reset any previously set value.
While this does not solve the issue of conflict comment inclusion and
the user visible behaviour stays tha same, it standardizes the behaviour
of the code by always resetting 'comment_line_str' to '#' when
core.commentChar=auto is parsed.
The patch text is based on Phillip Wood's message:
https://lore.kernel.org/git/9e96aaab-79a2-4632-94cd-d016d4a63b30@gmail.com/
and the commit log message is wriiten by me.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When core.commentChar is set to "auto", Git selects a comment character
by scanning the commit message contents and avoiding any character
already present in the message.
If the message still contains old conflict comments (starting with a
comment character), Git assumes that character is in use and chooses a
different one. As a result, those existing comment lines are no longer
recognized as comments and end up being included in the final commit
message.
To avoid this, skip scanning the trailing comment block when selecting
the comment character. This allows Git to safely reuse the original
character when appropriate, keeping the commit message clean and free of
leftover conflict information.
Background:
The "auto" value for core.commentchar was introduced in the commit
84c9dc2c5a (commit: allow core.commentChar=auto for character auto
selection, 2014-05-17) but did not exhibit this issue at that time.
The bug was introduced in commit a6c2654f83 (rebase -m: fix --signoff
with conflicts, 2024-04-18) where Git started writing conflict comments
to the file at 'rebase_path_message()'.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the string predicates defined in git-compat-util.h all
return bool let's convert the return type of the string predicates
in strbuf.{c,h} to match them.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 8277dbe987 (git-compat-util: convert skip_{prefix,suffix}{,_mem}
to bool, 2023-12-16) a number of our string predicates have been
returning bool instead of int. Now that we've declared that experiment
a success, let's convert the return type of the case-independent
skip_iprefix() and skip_iprefix_mem() functions to match the return
type of their case-dependent equivalents. Returning bool instead of
int makes it clear that these functions are predicates.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have had a test balloon for C99's bool type since 8277dbe987
(git-compat-util: convert skip_{prefix,suffix}{,_mem} to bool,
2023-12-16). As we've had it over 18 months without any complaints
let's declare it a success.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our submission guidelines require people to use their real name, but
this is not always suitable for various reasons.
For people who are transgender or non-binary and are transitioning or
who think they might want to transition, it can be a major obstacle and
cause major discomfort to require the use of their real name. This is
made worse by the fact that Git provides no way to change names built
into history, so the use of a deadname is forever. Our code of conduct
states that we "pledge to act and interact in ways that contribute to an
open, welcoming, diverse, inclusive, and healthy community," and
changing this policy is one way we can improve things for contributors.
In addition, there are some developers who are so widely known
pseudonymously that they have a Wikipedia page with their handle and no
real name. It would seem silly to reject patches from people who are
known and respected in their open-source community just because they
don't wish to share a real name.
There are also other good reasons why people might operate
pseudonymously: because they or their family members are well known and
they wish to protect their privacy, because of current or past
harassment or retaliation or fear of that happening in the future, or
because of concerns about unwanted attention from government officials
or other authority figures. As much as possible, we want to welcome
contributions from anyone who is willing to participate positively in
our community without having them worry about their safety or privacy.
In all of these cases, we should allow people to proceed using a
preferred name or pseudonymously if, in their best judgment, that's the
right thing to do. State that it is common to use a real name but
explicitly mention that contributors who are not comfortable doing so or
prefer to operate pseudonymously or under a preferred name can proceed
otherwise, provided the name is distinctive, identifying, and not
misleading. For instance, using U+2060 (WORD JOINER) as one's ID would
likely be distinctive but not identifying, since most people would have
trouble reading it due to its zero-width nature.
We prohibit identities which are misleading, since our goal is to create
a community which works together with a common goal, and misleading or
deceiving others is not conducive to good community or compatible with
our code of conduct, nor is it compatible with making a legal assertion
about the provenance of one's code.
Explicitly prohibit anonymous contributions to ensure that we have some
line of provenance to a known (if pseudonymous) author who might be able
to respond to questions about it. Explain that this is the reason we
have this policy to help contributors understand the rationale better.
Use "some form of your real name" since some current contributors use
shortened forms of their name or use initials, which have always been
considered acceptable. This helps guide people who would be fine using
their real name but have misconfigured `user.name` thinking it is
intended to be a username or is used for authentication (despite our
documentation to the contrary), but also allows for a variety of
circumstances where the contributor would feel more comfortable not
doing so.
Note that this policy is the same as that of the Linux kernel[0] and the
CNCF[1], as well as many smaller projects. The Linux kernel patch was
Acked-by one of the Linux Foundation's lawyers, Michael Dolan, so it
appears these changes have had legal review.
Additionally, retain the section header ID for ease of linking across
versions.
[0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d4563201f33a022fc0353033d9dfeb1606a88330
[1] https://github.com/cncf/foundation/blob/659fd32c86dc/dco-guidelines.md
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit bf5ce434db ("l10n: Add full Irish translation (ga.po)", 2025-05-16)
added a new translation to git. In a make build, new 'po' files (ga.po
in this case) are added to the build automatically using a wildcard
pattern. In a meson build you have to add the language code ('ga') to a
list explicitly to have it included in the build. In order to include the
new translation in the meson build, add the 'ga' language code to the
list of translations in the 'po/meson.build' file.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit 837f637cf5 ("meson.build: correct setting of GIT_EXEC_PATH",
2025-05-19) corrected the GIT_EXEC_PATH build setting, but then forgot
to update the installation path for the library executables. This causes
a regression when attempting to execute commands, after installing to a
non-standard location (reported here[1]):
$ meson -Dprefix=/tmp/git -Dlibexecdir=libexec-different build
$ meson install
$ /tmp/git/bin/git --exec-path
/tmp/git/libexec-different
$ /tmp/git/bin/git daemon
git: 'daemon' is not a git command. See 'git --help'
In order to fix the issue, use the 'git_exec_path' variable (calculated
while processing -Dlibexecdir) as the 'install_dir' field during the
installation of the library executables.
[1]: <66fd343a-1351-4350-83eb-c797e47b7693@gmail.com>
Reported-by: irecca.kun@gmail.com
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leakfix with a new and a bit invasive test.
* ly/load-bitmap-leakfix:
pack-bitmap: add load corrupt bitmap test
pack-bitmap: reword comments in test_bitmap_commits()
pack-bitmap: fix memory leak if load_bitmap() failed
Code clean-up around object access API.
* ps/object-store:
odb: rename `read_object_with_reference()`
odb: rename `pretend_object_file()`
odb: rename `has_object()`
odb: rename `repo_read_object_file()`
odb: rename `oid_object_info()`
odb: trivial refactorings to get rid of `the_repository`
odb: get rid of `the_repository` when handling submodule sources
odb: get rid of `the_repository` when handling the primary source
odb: get rid of `the_repository` in `for_each()` functions
odb: get rid of `the_repository` when handling alternates
odb: get rid of `the_repository` in `odb_mkstemp()`
odb: get rid of `the_repository` in `assert_oid_type()`
odb: get rid of `the_repository` in `find_odb()`
odb: introduce parent pointers
object-store: rename files to "odb.{c,h}"
object-store: rename `object_directory` to `odb_source`
object-store: rename `raw_object_store` to `object_database`
The compiler is in general able to recognize the endian shift and
replace it with an optimized opcode if possible. On certain
architectures such as RiscV or MIPS the situation can get complicated.
They don't provide an optimized opcode and masking the "higher" bits may
required loading a constant which needs shifting. This causes the
compiler to emit a lot of instructions for the operation.
The provided builtin directive on these architecture calls a function
which does the operation instead of emitting the code for operation.
Bring back the change from commit 6547d1c9 (bswap.h: add support for
built-in bswap functions, 2025-04-23). The bswap32/64 macro can now be
defined unconditionally so it won't regress on big endian architectures.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On x86 the bswap32/64 macro is implemented based on the x86 opcode which
performs the required shifting in just one opcode.
The other CPUs fallback to the generic shifting as implemented by
default_swab32() and default_bswap64() if needed.
I've been looking at how good a compiler is at recognizing the default
shift and emitting an optimized operation:
- x86, arm64 msvc v19.20
default_swab32() optimized
default_bswap64() shifts
_byteswap_uint64() optimized
- x86, arm64 msvc v19.37
default_swab32() optimized
default_bswap64() optimized
_byteswap_uint64() optimized
- arm64, gcc-4.9.4: optimized
- x86-64, gcc-4.4.7: shifts
- x86-64, gcc-4.5.3: optimized
- x86-64, clang-3.0: optimized
Given that gcc-4.5 and clang-3.0 are fairly old, any recent compiler
should recognize the shift.
Remove the optimized x86 version and rely on the compiler.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ntohl and htonl macros are redefined because the provided macros were
not always optimal. Sometimes it was a function call, sometimes it was a
macro which did the shifting. Using the 'bswap' opcode on x86 provides
probably better performance than performing the shifting.
These macros are only overwritten on x86 if the "optimized" version is
available.
The ntohll and htonll macros are not available on every platform (at
least glibc does not provide them) which means they need to be defined
once the endianness of the system is determined.
In order to get a more symmetrical setup, redfine the macros once the
endianness of the system has been determined.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Microsoft Visual C++ (MSVC) compiler (as of Visual Studio 2022
version 17.13.6) does not define __BYTE_ORDER__ and its C-library does
not define __BYTE_ORDER. The compiler is supported only on arm64 and x86
which are all little endian.
Define GIT_BYTE_ORDER on msvc as little endian to avoid further checks.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The __BYTE_ORDER__ define is provided by gcc (since ~v4.6), clang
(since ~v3.2) and icc (since ~16.0.3).
The __BYTE_ORDER and BYTE_ORDER macros are libc specific and are not
available on all supported platforms such as mingw.
Add support for the __BYTE_ORDER__ macro as a fallback.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
$GIT_TEST_INSTALLED can be set to use an "installed" git instead of the
one from $GIT_BUILD_DIR. This is used by my company's internal test
infrastructure, and not using $GIT_TEST_INSTALLED when querying the
default hash meant that the tests were failing because the hash was
effectively set to the empty string (since git didn't execute).
In the two places we attempt to detect/execute git itself prior to
overriding everything and putting it in $PATH, use identical logic for
identifying the git binary to execute. This also has the effect of
including the $X suffix when querying the default hash, but that's not
strictly necessary. You don't need to specify .exe when running a binary
on Windows, just when testing whether it exists or not.
Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As well as receiving the config key and value, config callbacks
also receive a "struct key_value_info" containing information about
the source of the key-value pair. Accessing the "path" field of
this struct from a callback passed to repo_config() results in a
use-after-free. This happens because repo_config() first populates a
configset by calling config_with_options() and then iterates over the
configset with the callback passed by the caller. When the configset
is constructed it takes a shallow copy of the "struct key_value_info"
for each config setting. This leads to the use-after-free as the
"path" member is freed before config_with_options() returns.
We could fix this by interning the "path" field as we do
for the "filename" field but the "path" field is not actually
needed. It is populated with a copy of the "path" field from "struct
config_source". That field was added in d14d42440d (config: disallow
relative include paths from blobs, 2014-02-19) to distinguish between
relative include directives in files and those in blobs. However,
since 1b8132d99d (i18n: config: unfold error messages marked for
translation, 2016-07-28) we can differentiate these by looking at the
"origin_type" field in "struct key_value_info". So let's remove the
"path" members from "struct config_source" and "struct key_value_info"
and instead use a combination of the "filename" and "origin_type"
fields to determine the absolute path of relative includes.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commits we have migrated all users of the linked list
of multi-pack indices to instead use those stored in the object database
sources. Remove those now-unused pointers.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor `get_all_packs()` so that we stop using the linked list of
multi-pack indices. Note that there is no need to explicitly prepare
alternates, and neither do we have to use `get_multi_pack_index()`,
because `prepare_packed_git()` already takes care of populating all data
structures for us.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor `find_pack_entry()` so that we stop using the linked list of
multi-pack indices. Note that there is no need to explicitly prepare
alternates, and neither do we have to use `get_multi_pack_index()`,
because `prepare_packed_git()` already takes care of populating all data
structures for us.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `get_multi_pack_index()` loads multi-pack indices via
`prepare_packed_git()` and then returns the linked list of multi-pack
indices that is stored in `struct object_database`. That list is in the
process of being removed though in favor of storing the MIDX as part of
the object database source it belongs to.
Refactor `get_multi_pack_index()` so that it returns the multi-pack
index for a single object source. Callers are now expected to call this
function for each source they are interested in. This requires them to
iterate through alternates, so we have to prepare alternate object
sources before doing so.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling `close_midx()` we not only close the multi-pack index for
one object source, but instead we iterate through the whole linked list
of MIDXs to close all of them. This linked list is about to go away in
favor of using the new per-source pointer to its respective MIDX.
Refactor the function to iterate through sources instead.
Note that after this patch, there's a couple of callsites left that
continue to use `close_midx()` without iterating through all sources.
These are all cases where we don't care about the MIDX from other
sources though, so it's fine to keep them as-is.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commit we refactored how we load multi-pack indices to
take a corresponding "source" as input. As part of this refactoring we
started to store a pointer to the MIDX in `struct odb_source` itself.
Refactor loading of packfiles in the same way: instead of passing in the
object directory, we now pass in the source from which we want to load
packfiles. This allows us to simplify the code because we don't have to
search for a corresponding MIDX anymore, but we can instead directly use
the MIDX that we have already prepared beforehand.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Multi-pack indices are tracked via `struct multi_pack_index`. This data
structure is stored as a linked list inside `struct object_database`,
which is the global database that spans across all of the object
sources.
This layout causes two problems:
- Object databases consist of multiple object sources (e.g. one source
per alternate object directory), where each multi-pack index is
specific to one of those sources. Regardless of that though, the
MIDX is not tracked per source, but tracked globally for the whole
object database. This creates a mismatch between the on-disk layout
and how things are organized in the object database subsystems and
makes some parts, like figuring out whether a source has an MIDX,
quite awkward.
- Multi-pack indices are an implementation detail of how efficient
access for packfiles work. As such, they are neither relevant in the
context of loose objects, nor in a potential future where we have
pluggable backends.
Refactor `prepare_multi_pack_index_one()` so that it works on a specific
source, which allows us to easily store a pointer to the multi-pack
index inside of it. For now, this pointer exists next to the existing
linked list we have in the object database. Users will be adjusted in
subsequent patches to instead use the per-source pointers.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tb/midx-avoid-cruft-packs:
repack: exclude cruft pack(s) from the MIDX where possible
pack-objects: introduce '--stdin-packs=follow'
pack-objects: swap 'show_{object,commit}_pack_hint'
pack-objects: fix typo in 'show_object_pack_hint()'
pack-objects: perform name-hash traversal for unpacked objects
pack-objects: declare 'rev_info' for '--stdin-packs' earlier
pack-objects: factor out handling '--stdin-packs'
pack-objects: limit scope in 'add_object_entry_from_pack()'
pack-objects: use standard option incompatibility functions
The `git-for-each-ref(1)` command is used to iterate over references
present in a repository. In large repositories with millions of
references, it would be optimal to paginate this output such that we
can start iteration from a given reference. This would avoid having to
iterate over all references from the beginning each time when paginating
through results.
The previous commit added 'seek' functionality to the reference
backends. Utilize this and expose a '--start-after' option in
'git-for-each-ref(1)'. When used, the reference iteration seeks to the
lexicographically next reference and iterates from there onward.
This enables efficient pagination workflows, where the calling script
can remember the last provided reference and use that as the starting
point for the next set of references:
git for-each-ref --count=100
git for-each-ref --count=100 --start-after=refs/heads/branch-100
git for-each-ref --count=100 --start-after=refs/heads/branch-200
Since the reference iterators only allow seeking to a specified marker
via the `ref_iterator_seek()`, we introduce a helper function
`start_ref_iterator_after()`, which seeks to next reference by simply
adding (char) 1 to the marker.
We must note that pagination always continues from the provided marker,
as such any concurrent reference updates lexicographically behind the
marker will not be output. Document the same.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 'ref-filter.c', there is an 'else' clause within `do_filter_refs()`.
This is unnecessary since the 'if' clause calls `die()`, which would
exit the program. So let's remove the unnecessary 'else' clause. This
improves readability since the indentation is also reduced and flow is
simpler.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ref iterator exposes a `ref_iterator_seek()` function. The name
suggests that this would seek the iterator to a specific reference in
some ways similar to how `fseek()` works for the filesystem.
However, the function actually sets the prefix for refs iteration. So
further iteration would only yield references which match the particular
prefix. This is a bit confusing.
Let's add a 'flags' field to the function, which when set with the
'REF_ITERATOR_SEEK_SET_PREFIX' flag, will set the prefix for the
iteration in-line with the existing behavior. Otherwise, the reference
backends will simply seek to the specified reference and clears any
previously set prefix. This allows users to start iteration from a
specific reference.
In the packed and reftable backend, since references are available in a
sorted list, the changes are simply setting the prefix if needed. The
changes on the files-backend are a little more involved, since the files
backend uses the 'ref-cache' mechanism. We move out the existing logic
within `cache_ref_iterator_seek()` to `cache_ref_iterator_set_prefix()`
which is called when the 'REF_ITERATOR_SEEK_SET_PREFIX' flag is set. We
then parse the provided seek string and set the required levels and
their indexes to ensure that seeking is possible.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'find_ref_entry' function is no longer used, so remove it.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `ref_iterator` is an internal structure to the 'refs/'
sub-directory, which allows iteration over refs. All reference iteration
is built on top of these iterators.
External clients of the 'refs' subsystem use the various
'refs_for_each...()' functions to iterate over refs. However since these
are wrapper functions, each combination of functionality requires a new
wrapper function. This is not feasible as the functions pile up with the
increase in requirements. Expose the internal reference iterator, so
advanced users can mix and match options as needed.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
External tools such as Jujutsu may add many references that are of no
interest to the user. This preference allows hiding them.
Signed-off-by: Ori Avtalion <ori@avtalion.name>