* maint-2.42:
Git 2.42.4
Git 2.41.3
Git 2.40.4
credential: disallow Carriage Returns in the protocol by default
credential: sanitize the user prompt
credential_format(): also encode <host>[:<port>]
t7300: work around platform-specific behaviour with long paths on MinGW
compat/regex: fix argument order to calloc(3)
mingw: drop bogus (and unneeded) declaration of `_pgmptr`
ci: remove 'Upload failed tests' directories' step from linux32 jobs
* maint-2.41:
Git 2.41.3
Git 2.40.4
credential: disallow Carriage Returns in the protocol by default
credential: sanitize the user prompt
credential_format(): also encode <host>[:<port>]
t7300: work around platform-specific behaviour with long paths on MinGW
compat/regex: fix argument order to calloc(3)
mingw: drop bogus (and unneeded) declaration of `_pgmptr`
ci: remove 'Upload failed tests' directories' step from linux32 jobs
* maint-2.40:
Git 2.40.4
credential: disallow Carriage Returns in the protocol by default
credential: sanitize the user prompt
credential_format(): also encode <host>[:<port>]
t7300: work around platform-specific behaviour with long paths on MinGW
compat/regex: fix argument order to calloc(3)
mingw: drop bogus (and unneeded) declaration of `_pgmptr`
ci: remove 'Upload failed tests' directories' step from linux32 jobs
While Git has documented that the credential protocol is line-based,
with newlines as terminators, the exact shape of a newline has not been
documented.
From Git's perspective, which is firmly rooted in the Linux ecosystem,
it is clear that "a newline" means a Line Feed character.
However, even Git's credential protocol respects Windows line endings
(a Carriage Return character followed by a Line Feed character, "CR/LF")
by virtue of using `strbuf_getline()`.
There is a third category of line endings that has been used originally
by MacOS, and that is respected by the default line readers of .NET and
node.js: bare Carriage Returns.
Git cannot handle those, and what is worse: Git's remedy against
CVE-2020-5260 does not catch when credential helpers are used that
interpret bare Carriage Returns as newlines.
Git Credential Manager addressed this as CVE-2024-50338, but other
credential helpers may still be vulnerable. So let's not only disallow
Line Feed characters as part of the values in the credential protocol,
but also disallow Carriage Return characters.
In the unlikely event that a credential helper relies on Carriage
Returns in the protocol, introduce an escape hatch via the
`credential.protectProtocol` config setting.
This addresses CVE-2024-52006.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When asking the user interactively for credentials, we want to avoid
misleading them e.g. via control sequences that pretend that the URL
targets a trusted host when it does not.
While Git learned, over the course of the preceding commits, to disallow
URLs containing URL-encoded control characters by default, credential
helpers are still allowed to specify values very freely (apart from Line
Feed and NUL characters, anything is allowed), and this would allow,
say, a username containing control characters to be specified that would
then be displayed in the interactive terminal prompt asking the user for
the password, potentially sending those control characters directly to
the terminal. This is undesirable because control characters can be used
to mislead users to divulge secret information to untrusted sites.
To prevent such an attack vector, let's add a `git_prompt()` that forces
the displayed text to be sanitized, i.e. displaying question marks
instead of control characters.
Note: While this commit's diff changes a lot of `user@host` strings to
`user%40host`, which may look suspicious on the surface, there is a good
reason for that: this string specifies a user name, not a
<username>@<hostname> combination! In the context of t5541, the actual
combination looks like this: `user%40@127.0.0.1:5541`. Therefore, these
string replacements document a net improvement introduced by this
commit, as `user@host@127.0.0.1` could have left readers wondering where
the user name ends and where the host name begins.
Hinted-at-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Doc updates to clarify what an "unborn branch" means.
* jc/orphan-unborn:
orphan/unborn: fix use of 'orphan' in end-user facing messages
orphan/unborn: add to the glossary and use them consistently
Since commit 62db5247 (rebase -i: generate the script via
rebase--helper, 2017-07-14), the short hash is given in
rebase-todo. Specifying rebase.instructionFormat does not alter this
behavior, contrary to what the documentation implies.
Signed-off-by: Maarten van der Schrieck <maarten@thingsconnected.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no 'ref/notes/' hierarchy. '[format] notes = foo' uses notes
that are found in 'refs/notes/foo'.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To orphan is a verb that denotes the act of getting on an unborn
branch, and a few references to "orphan branch" in our documentation
are misuses of the word. They caused end-user confusion, which was
made even worse because we did not have the term defined in the
glossary document. Add entries for "unborn" branch and "orphan"
operation to the glossary, and adjust existing documentation
accordingly.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Overly long label names used in the sequencer machinery are now
chopped to fit under filesystem limitation.
* mp/rebase-label-length-limit:
rebase: allow overriding the maximal length of the generated labels
sequencer: truncate labels to accommodate loose refs
The attribute subsystem learned to honor `attr.tree` configuration
that specifies which tree to read the .gitattributes files from.
* jc/attr-tree-config:
attr: add attr.tree for setting the treeish to read attributes from
attr: read attributes from HEAD when bare repo
Many typos, ungrammatical sentences and wrong phrasing have been
fixed.
* sn/typo-grammo-phraso-fixes:
t/README: fix multi-prerequisite example
doc/gitk: s/sticked/stuck/
git-jump: admit to passing merge mode args to ls-files
doc/diff-options: improve wording of the log.diffMerges mention
doc: fix some typos, grammar and wording issues
"git repack" learned "--max-cruft-size" to prevent cruft packs from
growing without bounds.
* tb/repack-max-cruft-size:
repack: free existing_cruft array after use
builtin/repack.c: avoid making cruft packs preferred
builtin/repack.c: implement support for `--max-cruft-size`
builtin/repack.c: parse `--max-pack-size` with OPT_MAGNITUDE
t7700: split cruft-related tests to t7704
44451a2 (attr: teach "--attr-source=<tree>" global option to "git",
2023-05-06) provided the ability to pass in a treeish as the attr
source. In the context of serving Git repositories as bare repos like we
do at GitLab however, it would be easier to point --attr-source to HEAD
for all commands by setting it once.
Add a new config attr.tree that allows this.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cruft packs are an alternative mechanism for storing a collection of
unreachable objects whose mtimes are recent enough to avoid being
pruned out of the repository.
When cruft packs were first introduced back in b757353676
(builtin/pack-objects.c: --cruft without expiration, 2022-05-20) and
a7d493833f (builtin/pack-objects.c: --cruft with expiration,
2022-05-20), the recommended workflow consisted of:
- Repacking periodically, either by packing anything loose in the
repository (via `git repack -d`) or producing a geometric sequence
of packs (via `git repack --geometric=<d> -d`).
- Every so often, splitting the repository into two packs, one cruft
to store the unreachable objects, and another non-cruft pack to
store the reachable objects.
Repositories may (out of band with the above) choose periodically to
prune out some unreachable objects which have aged out of the grace
period by generating a pack with `--cruft-expiration=<approxidate>`.
This allowed repositories to maintain relatively few packs on average,
and quarantine unreachable objects together in a cruft pack, avoiding
the pitfalls of holding unreachable objects as loose while they age out
(for more, see some of the details in 3d89a8c118
(Documentation/technical: add cruft-packs.txt, 2022-05-20)).
This all works, but can be costly from an I/O-perspective when
frequently repacking a repository that has many unreachable objects.
This problem is exacerbated when those unreachable objects are rarely
(if every) pruned.
Since there is at most one cruft pack in the above scheme, each time we
update the cruft pack it must be rewritten from scratch. Because much of
the pack is reused, this is a relatively inexpensive operation from a
CPU-perspective, but is very costly in terms of I/O since we end up
rewriting basically the same pack (plus any new unreachable objects that
have entered the repository since the last time a cruft pack was
generated).
At the time, we decided against implementing more robust support for
multiple cruft packs. This patch implements that support which we were
lacking.
Introduce a new option `--max-cruft-size` which allows repositories to
accumulate cruft packs up to a given size, after which point a new
generation of cruft packs can accumulate until it reaches the maximum
size, and so on. To generate a new cruft pack, the process works like
so:
- Sort a list of any existing cruft packs in ascending order of pack
size.
- Starting from the beginning of the list, group cruft packs together
while the accumulated size is smaller than the maximum specified
pack size.
- Combine the objects in these cruft packs together into a new cruft
pack, along with any other unreachable objects which have since
entered the repository.
Once a cruft pack grows beyond the size specified via `--max-cruft-size`
the pack is effectively frozen. This limits the I/O churn up to a
quadratic function of the value specified by the `--max-cruft-size`
option, instead of behaving quadratically in the number of total
unreachable objects.
When pruning unreachable objects, we bypass the new code paths which
combine small cruft packs together, and instead start from scratch,
passing in the appropriate `--max-pack-size` down to `pack-objects`,
putting it in charge of keeping the resulting set of cruft packs sized
correctly.
This may seem like further I/O churn, but in practice it isn't so bad.
We could prune old cruft packs for whom all or most objects are removed,
and then generate a new cruft pack with just the remaining set of
objects. But this additional complexity buys us relatively little,
because most objects end up being pruned anyway, so the I/O churn is
well contained.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous commit implemented the `gc.repackFilter` config option
to specify a filter that should be used by `git gc` when
performing repacks.
Another previous commit has implemented
`git repack --filter-to=<dir>` to specify the location of the
packfile containing filtered out objects when using a filter.
Let's implement the `gc.repackFilterTo` config option to specify
that location in the config when `gc.repackFilter` is used.
Now when `git gc` will perform a repack with a <dir> configured
through this option and not empty, the repack process will be
passed a corresponding `--filter-to=<dir>` argument.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous commit has implemented `git repack --filter=<filter-spec>` to
allow users to filter out some objects from the main pack and move them
into a new different pack.
Users might want to perform such a cleanup regularly at the same time as
they perform other repacks and cleanups, so as part of `git gc`.
Let's allow them to configure a <filter-spec> for that purpose using a
new gc.repackFilter config option.
Now when `git gc` will perform a repack with a <filter-spec> configured
through this option and not empty, the repack process will be passed a
corresponding `--filter=<filter-spec>` argument.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new configuration option diff.statNameWidth=<width> that is equivalent
to the command-line option --stat-name-width=<width>, but it is ignored
by format-patch. This follows the logic established by the already
existing configuration option diff.statGraphWidth=<width>.
Limiting the widths of names and graphs in the --stat output makes sense
for interactive work on wide terminals with many columns, hence the support
for these configuration options. They don't affect format-patch because
it already adheres to the traditional 80-column standard.
Update the documentation and add more tests to cover new configuration
option diff.statNameWidth=<width>. While there, perform a few minor code
and whitespace cleanups here and there, as spotted.
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We now limit depth of the tree objects and maximum length of
pathnames recorded in tree objects.
* jk/tree-name-and-depth-limit:
lower core.maxTreeDepth default to 2048
tree-diff: respect max_allowed_tree_depth
list-objects: respect max_allowed_tree_depth
read_tree(): respect max_allowed_tree_depth
traverse_trees(): respect max_allowed_tree_depth
add core.maxTreeDepth config
fsck: detect very large tree pathnames
tree-walk: rename "error" variable
tree-walk: drop MAX_TRAVERSE_TREES macro
tree-walk: reduce stack size for recursive functions
Most of our tree traversal algorithms use recursion to visit sub-trees.
For pathologically large trees, this can cause us to run out of stack
space and abort in an uncontrolled way. Let's put our own limit here so
that we can fail gracefully rather than segfaulting.
In similar cases where we recursed along the commit graph, we rewrote
the algorithms to avoid recursion and keep any stack data on the heap.
But the commit graph is meant to grow without bound, whereas it's not an
imposition to put a limit on the maximum size of tree we'll handle.
And this has a bonus side effect: coupled with a limit on individual
tree entry names, this limits the total size of a path we may encounter.
This gives us an extra protection against code handling long path names
which may suffer from integer overflows in the size (which could then be
exploited by malicious trees).
The default of 4096 is set to be much longer than anybody would care
about in the real world. Even with single-letter interior tree names
(like "a/b/c"), such a path is at least 8191 bytes. While most operating
systems will let you create such a path incrementally, trying to
reference the whole thing in a system call (as Git would do when
actually trying to access it) will result in ENAMETOOLONG. Coupled with
the recent fsck.largePathname warning, the maximum total pathname Git
will handle is (by default) 16MB.
This config option doesn't do anything yet; future patches will convert
various algorithms to respect the limit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Overly long label names used in the sequencer machinery are now
chopped to fit under filesystem limitation.
* mp/rebase-label-length-limit:
rebase: allow overriding the maximal length of the generated labels
sequencer: truncate labels to accommodate loose refs
With this change, users can override the compiled-in default for the
maximal length of the label names generated by `git rebase
--rebase-merges`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Mark Ruvald Pedersen <mped@demant.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git pack-objects" learned to invoke a new hook program that
enumerates extra objects to be used as anchoring points to keep
otherwise unreachable objects in cruft packs.
* tb/gc-recent-object-hook:
gc: introduce `gc.recentObjectsHook`
reachable.c: extract `obj_is_recent()`
The object traversal using reachability bitmap done by
"pack-object" has been tweaked to take advantage of the fact that
using "boundary" commits as representative of all the uninteresting
ones can save quite a lot of object enumeration.
* tb/pack-bitmap-traversal-with-boundary:
pack-bitmap.c: use commit boundary during bitmap traversal
pack-bitmap.c: extract `fill_in_bitmap()`
object: add object_array initializer helper function
'git worktree add' learned how to create a worktree based on an
orphaned branch with `--orphan`.
* ja/worktree-orphan:
worktree add: emit warn when there is a bad HEAD
worktree add: extend DWIM to infer --orphan
worktree add: introduce "try --orphan" hint
worktree add: add --orphan flag
t2400: add tests to verify --quiet
t2400: refactor "worktree add" opt exclusion tests
t2400: cleanup created worktree in test
worktree add: include -B in usage docs
This patch introduces a new multi-valued configuration option,
`gc.recentObjectsHook` as a means to mark certain objects as recent (and
thus exempt from garbage collection), regardless of their age.
When performing a garbage collection operation on a repository with
unreachable objects, Git makes its decision on what to do with those
object(s) based on how recent the objects are or not. Generally speaking,
unreachable-but-recent objects stay in the repository, and older objects
are discarded.
However, we have no convenient way to keep certain precious, unreachable
objects around in the repository, even if they have aged out and would
be pruned. Our options today consist of:
- Point references at the reachability tips of any objects you
consider precious, which may be undesirable or infeasible if there
are many such objects.
- Track them via the reflog, which may be undesirable since the
reflog's lifetime is limited to that of the reference it's tracking
(and callers may want to keep those unreachable objects around for
longer).
- Extend the grace period, which may keep around other objects that
the caller *does* want to discard.
- Manually modify the mtimes of objects you want to keep. If those
objects are already loose, this is easy enough to do (you can just
enumerate and `touch -m` each one).
But if they are packed, you will either end up modifying the mtimes
of *all* objects in that pack, or be forced to write out a loose
copy of that object, both of which may be undesirable. Even worse,
if they are in a cruft pack, that requires modifying its `*.mtimes`
file by hand, since there is no exposed plumbing for this.
- Force the caller to construct the pack of objects they want
to keep themselves, and then mark the pack as kept by adding a
".keep" file. This works, but is burdensome for the caller, and
having extra packs is awkward as you roll forward your cruft pack.
This patch introduces a new option to the above list via the
`gc.recentObjectsHook` configuration, which allows the caller to
specify a program (or set of programs) whose output is treated as a set
of objects to treat as recent, regardless of their true age.
The implementation is straightforward. Git enumerates recent objects via
`add_unseen_recent_objects_to_traversal()`, which enumerates loose and
packed objects, and eventually calls add_recent_object() on any objects
for which `want_recent_object()`'s conditions are met.
This patch modifies the recency condition from simply "is the mtime of
this object more recent than the cutoff?" to "[...] or, is this object
mentioned by at least one `gc.recentObjectsHook`?".
Depending on whether or not we are generating a cruft pack, this allows
the caller to do one of two things:
- If generating a cruft pack, the caller is able to retain additional
objects via the cruft pack, even if they would have otherwise been
pruned due to their age.
- If not generating a cruft pack, the caller is likewise able to
retain additional objects as loose.
A potential alternative here is to introduce a new mode to alter the
contents of the reachable pack instead of the cruft one. One could
imagine a new option to `pack-objects`, say `--extra-reachable-tips`
that does the same thing as above, adding the visited set of objects
along the traversal to the pack.
But this has the unfortunate side-effect of altering the reachability
closure of that pack. If parts of the unreachable object graph mentioned
by one or more of the "extra reachable tips" programs is not closed,
then the resulting pack won't be either. This makes it impossible in the
general case to write out reachability bitmaps for that pack, since
closure is a requirement there.
Instead, keep these unreachable objects in the cruft pack (or set of
unreachable, loose objects) instead, to ensure that we can continue to
have a pack containing just reachable objects, which is always safe to
write a bitmap over.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>