Merge branch 'rj/doc-technical-fixes'
Documentation mark-up fixes.

* rj/doc-technical-fixes:
  doc: add large-object-promisors.adoc to the docs build
  doc: commit-graph.adoc: fix up some formatting
  doc: sparse-checkout.adoc: fix asciidoc warnings
  doc: remembering-renames.adoc: fix asciidoc warnings

main
commit
411903ce4c
|
|
@ -123,6 +123,7 @@ TECH_DOCS += technical/bundle-uri
|
||||||
TECH_DOCS += technical/commit-graph
|
TECH_DOCS += technical/commit-graph
|
||||||
TECH_DOCS += technical/directory-rename-detection
|
TECH_DOCS += technical/directory-rename-detection
|
||||||
TECH_DOCS += technical/hash-function-transition
|
TECH_DOCS += technical/hash-function-transition
|
||||||
|
TECH_DOCS += technical/large-object-promisors
|
||||||
TECH_DOCS += technical/long-running-process-protocol
|
TECH_DOCS += technical/long-running-process-protocol
|
||||||
TECH_DOCS += technical/multi-pack-index
|
TECH_DOCS += technical/multi-pack-index
|
||||||
TECH_DOCS += technical/packfile-uri
|
TECH_DOCS += technical/packfile-uri
|
||||||
|
|
|
||||||
|
|
@ -39,6 +39,7 @@ A consumer may load the following info for a commit from the graph:
|
||||||
Values 1-4 satisfy the requirements of parse_commit_gently().
|
Values 1-4 satisfy the requirements of parse_commit_gently().
|
||||||
|
|
||||||
There are two definitions of generation number:
|
There are two definitions of generation number:
|
||||||
|
|
||||||
1. Corrected committer dates (generation number v2)
|
1. Corrected committer dates (generation number v2)
|
||||||
2. Topological levels (generation number v1)
|
2. Topological levels (generation number v1)
|
||||||
|
|
||||||
|
|
@ -158,7 +159,8 @@ number of commits in the full history. By creating a "chain" of commit-graphs,
|
||||||
we enable fast writes of new commit data without rewriting the entire commit
|
we enable fast writes of new commit data without rewriting the entire commit
|
||||||
history -- at least, most of the time.
|
history -- at least, most of the time.
|
||||||
|
|
||||||
## File Layout
|
File Layout
|
||||||
|
~~~~~~~~~~~
|
||||||
|
|
||||||
A commit-graph chain uses multiple files, and we use a fixed naming convention
|
A commit-graph chain uses multiple files, and we use a fixed naming convention
|
||||||
to organize these files. Each commit-graph file has a name
|
to organize these files. Each commit-graph file has a name
|
||||||
|
|
@ -170,11 +172,11 @@ hashes for the files in order from "lowest" to "highest".
|
||||||
|
|
||||||
For example, if the `commit-graph-chain` file contains the lines
|
For example, if the `commit-graph-chain` file contains the lines
|
||||||
|
|
||||||
```
|
----
|
||||||
{hash0}
|
{hash0}
|
||||||
{hash1}
|
{hash1}
|
||||||
{hash2}
|
{hash2}
|
||||||
```
|
----
|
||||||
|
|
||||||
then the commit-graph chain looks like the following diagram:
|
then the commit-graph chain looks like the following diagram:
|
||||||
|
|
||||||
|
|
@ -213,7 +215,8 @@ specifying the hashes of all files in the lower layers. In the above example,
|
||||||
`graph-{hash1}.graph` contains `{hash0}` while `graph-{hash2}.graph` contains
|
`graph-{hash1}.graph` contains `{hash0}` while `graph-{hash2}.graph` contains
|
||||||
`{hash0}` and `{hash1}`.
|
`{hash0}` and `{hash1}`.
|
||||||
|
|
||||||
## Merging commit-graph files
|
Merging commit-graph files
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
If we only added a new commit-graph file on every write, we would run into a
|
If we only added a new commit-graph file on every write, we would run into a
|
||||||
linear search problem through many commit-graph files. Instead, we use a merge
|
linear search problem through many commit-graph files. Instead, we use a merge
|
||||||
|
|
@ -225,6 +228,7 @@ is determined by the merge strategy that the files should collapse to
|
||||||
the commits in `graph-{hash1}` should be combined into a new `graph-{hash3}`
|
the commits in `graph-{hash1}` should be combined into a new `graph-{hash3}`
|
||||||
file.
|
file.
|
||||||
|
|
||||||
|
....
|
||||||
+---------------------+
|
+---------------------+
|
||||||
| |
|
| |
|
||||||
| (new commits) |
|
| (new commits) |
|
||||||
|
|
@ -250,6 +254,7 @@ file.
|
||||||
| |
|
| |
|
||||||
| |
|
| |
|
||||||
+-----------------------+
|
+-----------------------+
|
||||||
|
....
|
||||||
|
|
||||||
During this process, the commits to write are combined, sorted and we write the
|
During this process, the commits to write are combined, sorted and we write the
|
||||||
contents to a temporary file, all while holding a `commit-graph-chain.lock`
|
contents to a temporary file, all while holding a `commit-graph-chain.lock`
|
||||||
|
|
@ -257,14 +262,15 @@ lock-file. When the file is flushed, we rename it to `graph-{hash3}`
|
||||||
according to the computed `{hash3}`. Finally, we write the new chain data to
|
according to the computed `{hash3}`. Finally, we write the new chain data to
|
||||||
`commit-graph-chain.lock`:
|
`commit-graph-chain.lock`:
|
||||||
|
|
||||||
```
|
----
|
||||||
{hash3}
|
{hash3}
|
||||||
{hash0}
|
{hash0}
|
||||||
```
|
----
|
||||||
|
|
||||||
We then close the lock-file.
|
We then close the lock-file.
|
||||||
|
|
||||||
## Merge Strategy
|
Merge Strategy
|
||||||
|
~~~~~~~~~~~~~~
|
||||||
|
|
||||||
When writing a set of commits that do not exist in the commit-graph stack of
|
When writing a set of commits that do not exist in the commit-graph stack of
|
||||||
height N, we default to creating a new file at level N + 1. We then decide to
|
height N, we default to creating a new file at level N + 1. We then decide to
|
||||||
|
|
@ -289,7 +295,8 @@ The merge strategy values (2 for the size multiple, 64,000 for the maximum
|
||||||
number of commits) could be extracted into config settings for full
|
number of commits) could be extracted into config settings for full
|
||||||
flexibility.
|
flexibility.
|
||||||
|
|
||||||
## Handling Mixed Generation Number Chains
|
Handling Mixed Generation Number Chains
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
With the introduction of generation number v2 and generation data chunk, the
|
With the introduction of generation number v2 and generation data chunk, the
|
||||||
following scenario is possible:
|
following scenario is possible:
|
||||||
|
|
@ -318,7 +325,8 @@ have corrected commit dates when written by compatible versions of Git. Thus,
|
||||||
rewriting split commit-graph as a single file (`--split=replace`) creates a
|
rewriting split commit-graph as a single file (`--split=replace`) creates a
|
||||||
single layer with corrected commit dates.
|
single layer with corrected commit dates.
|
||||||
|
|
||||||
## Deleting graph-{hash} files
|
Deleting graph-\{hash\} files
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
After a new tip file is written, some `graph-{hash}` files may no longer
|
After a new tip file is written, some `graph-{hash}` files may no longer
|
||||||
be part of a chain. It is important to remove these files from disk, eventually.
|
be part of a chain. It is important to remove these files from disk, eventually.
|
||||||
|
|
@ -333,7 +341,8 @@ files whose modified times are older than a given expiry window. This window
|
||||||
defaults to zero, but can be changed using command-line arguments or a config
|
defaults to zero, but can be changed using command-line arguments or a config
|
||||||
setting.
|
setting.
|
||||||
|
|
||||||
## Chains across multiple object directories
|
Chains across multiple object directories
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
In a repo with alternates, we look for the `commit-graph-chain` file starting
|
In a repo with alternates, we look for the `commit-graph-chain` file starting
|
||||||
in the local object directory and then in each alternate. The first file that
|
in the local object directory and then in each alternate. The first file that
|
||||||
|
|
|
||||||
|
|
@ -34,8 +34,8 @@ a new object representation for large blobs as discussed in:
|
||||||
|
|
||||||
https://lore.kernel.org/git/xmqqbkdometi.fsf@gitster.g/
|
https://lore.kernel.org/git/xmqqbkdometi.fsf@gitster.g/
|
||||||
|
|
||||||
0) Non goals
|
Non goals
|
||||||
------------
|
---------
|
||||||
|
|
||||||
- We will not discuss those client side improvements here, as they
|
- We will not discuss those client side improvements here, as they
|
||||||
would require changes in different parts of Git than this effort.
|
would require changes in different parts of Git than this effort.
|
||||||
|
|
@ -90,8 +90,8 @@ later in this document:
|
||||||
even more to host content with larger blobs or more large blobs
|
even more to host content with larger blobs or more large blobs
|
||||||
than currently.
|
than currently.
|
||||||
|
|
||||||
I) Issues with the current situation
|
I Issues with the current situation
|
||||||
------------------------------------
|
-----------------------------------
|
||||||
|
|
||||||
- Some statistics made on GitLab repos have shown that more than 75%
|
- Some statistics made on GitLab repos have shown that more than 75%
|
||||||
of the disk space is used by blobs that are larger than 1MB and
|
of the disk space is used by blobs that are larger than 1MB and
|
||||||
|
|
@ -138,8 +138,8 @@ I) Issues with the current situation
|
||||||
complaining that these tools require significant effort to set up,
|
complaining that these tools require significant effort to set up,
|
||||||
learn and use correctly.
|
learn and use correctly.
|
||||||
|
|
||||||
II) Main features of the "Large Object Promisors" solution
|
II Main features of the "Large Object Promisors" solution
|
||||||
----------------------------------------------------------
|
---------------------------------------------------------
|
||||||
|
|
||||||
The main features below should give a rough overview of how the
|
The main features below should give a rough overview of how the
|
||||||
solution may work. Details about needed elements can be found in
|
solution may work. Details about needed elements can be found in
|
||||||
|
|
@ -166,7 +166,7 @@ format. They should be used along with main remotes that contain the
|
||||||
other objects.
|
other objects.
|
||||||
|
|
||||||
Note 1
|
Note 1
|
||||||
++++++
|
^^^^^^
|
||||||
|
|
||||||
To clarify, a LOP is a normal promisor remote, except that:
|
To clarify, a LOP is a normal promisor remote, except that:
|
||||||
|
|
||||||
|
|
@ -178,7 +178,7 @@ To clarify, a LOP is a normal promisor remote, except that:
|
||||||
itself.
|
itself.
|
||||||
|
|
||||||
Note 2
|
Note 2
|
||||||
++++++
|
^^^^^^
|
||||||
|
|
||||||
Git already makes it possible for a main remote to also be a promisor
|
Git already makes it possible for a main remote to also be a promisor
|
||||||
remote storing both regular objects and large blobs for a client that
|
remote storing both regular objects and large blobs for a client that
|
||||||
|
|
@ -186,13 +186,13 @@ clones from it with a filter on blob size. But here we explicitly want
|
||||||
to avoid that.
|
to avoid that.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
LOPs aim to be good at handling large blobs while main remotes are
|
LOPs aim to be good at handling large blobs while main remotes are
|
||||||
already good at handling other objects.
|
already good at handling other objects.
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
++++++++++++++
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
Git already has support for multiple promisor remotes, see
|
Git already has support for multiple promisor remotes, see
|
||||||
link:partial-clone.html#using-many-promisor-remotes[the partial clone documentation].
|
link:partial-clone.html#using-many-promisor-remotes[the partial clone documentation].
|
||||||
|
|
@ -213,19 +213,19 @@ remote helper (see linkgit:gitremote-helpers[7]) which makes the
|
||||||
underlying object storage appear like a remote to Git.
|
underlying object storage appear like a remote to Git.
|
||||||
|
|
||||||
Note
|
Note
|
||||||
++++
|
^^^^
|
||||||
|
|
||||||
A LOP can be a promisor remote accessed using a remote helper by
|
A LOP can be a promisor remote accessed using a remote helper by
|
||||||
both some clients and the main remote.
|
both some clients and the main remote.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
This looks like the simplest way to create LOPs that can cheaply
|
This looks like the simplest way to create LOPs that can cheaply
|
||||||
handle many large blobs.
|
handle many large blobs.
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
++++++++++++++
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
Remote helpers are quite easy to write as shell scripts, but it might
|
Remote helpers are quite easy to write as shell scripts, but it might
|
||||||
be more efficient and maintainable to write them using other languages
|
be more efficient and maintainable to write them using other languages
|
||||||
|
|
@ -247,7 +247,7 @@ The underlying object storage that a LOP uses could also serve as
|
||||||
storage for large files handled by Git LFS.
|
storage for large files handled by Git LFS.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
This would simplify the server side if it wants to both use a LOP and
|
This would simplify the server side if it wants to both use a LOP and
|
||||||
act as a Git LFS server.
|
act as a Git LFS server.
|
||||||
|
|
@ -259,7 +259,7 @@ On the server side, a main remote should have a way to offload to a
|
||||||
LOP all its blobs with a size over a configurable threshold.
|
LOP all its blobs with a size over a configurable threshold.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
This makes it easy to set things up and to clean things up. For
|
This makes it easy to set things up and to clean things up. For
|
||||||
example, an admin could use this to manually convert a repo not using
|
example, an admin could use this to manually convert a repo not using
|
||||||
|
|
@ -268,7 +268,7 @@ some users would sometimes push large blobs, a cron job could use this
|
||||||
to regularly make sure the large blobs are moved to the LOP.
|
to regularly make sure the large blobs are moved to the LOP.
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
++++++++++++++
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
Using something based on `git repack --filter=...` to separate the
|
Using something based on `git repack --filter=...` to separate the
|
||||||
blobs we want to offload from the other Git objects could be a good
|
blobs we want to offload from the other Git objects could be a good
|
||||||
|
|
@ -284,13 +284,13 @@ should have ways to prevent oversize blobs to be fetched, and also
|
||||||
perhaps pushed, into it.
|
perhaps pushed, into it.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
A main remote containing many oversize blobs would defeat the purpose
|
A main remote containing many oversize blobs would defeat the purpose
|
||||||
of LOPs.
|
of LOPs.
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
++++++++++++++
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
The way to offload to a LOP discussed in 4) above can be used to
|
The way to offload to a LOP discussed in 4) above can be used to
|
||||||
regularly offload oversize blobs. About preventing oversize blobs from
|
regularly offload oversize blobs. About preventing oversize blobs from
|
||||||
|
|
@ -326,18 +326,18 @@ large blobs directly from the LOP and the server would not need to
|
||||||
fetch those blobs from the LOP to be able to serve the client.
|
fetch those blobs from the LOP to be able to serve the client.
|
||||||
|
|
||||||
Note
|
Note
|
||||||
++++
|
^^^^
|
||||||
|
|
||||||
For fetches instead of clones, a protocol negotiation might not always
|
For fetches instead of clones, a protocol negotiation might not always
|
||||||
happen, see the "What about fetches?" FAQ entry below for details.
|
happen, see the "What about fetches?" FAQ entry below for details.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
Security, configurability and efficiency of setting things up.
|
Security, configurability and efficiency of setting things up.
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
++++++++++++++
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
A "promisor-remote" protocol v2 capability looks like a good way to
|
A "promisor-remote" protocol v2 capability looks like a good way to
|
||||||
implement this. The way the client and server use this capability
|
implement this. The way the client and server use this capability
|
||||||
|
|
@ -356,7 +356,7 @@ the client should be able to offload some large blobs it has fetched,
|
||||||
but might not need anymore, to the LOP.
|
but might not need anymore, to the LOP.
|
||||||
|
|
||||||
Note
|
Note
|
||||||
++++
|
^^^^
|
||||||
|
|
||||||
It might depend on the context if it should be OK or not for clients
|
It might depend on the context if it should be OK or not for clients
|
||||||
to offload large blobs they have created, instead of fetched, directly
|
to offload large blobs they have created, instead of fetched, directly
|
||||||
|
|
@ -367,13 +367,13 @@ This should be discussed and refined when we get closer to
|
||||||
implementing this feature.
|
implementing this feature.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
On the client, the easiest way to deal with unneeded large blobs is to
|
On the client, the easiest way to deal with unneeded large blobs is to
|
||||||
offload them.
|
offload them.
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
++++++++++++++
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
This is very similar to what 4) above is about, except on the client
|
This is very similar to what 4) above is about, except on the client
|
||||||
side instead of the server side. So a good solution to 4) could likely
|
side instead of the server side. So a good solution to 4) could likely
|
||||||
|
|
@ -385,8 +385,8 @@ when cloning (see 6) above). Also if the large blobs were fetched from
|
||||||
a LOP, it is likely, and can easily be confirmed, that the LOP still
|
a LOP, it is likely, and can easily be confirmed, that the LOP still
|
||||||
has them, so that they can just be removed from the client.
|
has them, so that they can just be removed from the client.
|
||||||
|
|
||||||
III) Benefits of using LOPs
|
III Benefits of using LOPs
|
||||||
---------------------------
|
--------------------------
|
||||||
|
|
||||||
Many benefits are related to the issues discussed in "I) Issues with
|
Many benefits are related to the issues discussed in "I) Issues with
|
||||||
the current situation" above:
|
the current situation" above:
|
||||||
|
|
@ -406,8 +406,8 @@ the current situation" above:
|
||||||
|
|
||||||
- Reduced storage needs on the client side.
|
- Reduced storage needs on the client side.
|
||||||
|
|
||||||
IV) FAQ
|
IV FAQ
|
||||||
-------
|
------
|
||||||
|
|
||||||
What about using multiple LOPs on the server and client side?
|
What about using multiple LOPs on the server and client side?
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
@ -533,7 +533,7 @@ some objects it already knows about but doesn't have because they are
|
||||||
on a promisor remote.
|
on a promisor remote.
|
||||||
|
|
||||||
Regular fetch
|
Regular fetch
|
||||||
+++++++++++++
|
^^^^^^^^^^^^^
|
||||||
|
|
||||||
In a regular fetch, the client will contact the main remote and a
|
In a regular fetch, the client will contact the main remote and a
|
||||||
protocol negotiation will happen between them. It's a good thing that
|
protocol negotiation will happen between them. It's a good thing that
|
||||||
|
|
@ -551,7 +551,7 @@ new fetch will happen in the same way as the previous clone or fetch,
|
||||||
using, or not using, the same LOP(s) as last time.
|
using, or not using, the same LOP(s) as last time.
|
||||||
|
|
||||||
"Backfill" or "lazy" fetch
|
"Backfill" or "lazy" fetch
|
||||||
++++++++++++++++++++++++++
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
When there is a backfill fetch, the client doesn't necessarily contact
|
When there is a backfill fetch, the client doesn't necessarily contact
|
||||||
the main remote first. It will try to fetch from its promisor remotes
|
the main remote first. It will try to fetch from its promisor remotes
|
||||||
|
|
@ -576,8 +576,8 @@ from the client when it fetches from them. The client could get the
|
||||||
token when performing a protocol negotiation with the main remote (see
|
token when performing a protocol negotiation with the main remote (see
|
||||||
section II.6 above).
|
section II.6 above).
|
||||||
|
|
||||||
V) Future improvements
|
V Future improvements
|
||||||
----------------------
|
---------------------
|
||||||
|
|
||||||
It is expected that at the beginning using LOPs will be mostly worth
|
It is expected that at the beginning using LOPs will be mostly worth
|
||||||
it either in a corporate context where the Git version that clients
|
it either in a corporate context where the Git version that clients
|
||||||
|
|
|
||||||
|
|
@ -13,6 +13,7 @@ articles = [
|
||||||
'commit-graph.adoc',
|
'commit-graph.adoc',
|
||||||
'directory-rename-detection.adoc',
|
'directory-rename-detection.adoc',
|
||||||
'hash-function-transition.adoc',
|
'hash-function-transition.adoc',
|
||||||
|
'large-object-promisors.adoc',
|
||||||
'long-running-process-protocol.adoc',
|
'long-running-process-protocol.adoc',
|
||||||
'multi-pack-index.adoc',
|
'multi-pack-index.adoc',
|
||||||
'packfile-uri.adoc',
|
'packfile-uri.adoc',
|
||||||
|
|
|
||||||
|
|
@ -10,32 +10,32 @@ history as an optimization, assuming all merges are automatic and clean
|
||||||
|
|
||||||
Outline:
|
Outline:
|
||||||
|
|
||||||
0. Assumptions
|
1. Assumptions
|
||||||
|
|
||||||
1. How rebasing and cherry-picking work
|
2. How rebasing and cherry-picking work
|
||||||
|
|
||||||
2. Why the renames on MERGE_SIDE1 in any given pick are *always* a
|
3. Why the renames on MERGE_SIDE1 in any given pick are *always* a
|
||||||
superset of the renames on MERGE_SIDE1 for the next pick.
|
superset of the renames on MERGE_SIDE1 for the next pick.
|
||||||
|
|
||||||
3. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also
|
4. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also
|
||||||
a rename on MERGE_SIDE1 for the next pick
|
a rename on MERGE_SIDE1 for the next pick
|
||||||
|
|
||||||
4. A detailed description of the counter-examples to #3.
|
5. A detailed description of the counter-examples to #4.
|
||||||
|
|
||||||
5. Why the special cases in #4 are still fully reasonable to use to pair
|
6. Why the special cases in #5 are still fully reasonable to use to pair
|
||||||
up files for three-way content merging in the merge machinery, and why
|
up files for three-way content merging in the merge machinery, and why
|
||||||
they do not affect the correctness of the merge.
|
they do not affect the correctness of the merge.
|
||||||
|
|
||||||
6. Interaction with skipping of "irrelevant" renames
|
7. Interaction with skipping of "irrelevant" renames
|
||||||
|
|
||||||
7. Additional items that need to be cached
|
8. Additional items that need to be cached
|
||||||
|
|
||||||
8. How directory rename detection interacts with the above and why this
|
9. How directory rename detection interacts with the above and why this
|
||||||
optimization is still safe even if merge.directoryRenames is set to
|
optimization is still safe even if merge.directoryRenames is set to
|
||||||
"true".
|
"true".
|
||||||
|
|
||||||
|
|
||||||
=== 0. Assumptions ===
|
== 1. Assumptions ==
|
||||||
|
|
||||||
There are two assumptions that will hold throughout this document:
|
There are two assumptions that will hold throughout this document:
|
||||||
|
|
||||||
|
|
@ -44,8 +44,8 @@ There are two assumptions that will hold throughout this document:
|
||||||
|
|
||||||
* All merges are fully automatic
|
* All merges are fully automatic
|
||||||
|
|
||||||
and a third that will hold in sections 2-5 for simplicity, that I'll later
|
and a third that will hold in sections 3-6 for simplicity, that I'll later
|
||||||
address in section 8:
|
address in section 9:
|
||||||
|
|
||||||
* No directory renames occur
|
* No directory renames occur
|
||||||
|
|
||||||
|
|
@ -77,9 +77,9 @@ conflicts that the user needs to resolve), the cache of renames is not
|
||||||
stored on disk, and thus is thrown away as soon as the rebase or cherry
|
stored on disk, and thus is thrown away as soon as the rebase or cherry
|
||||||
pick stops for the user to resolve the operation.
|
pick stops for the user to resolve the operation.
|
||||||
|
|
||||||
The third assumption makes sections 2-5 simpler, and allows people to
|
The third assumption makes sections 3-6 simpler, and allows people to
|
||||||
understand the basics of why this optimization is safe and effective, and
|
understand the basics of why this optimization is safe and effective, and
|
||||||
then I can go back and address the specifics in section 8. It is probably
|
then I can go back and address the specifics in section 9. It is probably
|
||||||
also worth noting that if directory renames do occur, then the default of
|
also worth noting that if directory renames do occur, then the default of
|
||||||
merge.directoryRenames being set to "conflict" means that the operation
|
merge.directoryRenames being set to "conflict" means that the operation
|
||||||
will stop for users to resolve the conflicts and the cache will be thrown
|
will stop for users to resolve the conflicts and the cache will be thrown
|
||||||
|
|
@ -88,22 +88,26 @@ reason we need to address directory renames specifically, is that some
|
||||||
users will have set merge.directoryRenames to "true" to allow the merges to
|
users will have set merge.directoryRenames to "true" to allow the merges to
|
||||||
continue to proceed automatically. The optimization is still safe with
|
continue to proceed automatically. The optimization is still safe with
|
||||||
this config setting, but we have to discuss a few more cases to show why;
|
this config setting, but we have to discuss a few more cases to show why;
|
||||||
this discussion is deferred until section 8.
|
this discussion is deferred until section 9.
|
||||||
|
|
||||||
|
|
||||||
=== 1. How rebasing and cherry-picking work ===
|
== 2. How rebasing and cherry-picking work ==
|
||||||
|
|
||||||
Consider the following setup (from the git-rebase manpage):
|
Consider the following setup (from the git-rebase manpage):
|
||||||
|
|
||||||
|
------------
|
||||||
A---B---C topic
|
A---B---C topic
|
||||||
/
|
/
|
||||||
D---E---F---G main
|
D---E---F---G main
|
||||||
|
------------
|
||||||
|
|
||||||
After rebasing or cherry-picking topic onto main, this will appear as:
|
After rebasing or cherry-picking topic onto main, this will appear as:
|
||||||
|
|
||||||
|
------------
|
||||||
A'--B'--C' topic
|
A'--B'--C' topic
|
||||||
/
|
/
|
||||||
D---E---F---G main
|
D---E---F---G main
|
||||||
|
------------
|
||||||
|
|
||||||
The way the commits A', B', and C' are created is through a series of
|
The way the commits A', B', and C' are created is through a series of
|
||||||
merges, where rebase or cherry-pick sequentially uses each of the three
|
merges, where rebase or cherry-pick sequentially uses each of the three
|
||||||
|
|
@ -111,6 +115,7 @@ A-B-C commits in a special merge operation. Let's label the three commits
|
||||||
in the merge operation as MERGE_BASE, MERGE_SIDE1, and MERGE_SIDE2. For
|
in the merge operation as MERGE_BASE, MERGE_SIDE1, and MERGE_SIDE2. For
|
||||||
this picture, the three commits for each of the three merges would be:
|
this picture, the three commits for each of the three merges would be:
|
||||||
|
|
||||||
|
....
|
||||||
To create A':
|
To create A':
|
||||||
MERGE_BASE: E
|
MERGE_BASE: E
|
||||||
MERGE_SIDE1: G
|
MERGE_SIDE1: G
|
||||||
|
|
@ -125,6 +130,7 @@ To create C':
|
||||||
MERGE_BASE: B
|
MERGE_BASE: B
|
||||||
MERGE_SIDE1: B'
|
MERGE_SIDE1: B'
|
||||||
MERGE_SIDE2: C
|
MERGE_SIDE2: C
|
||||||
|
....
|
||||||
|
|
||||||
Sometimes, folks are surprised that these three-way merges are done. It
|
Sometimes, folks are surprised that these three-way merges are done. It
|
||||||
can be useful in understanding these three-way merges to view them in a
|
can be useful in understanding these three-way merges to view them in a
|
||||||
|
|
@ -138,8 +144,7 @@ Conceptually the two statements above are the same as a three-way merge of
|
||||||
B, B', and C, at least the parts before you decide to record a commit.
|
B, B', and C, at least the parts before you decide to record a commit.
|
||||||
|
|
||||||
|
|
||||||
=== 2. Why the renames on MERGE_SIDE1 in any given pick are always a ===
|
== 3. Why the renames on MERGE_SIDE1 in any given pick are always a superset of the renames on MERGE_SIDE1 for the next pick. ==
|
||||||
=== superset of the renames on MERGE_SIDE1 for the next pick. ===
|
|
||||||
|
|
||||||
The merge machinery uses the filenames it is fed from MERGE_BASE,
|
The merge machinery uses the filenames it is fed from MERGE_BASE,
|
||||||
MERGE_SIDE1, and MERGE_SIDE2. It will only move content to a different
|
MERGE_SIDE1, and MERGE_SIDE2. It will only move content to a different
|
||||||
|
|
@ -156,6 +161,7 @@ filename under one of three conditions:
|
||||||
First, let's remember what commits are involved in the first and second
|
First, let's remember what commits are involved in the first and second
|
||||||
picks of the cherry-pick or rebase sequence:
|
picks of the cherry-pick or rebase sequence:
|
||||||
|
|
||||||
|
....
|
||||||
To create A':
|
To create A':
|
||||||
MERGE_BASE: E
|
MERGE_BASE: E
|
||||||
MERGE_SIDE1: G
|
MERGE_SIDE1: G
|
||||||
|
|
@ -165,6 +171,7 @@ To create B':
|
||||||
MERGE_BASE: A
|
MERGE_BASE: A
|
||||||
MERGE_SIDE1: A'
|
MERGE_SIDE1: A'
|
||||||
MERGE_SIDE2: B
|
MERGE_SIDE2: B
|
||||||
|
....
|
||||||
|
|
||||||
So, in particular, we need to show that the renames between E and G are a
|
So, in particular, we need to show that the renames between E and G are a
|
||||||
superset of those between A and A'.
|
superset of those between A and A'.
|
||||||
|
|
@ -181,11 +188,11 @@ are a subset of those between E and G. Equivalently, all renames between E
|
||||||
and G are a superset of those between A and A'.
|
and G are a superset of those between A and A'.
|
||||||
|
|
||||||
|
|
||||||
=== 3. Why any rename on MERGE_SIDE1 in any given pick is _almost_ ===
|
== 4. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also a rename on MERGE_SIDE1 for the next pick. ==
|
||||||
=== always also a rename on MERGE_SIDE1 for the next pick. ===
|
|
||||||
|
|
||||||
Let's again look at the first two picks:
|
Let's again look at the first two picks:
|
||||||
|
|
||||||
|
....
|
||||||
To create A':
|
To create A':
|
||||||
MERGE_BASE: E
|
MERGE_BASE: E
|
||||||
MERGE_SIDE1: G
|
MERGE_SIDE1: G
|
||||||
|
|
@ -195,17 +202,25 @@ To create B':
|
||||||
MERGE_BASE: A
|
MERGE_BASE: A
|
||||||
MERGE_SIDE1: A'
|
MERGE_SIDE1: A'
|
||||||
MERGE_SIDE2: B
|
MERGE_SIDE2: B
|
||||||
|
....
|
||||||
|
|
||||||
Now let's look at any given rename from MERGE_SIDE1 of the first pick, i.e.
|
Now let's look at any given rename from MERGE_SIDE1 of the first pick, i.e.
|
||||||
any given rename from E to G. Let's use the filenames 'oldfile' and
|
any given rename from E to G. Let's use the filenames 'oldfile' and
|
||||||
'newfile' for demonstration purposes. That first pick will function as
|
'newfile' for demonstration purposes. That first pick will function as
|
||||||
follows; when the rename is detected, the merge machinery will do a
|
follows; when the rename is detected, the merge machinery will do a
|
||||||
three-way content merge of the following:
|
three-way content merge of the following:
|
||||||
|
|
||||||
|
....
|
||||||
E:oldfile
|
E:oldfile
|
||||||
G:newfile
|
G:newfile
|
||||||
A:oldfile
|
A:oldfile
|
||||||
|
....
|
||||||
|
|
||||||
and produce a new result:
|
and produce a new result:
|
||||||
|
|
||||||
|
....
|
||||||
A':newfile
|
A':newfile
|
||||||
|
....
|
||||||
|
|
||||||
Note above that I've assumed that E->A did not rename oldfile. If that
|
Note above that I've assumed that E->A did not rename oldfile. If that
|
||||||
side did rename, then we most likely have a rename/rename(1to2) conflict
|
side did rename, then we most likely have a rename/rename(1to2) conflict
|
||||||
|
|
@ -254,19 +269,21 @@ were detected as renames, A:oldfile and A':newfile should also be
|
||||||
detectable as renames almost always.
|
detectable as renames almost always.
|
||||||
|
|
||||||
|
|
||||||
=== 4. A detailed description of the counter-examples to #3. ===
|
== 5. A detailed description of the counter-examples to #4. ==
|
||||||
|
|
||||||
We already noted in section 3 that rename/rename(1to1) (i.e. both sides
|
We already noted in section 4 that rename/rename(1to1) (i.e. both sides
|
||||||
renaming a file the same way) was one counter-example. The more
|
renaming a file the same way) was one counter-example. The more
|
||||||
interesting bit, though, is why did we need to use the "almost" qualifier
|
interesting bit, though, is why did we need to use the "almost" qualifier
|
||||||
when stating that A:oldfile and A':newfile are "almost" always detectable
|
when stating that A:oldfile and A':newfile are "almost" always detectable
|
||||||
as renames?
|
as renames?
|
||||||
|
|
||||||
Let's repeat an earlier point that section 3 made:
|
Let's repeat an earlier point that section 4 made:
|
||||||
|
|
||||||
|
....
|
||||||
A':newfile was created by applying the changes between E:oldfile and
|
A':newfile was created by applying the changes between E:oldfile and
|
||||||
G:newfile to A:oldfile. The changes between E:oldfile and G:newfile were
|
G:newfile to A:oldfile. The changes between E:oldfile and G:newfile were
|
||||||
<50% of the size of E:oldfile.
|
<50% of the size of E:oldfile.
|
||||||
|
....
|
||||||
|
|
||||||
If those changes that were <50% of the size of E:oldfile are also <50% of
|
If those changes that were <50% of the size of E:oldfile are also <50% of
|
||||||
the size of A:oldfile, then A:oldfile and A':newfile will be detectable as
|
the size of A:oldfile, then A:oldfile and A':newfile will be detectable as
|
||||||
|
|
@ -276,18 +293,21 @@ still somehow merge cleanly), then traditional rename detection would not
|
||||||
detect A:oldfile and A':newfile as renames.
|
detect A:oldfile and A':newfile as renames.
|
||||||
|
|
||||||
Here's an example where that can happen:
|
Here's an example where that can happen:
|
||||||
|
|
||||||
* E:oldfile had 20 lines
|
* E:oldfile had 20 lines
|
||||||
* G:newfile added 10 new lines at the beginning of the file
|
* G:newfile added 10 new lines at the beginning of the file
|
||||||
* A:oldfile kept the first 3 lines of the file, and deleted all the rest
|
* A:oldfile kept the first 3 lines of the file, and deleted all the rest
|
||||||
|
|
||||||
then
|
then
|
||||||
|
|
||||||
|
....
|
||||||
=> A':newfile would have 13 lines, 3 of which match those in A:oldfile.
|
=> A':newfile would have 13 lines, 3 of which match those in A:oldfile.
|
||||||
E:oldfile -> G:newfile would be detected as a rename, but A:oldfile and
|
E:oldfile -> G:newfile would be detected as a rename, but A:oldfile and
|
||||||
A':newfile would not be.
|
A':newfile would not be.
|
||||||
|
....
|
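The line counts in this example can be checked with a small sketch. It assumes a simplified, line-based similarity score (lines shared by both files divided by the size of the larger file) and a 50% cutoff. Git's real rename detection compares content chunks rather than whole lines, and the helper name `similarity` is hypothetical.

```python
from collections import Counter

RENAME_THRESHOLD = 0.5  # assumed similarity cutoff of 50%


def similarity(src_lines, dst_lines):
    """Simplified line-based score: shared lines / size of the larger file."""
    shared = sum((Counter(src_lines) & Counter(dst_lines)).values())
    largest = max(len(src_lines), len(dst_lines))
    return shared / largest if largest else 1.0


# E:oldfile had 20 lines.
e_oldfile = [f"original line {i}" for i in range(20)]
# G:newfile added 10 new lines at the beginning of the file.
added = [f"added line {i}" for i in range(10)]
g_newfile = added + e_oldfile
# A:oldfile kept the first 3 lines of the file and deleted all the rest.
a_oldfile = e_oldfile[:3]
# A':newfile is A:oldfile with the E -> G change (10 lines prepended) applied.
a_prime_newfile = added + a_oldfile

print(len(a_prime_newfile))  # 13
# 20 of 30 lines are shared, so E:oldfile -> G:newfile clears the cutoff.
print(similarity(e_oldfile, g_newfile) >= RENAME_THRESHOLD)
# Only 3 of 13 lines are shared, so A:oldfile -> A':newfile does not.
print(similarity(a_oldfile, a_prime_newfile) >= RENAME_THRESHOLD)
```

Under this simplified model the same 10-line change is a small fraction of the E/G pair but dominates the much shorter A/A' pair, which is why the second pair falls below the cutoff.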
||||||
|
|
||||||
|
|
||||||
=== 5. Why the special cases in #4 are still fully reasonable to use to ===
|
== 6. Why the special cases in #5 are still fully reasonable to use to pair up files for three-way content merging in the merge machinery, and why they do not affect the correctness of the merge. ==
|
||||||
=== pair up files for three-way content merging in the merge machinery, ===
|
|
||||||
=== and why they do not affect the correctness of the merge. ===
|
|
||||||
|
|
||||||
In the rename/rename(1to1) case, A:newfile and A':newfile are not renames
|
In the rename/rename(1to1) case, A:newfile and A':newfile are not renames
|
||||||
since they use the *same* filename. However, files with the same filename
|
since they use the *same* filename. However, files with the same filename
|
||||||
|
|
@ -295,14 +315,14 @@ are obviously fine to pair up for three-way content merging (the merge
|
||||||
machinery has never employed break detection). The interesting
|
machinery has never employed break detection). The interesting
|
||||||
counter-example case is thus not the rename/rename(1to1) case, but the case
|
counter-example case is thus not the rename/rename(1to1) case, but the case
|
||||||
where A did not rename oldfile. That was the case that we spent most of
|
where A did not rename oldfile. That was the case that we spent most of
|
||||||
the time discussing in sections 3 and 4. The remainder of this section
|
the time discussing in sections 4 and 5. The remainder of this section
|
||||||
will be devoted to that case as well.
|
will be devoted to that case as well.
|
||||||
|
|
||||||
So, even if A:oldfile and A':newfile aren't detectable as renames, why is
|
So, even if A:oldfile and A':newfile aren't detectable as renames, why is
|
||||||
it still reasonable to pair them up for three-way content merging in the
|
it still reasonable to pair them up for three-way content merging in the
|
||||||
merge machinery? There are multiple reasons:
|
merge machinery? There are multiple reasons:
|
||||||
|
|
||||||
* As noted in sections 3 and 4, the diff between A:oldfile and A':newfile
|
* As noted in sections 4 and 5, the diff between A:oldfile and A':newfile
|
||||||
is *exactly* the same as the diff between E:oldfile and G:newfile. The
|
is *exactly* the same as the diff between E:oldfile and G:newfile. The
|
||||||
latter pair were detected as renames, so it seems unlikely to surprise
|
latter pair were detected as renames, so it seems unlikely to surprise
|
||||||
users for us to treat A:oldfile and A':newfile as renames.
|
users for us to treat A:oldfile and A':newfile as renames.
|
||||||
|
|
@ -394,7 +414,7 @@ cases 1 and 3 seem to provide as good or better behavior with the
|
||||||
optimization than without.
|
optimization than without.
|
||||||
|
|
||||||
|
|
||||||
=== 6. Interaction with skipping of "irrelevant" renames ===
|
== 7. Interaction with skipping of "irrelevant" renames ==
|
||||||
|
|
||||||
Previous optimizations involved skipping rename detection for paths
|
Previous optimizations involved skipping rename detection for paths
|
||||||
considered to be "irrelevant". See for example the following commits:
|
considered to be "irrelevant". See for example the following commits:
|
||||||
|
|
@ -421,24 +441,27 @@ detection -- though we can limit it to the paths for which we have not
|
||||||
already detected renames.
|
already detected renames.
|
||||||
|
|
||||||
|
|
||||||
=== 7. Additional items that need to be cached ===
|
== 8. Additional items that need to be cached ==
|
||||||
|
|
||||||
It turns out we have to cache more than just renames; we also cache:
|
It turns out we have to cache more than just renames; we also cache:
|
||||||
|
|
||||||
|
....
|
||||||
A) non-renames (i.e. unpaired deletes)
|
A) non-renames (i.e. unpaired deletes)
|
||||||
B) counts of renames within directories
|
B) counts of renames within directories
|
||||||
C) sources that were marked as RELEVANT_LOCATION, but which were
|
C) sources that were marked as RELEVANT_LOCATION, but which were
|
||||||
downgraded to RELEVANT_NO_MORE
|
downgraded to RELEVANT_NO_MORE
|
||||||
D) the toplevel trees involved in the merge
|
D) the toplevel trees involved in the merge
|
||||||
|
....
|
||||||
|
|
||||||
These are all stored in struct rename_info, and respectively appear in
|
These are all stored in struct rename_info, and respectively appear in
|
||||||
|
|
||||||
* cached_pairs (alongside actual renames, just with a value of NULL)
|
* cached_pairs (alongside actual renames, just with a value of NULL)
|
||||||
* dir_rename_counts
|
* dir_rename_counts
|
||||||
* cached_irrelevant
|
* cached_irrelevant
|
||||||
* merge_trees
|
* merge_trees
|
||||||
|
|
||||||
The reason for (A) comes from the irrelevant renames skipping
|
The reason for `(A)` comes from the irrelevant renames skipping
|
||||||
optimization discussed in section 6. The fact that irrelevant renames
|
optimization discussed in section 7. The fact that irrelevant renames
|
||||||
are skipped means we only get a subset of the potential renames
|
are skipped means we only get a subset of the potential renames
|
||||||
detected and subsequent commits may need to run rename detection on
|
detected and subsequent commits may need to run rename detection on
|
||||||
the upstream side on a subset of the remaining renames (to get the
|
the upstream side on a subset of the remaining renames (to get the
|
||||||
|
|
@ -447,23 +470,24 @@ deletes are involved in rename detection too, we don't want to
|
||||||
repeatedly check that those paths remain unpaired on the upstream side
|
repeatedly check that those paths remain unpaired on the upstream side
|
||||||
with every commit we are transplanting.
|
with every commit we are transplanting.
|
||||||
|
|
||||||
The reason for (B) is that diffcore_rename_extended() is what
|
The reason for `(B)` is that diffcore_rename_extended() is what
|
||||||
generates the counts of renames by directory which is needed in
|
generates the counts of renames by directory which is needed in
|
||||||
directory rename detection, and if we don't run
|
directory rename detection, and if we don't run
|
||||||
diffcore_rename_extended() again then we need to have the output from
|
diffcore_rename_extended() again then we need to have the output from
|
||||||
it, including dir_rename_counts, from the previous run.
|
it, including dir_rename_counts, from the previous run.
|
||||||
|
|
||||||
The reason for (C) is that merge-ort's tree traversal will again think
|
The reason for `(C)` is that merge-ort's tree traversal will again think
|
||||||
those paths are relevant (marking them as RELEVANT_LOCATION), but the
|
those paths are relevant (marking them as RELEVANT_LOCATION), but the
|
||||||
fact that they were downgraded to RELEVANT_NO_MORE means that
|
fact that they were downgraded to RELEVANT_NO_MORE means that
|
||||||
dir_rename_counts already has the information we need for directory
|
dir_rename_counts already has the information we need for directory
|
||||||
rename detection. (A path which becomes RELEVANT_CONTENT in a
|
rename detection. (A path which becomes RELEVANT_CONTENT in a
|
||||||
subsequent commit will be removed from cached_irrelevant.)
|
subsequent commit will be removed from cached_irrelevant.)
|
||||||
|
|
||||||
The reason for (D) is that is how we determine whether the remember
|
The reason for `(D)` is that is how we determine whether the remember
|
||||||
renames optimization can be used. In particular, remembering that our
|
renames optimization can be used. In particular, remembering that our
|
||||||
sequence of merges looks like:
|
sequence of merges looks like:
|
||||||
|
|
||||||
|
....
|
||||||
Merge 1:
|
Merge 1:
|
||||||
MERGE_BASE: E
|
MERGE_BASE: E
|
||||||
MERGE_SIDE1: G
|
MERGE_SIDE1: G
|
||||||
|
|
@ -475,6 +499,7 @@ sequence of merges looks like:
|
||||||
MERGE_SIDE1: A'
|
MERGE_SIDE1: A'
|
||||||
MERGE_SIDE2: B
|
MERGE_SIDE2: B
|
||||||
=> Creates B'
|
=> Creates B'
|
||||||
|
....
|
||||||
|
|
||||||
It is the fact that the trees A and A' appear both in Merge 1 and in
|
It is the fact that the trees A and A' appear both in Merge 1 and in
|
||||||
Merge 2, with A as a parent of A' that allows this optimization. So
|
Merge 2, with A as a parent of A' that allows this optimization. So
|
||||||
|
|
@ -482,12 +507,11 @@ we store the trees to compare with what we are asked to merge next
|
||||||
time.
|
time.
|
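The tree comparison described here might be sketched as follows. The `MergeRecord` type, its field names, and `can_reuse_cached_renames` are hypothetical names chosen for illustration, and trees are represented by plain strings. This is not how merge-ort lays out its data, only the shape of the check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MergeRecord:
    """Toplevel trees of one merge (hypothetical layout for illustration)."""
    base: str
    side1: str
    side2: str
    result: str = ""


def can_reuse_cached_renames(previous, upcoming):
    # The previous merge applied side2 (A) on top of side1 and produced A'.
    # The cached renames are usable when the next merge has A as its
    # MERGE_BASE and A' as its MERGE_SIDE1.
    return upcoming.base == previous.side2 and upcoming.side1 == previous.result


merge1 = MergeRecord(base="E", side1="G", side2="A", result="A'")
merge2 = MergeRecord(base="A", side1="A'", side2="B")
unrelated = MergeRecord(base="X", side1="Y", side2="Z")

print(can_reuse_cached_renames(merge1, merge2))     # True
print(can_reuse_cached_renames(merge1, unrelated))  # False
```

Storing the trees from the previous merge makes this a cheap equality check, so a sequence of picks that does not follow the expected pattern simply falls back to normal rename detection.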
||||||
|
|
||||||
|
|
||||||
=== 8. How directory rename detection interacts with the above and ===
|
== 9. How directory rename detection interacts with the above and why this optimization is still safe even if merge.directoryRenames is set to "true". ==
|
||||||
=== why this optimization is still safe even if ===
|
|
||||||
=== merge.directoryRenames is set to "true". ===
|
|
||||||
|
|
||||||
As noted in the assumptions section:
|
As noted in the assumptions section:
|
||||||
|
|
||||||
|
....
|
||||||
"""
|
"""
|
||||||
...if directory renames do occur, then the default of
|
...if directory renames do occur, then the default of
|
||||||
merge.directoryRenames being set to "conflict" means that the operation
|
merge.directoryRenames being set to "conflict" means that the operation
|
||||||
|
|
@ -497,11 +521,13 @@ As noted in the assumptions section:
|
||||||
is that some users will have set merge.directoryRenames to "true" to
|
is that some users will have set merge.directoryRenames to "true" to
|
||||||
allow the merges to continue to proceed automatically.
|
allow the merges to continue to proceed automatically.
|
||||||
"""
|
"""
|
||||||
|
....
|
||||||
|
|
||||||
Let's remember that we need to look at how any given pick affects the next
|
Let's remember that we need to look at how any given pick affects the next
|
||||||
one. So let's again use the first two picks from the diagram in section
|
one. So let's again use the first two picks from the diagram in section
|
||||||
one:
|
one:
|
||||||
|
|
||||||
|
....
|
||||||
First pick does this three-way merge:
|
First pick does this three-way merge:
|
||||||
MERGE_BASE: E
|
MERGE_BASE: E
|
||||||
MERGE_SIDE1: G
|
MERGE_SIDE1: G
|
||||||
|
|
@ -513,6 +539,7 @@ one:
|
||||||
MERGE_SIDE1: A'
|
MERGE_SIDE1: A'
|
||||||
MERGE_SIDE2: B
|
MERGE_SIDE2: B
|
||||||
=> creates B'
|
=> creates B'
|
||||||
|
....
|
||||||
|
|
||||||
Now, directory rename detection exists so that if one side of history
|
Now, directory rename detection exists so that if one side of history
|
||||||
renames a directory, and the other side adds a new file to the old
|
renames a directory, and the other side adds a new file to the old
|
||||||
|
|
@ -545,7 +572,7 @@ while considering all of these cases:
|
||||||
concerned; see the assumptions section). Two interesting sub-notes
|
concerned; see the assumptions section). Two interesting sub-notes
|
||||||
about these counts:
|
about these counts:
|
||||||
|
|
||||||
* If we need to perform rename-detection again on the given side (e.g.
|
** If we need to perform rename-detection again on the given side (e.g.
|
||||||
some paths are relevant for rename detection that weren't before),
|
some paths are relevant for rename detection that weren't before),
|
||||||
then we clear dir_rename_counts and recompute it, making use of
|
then we clear dir_rename_counts and recompute it, making use of
|
||||||
cached_pairs. The reason it is important to do this is optimizations
|
cached_pairs. The reason it is important to do this is optimizations
|
||||||
|
|
@ -556,7 +583,7 @@ while considering all of these cases:
|
||||||
easiest way to "fix up" dir_rename_counts in such cases is to just
|
easiest way to "fix up" dir_rename_counts in such cases is to just
|
||||||
recompute it.
|
recompute it.
|
||||||
|
|
||||||
* If we prune rename/rename(1to1) entries from the cache, then we also
|
** If we prune rename/rename(1to1) entries from the cache, then we also
|
||||||
need to update dir_rename_counts to decrement the counts for the
|
need to update dir_rename_counts to decrement the counts for the
|
||||||
involved directory and any relevant parent directories (to undo what
|
involved directory and any relevant parent directories (to undo what
|
||||||
update_dir_rename_counts() in diffcore-rename.c incremented when the
|
update_dir_rename_counts() in diffcore-rename.c incremented when the
|
||||||
|
|
@ -578,6 +605,7 @@ in order:
|
||||||
|
|
||||||
Case 1: MERGE_SIDE1 renames old dir, MERGE_SIDE2 adds new file to old dir
|
Case 1: MERGE_SIDE1 renames old dir, MERGE_SIDE2 adds new file to old dir
|
||||||
|
|
||||||
|
....
|
||||||
This case looks like this:
|
This case looks like this:
|
||||||
|
|
||||||
MERGE_BASE: E, Has olddir/
|
MERGE_BASE: E, Has olddir/
|
||||||
|
|
@ -595,10 +623,13 @@ Case 1: MERGE_SIDE1 renames old dir, MERGE_SIDE2 adds new file to old dir
|
||||||
* MERGE_SIDE1 has cached olddir/newfile -> newdir/newfile
|
* MERGE_SIDE1 has cached olddir/newfile -> newdir/newfile
|
||||||
Given the cached rename noted above, the second merge can proceed as
|
Given the cached rename noted above, the second merge can proceed as
|
||||||
expected without needing to perform rename detection from A -> A'.
|
expected without needing to perform rename detection from A -> A'.
|
||||||
|
....
|
||||||
|
|
||||||
Case 2: MERGE_SIDE1 renames old dir, MERGE_SIDE2 renames file into old dir
|
Case 2: MERGE_SIDE1 renames old dir, MERGE_SIDE2 renames file into old dir
|
||||||
|
|
||||||
|
....
|
||||||
This case looks like this:
|
This case looks like this:
|
||||||
|
|
||||||
MERGE_BASE: E oldfile, olddir/
|
MERGE_BASE: E oldfile, olddir/
|
||||||
MERGE_SIDE1: G oldfile, olddir/ -> newdir/
|
MERGE_SIDE1: G oldfile, olddir/ -> newdir/
|
||||||
MERGE_SIDE2: A oldfile -> olddir/newfile
|
MERGE_SIDE2: A oldfile -> olddir/newfile
|
||||||
|
|
@ -617,9 +648,11 @@ Case 2: MERGE_SIDE1 renames old dir, MERGE_SIDE2 renames file into old dir
|
||||||
|
|
||||||
Given the cached rename noted above, the second merge can proceed as
|
Given the cached rename noted above, the second merge can proceed as
|
||||||
expected without needing to perform rename detection from A -> A'.
|
expected without needing to perform rename detection from A -> A'.
|
||||||
|
....
|
||||||
|
|
||||||
Case 3: MERGE_SIDE1 adds new file to old dir, MERGE_SIDE2 renames old dir
|
Case 3: MERGE_SIDE1 adds new file to old dir, MERGE_SIDE2 renames old dir
|
||||||
|
|
||||||
|
....
|
||||||
This case looks like this:
|
This case looks like this:
|
||||||
|
|
||||||
MERGE_BASE: E, Has olddir/
|
MERGE_BASE: E, Has olddir/
|
||||||
|
|
@ -635,9 +668,11 @@ Case 3: MERGE_SIDE1 adds new file to old dir, MERGE_SIDE2 renames old dir
|
||||||
In this case, with the optimization, note that after the first commit there
|
In this case, with the optimization, note that after the first commit there
|
||||||
were no renames on MERGE_SIDE1, and any renames on MERGE_SIDE2 are tossed.
|
were no renames on MERGE_SIDE1, and any renames on MERGE_SIDE2 are tossed.
|
||||||
But the second merge didn't need any renames so this is fine.
|
But the second merge didn't need any renames so this is fine.
|
||||||
|
....
|
||||||
|
|
||||||
Case 4: MERGE_SIDE1 renames file into old dir, MERGE_SIDE2 renames old dir
|
Case 4: MERGE_SIDE1 renames file into old dir, MERGE_SIDE2 renames old dir
|
||||||
|
|
||||||
|
....
|
||||||
This case looks like this:
|
This case looks like this:
|
||||||
|
|
||||||
MERGE_BASE: E, Has olddir/
|
MERGE_BASE: E, Has olddir/
|
||||||
|
|
@ -658,6 +693,7 @@ Case 4: MERGE_SIDE1 renames file into old dir, MERGE_SIDE2 renames old dir
|
||||||
|
|
||||||
Given the cached rename noted above, the second merge can proceed as
|
Given the cached rename noted above, the second merge can proceed as
|
||||||
expected without needing to perform rename detection from A -> A'.
|
expected without needing to perform rename detection from A -> A'.
|
||||||
|
....
|
||||||
|
|
||||||
Finally, I'll just note here that interactions with the
|
Finally, I'll just note here that interactions with the
|
||||||
skip-irrelevant-renames optimization mean we sometimes don't detect
|
skip-irrelevant-renames optimization mean we sometimes don't detect
|
||||||
|
|
|
||||||
|
|
@ -14,37 +14,41 @@ Table of contents:
|
||||||
* Reference Emails
|
* Reference Emails
|
||||||
|
|
||||||
|
|
||||||
=== Terminology ===
|
== Terminology ==
|
||||||
|
|
||||||
cone mode: one of two modes for specifying the desired subset of files
|
*`cone mode`*::
|
||||||
|
one of two modes for specifying the desired subset of files
|
||||||
in a sparse-checkout. In cone-mode, the user specifies
|
in a sparse-checkout. In cone-mode, the user specifies
|
||||||
directories (getting both everything under that directory as
|
directories (getting both everything under that directory as
|
||||||
well as everything in leading directories), while in non-cone
|
well as everything in leading directories), while in non-cone
|
||||||
mode, the user specifies gitignore-style patterns. Controlled
|
mode, the user specifies gitignore-style patterns. Controlled
|
||||||
by the --[no-]cone option to sparse-checkout init|set.
|
by the --[no-]cone option to sparse-checkout init|set.
|
||||||
|
|
||||||
SKIP_WORKTREE: When tracked files do not match the sparse specification and
|
*`SKIP_WORKTREE`*::
|
||||||
|
When tracked files do not match the sparse specification and
|
||||||
are removed from the working tree, the file in the index is marked
|
are removed from the working tree, the file in the index is marked
|
||||||
with a SKIP_WORKTREE bit. Note that if a tracked file has the
|
with a SKIP_WORKTREE bit. Note that if a tracked file has the
|
||||||
SKIP_WORKTREE bit set but the file is later written by the user to
|
SKIP_WORKTREE bit set but the file is later written by the user to
|
||||||
the working tree anyway, the SKIP_WORKTREE bit will be cleared at
|
the working tree anyway, the SKIP_WORKTREE bit will be cleared at
|
||||||
the beginning of any subsequent Git operation.
|
the beginning of any subsequent Git operation.
|
||||||
|
+
|
||||||
|
Most sparse checkout users are unaware of this implementation
|
||||||
|
detail, and the term should generally be avoided in user-facing
|
||||||
|
descriptions and command flags. Unfortunately, prior to the
|
||||||
|
`sparse-checkout` subcommand this low-level detail was exposed,
|
||||||
|
and as of time of writing, is still exposed in various places.
|
||||||
|
|
||||||
Most sparse checkout users are unaware of this implementation
|
*`sparse-checkout`*::
|
||||||
detail, and the term should generally be avoided in user-facing
|
a subcommand in git used to reduce the files present in
|
||||||
descriptions and command flags. Unfortunately, prior to the
|
|
||||||
`sparse-checkout` subcommand this low-level detail was exposed,
|
|
||||||
and as of time of writing, is still exposed in various places.
|
|
||||||
|
|
||||||
sparse-checkout: a subcommand in git used to reduce the files present in
|
|
||||||
the working tree to a subset of all tracked files. Also, the
|
the working tree to a subset of all tracked files. Also, the
|
||||||
name of the file in the $GIT_DIR/info directory used to track
|
name of the file in the $GIT_DIR/info directory used to track
|
||||||
the sparsity patterns corresponding to the user's desired
|
the sparsity patterns corresponding to the user's desired
|
||||||
subset.
|
subset.
|
||||||
|
|
||||||
sparse cone: see cone mode
|
*`sparse cone`*:: see cone mode
|
||||||
|
|
||||||
sparse directory: An entry in the index corresponding to a directory, which
|
*`sparse directory`*::
|
||||||
|
An entry in the index corresponding to a directory, which
|
||||||
appears in the index instead of all the files under that directory
|
appears in the index instead of all the files under that directory
|
||||||
that would normally appear. See also sparse-index. Something that
|
that would normally appear. See also sparse-index. Something that
|
||||||
can cause confusion is that the "sparse directory" does NOT match
|
can cause confusion is that the "sparse directory" does NOT match
|
||||||
|
|
@ -52,7 +56,8 @@ sparse directory: An entry in the index corresponding to a directory, which
|
||||||
working tree. May be renamed in the future (e.g. to "skipped
|
working tree. May be renamed in the future (e.g. to "skipped
|
||||||
directory").
|
directory").
|
||||||
|
|
||||||
sparse index: A special mode for sparse-checkout that also makes the
|
*`sparse index`*::
|
||||||
|
A special mode for sparse-checkout that also makes the
|
||||||
index sparse by recording a directory entry in lieu of all the
|
index sparse by recording a directory entry in lieu of all the
|
||||||
files underneath that directory (thus making that a "skipped
|
files underneath that directory (thus making that a "skipped
|
||||||
directory" which unfortunately has also been called a "sparse
|
directory" which unfortunately has also been called a "sparse
|
||||||
|
|
@ -60,7 +65,8 @@ sparse index: A special mode for sparse-checkout that also makes the
|
||||||
directories. Controlled by the --[no-]sparse-index option to
|
directories. Controlled by the --[no-]sparse-index option to
|
||||||
init|set|reapply.
|
init|set|reapply.
|
||||||
|
|
||||||
sparsity patterns: patterns from $GIT_DIR/info/sparse-checkout used to
|
*`sparsity patterns`*::
|
||||||
|
patterns from $GIT_DIR/info/sparse-checkout used to
|
||||||
define the set of files of interest. A warning: It is easy to
|
define the set of files of interest. A warning: It is easy to
|
||||||
over-use this term (or the shortened "patterns" term), for two
|
over-use this term (or the shortened "patterns" term), for two
|
||||||
reasons: (1) users in cone mode specify directories rather than
|
reasons: (1) users in cone mode specify directories rather than
|
||||||
|
|
@ -70,7 +76,8 @@ sparsity patterns: patterns from $GIT_DIR/info/sparse-checkout used to
|
||||||
transiently differ in the working tree or index from the sparsity
|
transiently differ in the working tree or index from the sparsity
|
||||||
patterns (see "Sparse specification vs. sparsity patterns").
|
patterns (see "Sparse specification vs. sparsity patterns").
|
||||||
|
|
||||||
sparse specification: The set of paths in the user's area of focus. This
|
*`sparse specification`*::
|
||||||
|
The set of paths in the user's area of focus. This
|
||||||
is typically just the tracked files that match the sparsity
|
is typically just the tracked files that match the sparsity
|
||||||
patterns, but the sparse specification can temporarily differ and
|
patterns, but the sparse specification can temporarily differ and
|
||||||
include additional files. (See also "Sparse specification
|
include additional files. (See also "Sparse specification
|
||||||
|
|
@ -87,12 +94,13 @@ sparse specification: The set of paths in the user's area of focus. This
|
||||||
* If working with the index and the working copy, the sparse
|
* If working with the index and the working copy, the sparse
|
||||||
specification is the union of the paths from above.
|
specification is the union of the paths from above.
|
||||||
|
|
||||||
vivifying: When a command restores a tracked file to the working tree (and
|
*`vivifying`*::
|
||||||
|
When a command restores a tracked file to the working tree (and
|
||||||
hopefully also clears the SKIP_WORKTREE bit in the index for that
|
hopefully also clears the SKIP_WORKTREE bit in the index for that
|
||||||
file), this is referred to as "vivifying" the file.
|
file), this is referred to as "vivifying" the file.
|
||||||
|
|
||||||
|
|
||||||
=== Purpose of sparse-checkouts ===
|
== Purpose of sparse-checkouts ==
|
||||||
|
|
||||||
sparse-checkouts exist to allow users to work with a subset of their
|
sparse-checkouts exist to allow users to work with a subset of their
|
||||||
files.
|
files.
|
||||||
|
|
@ -120,14 +128,12 @@ those usecases, sparse-checkouts can modify different subcommands in over a
|
||||||
half dozen different ways. Let's start by considering the high level
|
half dozen different ways. Let's start by considering the high level
|
||||||
usecases:
|
usecases:
|
||||||
|
|
||||||
A) Users are _only_ interested in the sparse portion of the repo
|
[horizontal]
|
||||||
|
A):: Users are _only_ interested in the sparse portion of the repo
|
||||||
A*) Users are _only_ interested in the sparse portion of the repo
|
A*):: Users are _only_ interested in the sparse portion of the repo
|
||||||
that they have downloaded so far
|
that they have downloaded so far
|
||||||
|
B):: Users want a sparse working tree, but are working in a larger whole
|
||||||
B) Users want a sparse working tree, but are working in a larger whole
|
C):: sparse-checkout is a behind-the-scenes implementation detail allowing
|
||||||
|
|
||||||
C) sparse-checkout is a behind-the-scenes implementation detail allowing
|
|
||||||
Git to work with a specially crafted in-house virtual file system;
|
Git to work with a specially crafted in-house virtual file system;
|
||||||
users are actually working with a "full" working tree that is
|
users are actually working with a "full" working tree that is
|
||||||
lazily populated, and sparse-checkout helps with the lazy population
|
lazily populated, and sparse-checkout helps with the lazy population
|
||||||
|
|
@ -136,7 +142,7 @@ usecases:
|
||||||
It may be worth explaining each of these in a bit more detail:
|
It may be worth explaining each of these in a bit more detail:
|
||||||
|
|
||||||
|
|
||||||
(Behavior A) Users are _only_ interested in the sparse portion of the repo
|
=== (Behavior A) Users are _only_ interested in the sparse portion of the repo
|
||||||
|
|
||||||
These folks might know there are other things in the repository, but
|
These folks might know there are other things in the repository, but
|
||||||
don't care. They are uninterested in other parts of the repository, and
|
don't care. They are uninterested in other parts of the repository, and
|
||||||
|
|
@ -163,8 +169,7 @@ side-effects of various other commands (such as the printed diffstat
|
||||||
after a merge or pull) can lead to worries about local repository size
|
after a merge or pull) can lead to worries about local repository size
|
||||||
growing unnecessarily[10].
|
growing unnecessarily[10].
|
||||||
|
|
||||||
(Behavior A*) Users are _only_ interested in the sparse portion of the repo
|
=== (Behavior A*) Users are _only_ interested in the sparse portion of the repo that they have downloaded so far (a variant on the first usecase)
|
||||||
that they have downloaded so far (a variant on the first usecase)
|
|
||||||
|
|
||||||
This variant is driven by folks who are using partial clones together with
|
This variant is driven by folks who are using partial clones together with
|
||||||
sparse checkouts and do disconnected development (so far sounding like a
|
sparse checkouts and do disconnected development (so far sounding like a
|
||||||
|
|
@ -173,15 +178,14 @@ reason for yet another variant is that downloading even just the blobs
|
||||||
through history within their sparse specification may be too much, so they
|
through history within their sparse specification may be too much, so they
|
||||||
only download some. They would still like operations to succeed without
|
only download some. They would still like operations to succeed without
|
||||||
network connectivity, though, so things like `git log -S${SEARCH_TERM} -p`
|
network connectivity, though, so things like `git log -S${SEARCH_TERM} -p`
|
||||||
or `git grep ${SEARCH_TERM} OLDREV ` would need to be prepared to provide
|
or `git grep ${SEARCH_TERM} OLDREV` would need to be prepared to provide
|
||||||
partial results that depend on what happens to have been downloaded.
|
partial results that depend on what happens to have been downloaded.
|
||||||
|
|
||||||
This variant could be viewed as Behavior A with the sparse specification
|
This variant could be viewed as Behavior A with the sparse specification
|
||||||
for history querying operations modified from "sparsity patterns" to
|
for history querying operations modified from "sparsity patterns" to
|
||||||
"sparsity patterns limited to the blobs we have already downloaded".
|
"sparsity patterns limited to the blobs we have already downloaded".
|
||||||
|
|
||||||
(Behavior B) Users want a sparse working tree, but are working in a
|
=== (Behavior B) Users want a sparse working tree, but are working in a larger whole
|
||||||
larger whole
|
|
||||||
|
|
||||||
Stolee described this usecase this way[11]:
|
Stolee described this usecase this way[11]:
|
||||||
|
|
||||||
|
|
@ -229,8 +233,7 @@ those expensive checks when interacting with the working copy, and may
|
||||||
prefer getting "unrelated" results from their history queries over having
|
prefer getting "unrelated" results from their history queries over having
|
||||||
slow commands.
|
slow commands.
|
||||||
|
|
||||||
(Behavior C) sparse-checkout is an implementational detail supporting a
|
=== (Behavior C) sparse-checkout is an implementational detail supporting a special VFS.
|
||||||
special VFS.
|
|
||||||
|
|
||||||
This usecase goes slightly against the traditional definition of
|
This usecase goes slightly against the traditional definition of
|
||||||
sparse-checkout in that it actually tries to present a full or dense
|
sparse-checkout in that it actually tries to present a full or dense
|
||||||
|
|
@ -255,13 +258,13 @@ will perceive the checkout as dense, and commands should thus behave as if
|
||||||
all files are present.
|
all files are present.
|
||||||
|
|
||||||
|
|
||||||
=== Usecases of primary concern ===
|
== Usecases of primary concern ==
|
||||||
|
|
||||||
Most of the rest of this document will focus on Behavior A and Behavior
|
Most of the rest of this document will focus on Behavior A and Behavior
|
||||||
B. Some notes about the other two cases and why we are not focusing on
|
B. Some notes about the other two cases and why we are not focusing on
|
||||||
them:
|
them:
|
||||||
|
|
||||||
(Behavior A*)
|
=== (Behavior A*)
|
||||||
|
|
||||||
Supporting this usecase is estimated to be difficult and a lot of work.
|
Supporting this usecase is estimated to be difficult and a lot of work.
|
||||||
There are no plans to implement it currently, but it may be a potential
|
There are no plans to implement it currently, but it may be a potential
|
||||||
|
|
@ -275,7 +278,7 @@ valid for this usecase, with the only exception being that it redefines the
|
||||||
sparse specification to restrict it to already-downloaded blobs. The hard
|
sparse specification to restrict it to already-downloaded blobs. The hard
|
||||||
part is in making commands capable of respecting that modified definition.
|
part is in making commands capable of respecting that modified definition.
|
||||||
|
|
||||||
(Behavior C)
|
=== (Behavior C)
|
||||||
|
|
||||||
This usecase violates some of the early sparse-checkout documented
|
This usecase violates some of the early sparse-checkout documented
|
||||||
assumptions (since files marked as SKIP_WORKTREE will be displayed to users
|
assumptions (since files marked as SKIP_WORKTREE will be displayed to users
|
||||||
|
|
@ -300,20 +303,20 @@ Behavior C do not assume they are part of the Behavior B camp and propose
|
||||||
patches that break things for the real Behavior B folks.
|
patches that break things for the real Behavior B folks.
|
||||||
|
|
||||||
|
|
||||||
=== Oversimplified mental models ===
|
== Oversimplified mental models ==
|
||||||
|
|
||||||
An oversimplification of the differences in the above behaviors is:
|
An oversimplification of the differences in the above behaviors is:
|
||||||
|
|
||||||
Behavior A: Restrict worktree and history operations to sparse specification
|
(Behavior A):: Restrict worktree and history operations to sparse specification
|
||||||
Behavior B: Restrict worktree operations to sparse specification; have any
|
(Behavior B):: Restrict worktree operations to sparse specification; have any
|
||||||
history operations work across all files
|
history operations work across all files
|
||||||
Behavior C: Do not restrict either worktree or history operations to the
|
(Behavior C):: Do not restrict either worktree or history operations to the
|
||||||
sparse specification...with the exception of branch checkouts or
|
sparse specification...with the exception of branch checkouts or
|
||||||
switches which avoid writing files that will match the index so
|
switches which avoid writing files that will match the index so
|
||||||
they can later lazily be populated instead.
|
they can later lazily be populated instead.
|
||||||
|
|
||||||
|
|
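The Behavior B model above (worktree restricted, history operations spanning all files) can be observed in a throwaway repository. This is a minimal sketch, assuming a Git new enough to provide `git sparse-checkout` in cone mode; the `src/` and `docs/` layout and the search term `needle` are made up for illustration.

```shell
#!/bin/sh
set -e

# Throwaway repository with one file inside and one outside the future cone.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir src docs
echo "needle in src" > src/main.c
echo "needle in docs" > docs/guide.txt
git add .
git -c user.name=Example -c user.email=example@example.com commit -q -m "initial"

# Restrict the working tree to src/ (cone mode also keeps top-level files).
git sparse-checkout init --cone
git sparse-checkout set src

# Worktree view: docs/ is gone from disk, but its index entry remains,
# flagged SKIP_WORKTREE (shown as "S" by ls-files -t).
ls
git ls-files -t

# History view: searching a revision still spans the full tree.
git grep needle HEAD
```

Under Behavior A, the last command would be expected to report only the match under `src/`.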
||||||
=== Desired behavior ===
|
== Desired behavior ==
|
||||||
|
|
||||||
As noted previously, despite the simple idea of just working with a subset
|
As noted previously, despite the simple idea of just working with a subset
|
||||||
of files, there are a range of different behavioral changes that need to be
|
of files, there are a range of different behavioral changes that need to be
|
||||||
|
|
@ -326,37 +329,38 @@ understanding these differences can be beneficial.
|
||||||
|
|
||||||
* Commands behaving the same regardless of high-level use-case
|
* Commands behaving the same regardless of high-level use-case
|
||||||
|
|
||||||
* commands that only look at files within the sparsity specification
|
** commands that only look at files within the sparsity specification
|
||||||
|
|
||||||
* diff (without --cached or REVISION arguments)
|
*** diff (without --cached or REVISION arguments)
|
||||||
* grep (without --cached or REVISION arguments)
|
*** grep (without --cached or REVISION arguments)
|
||||||
* diff-files
|
*** diff-files
|
||||||
|
|
||||||
* commands that restore files to the working tree that match sparsity
|
** commands that restore files to the working tree that match sparsity
|
||||||
patterns, and remove unmodified files that don't match those
|
patterns, and remove unmodified files that don't match those
|
||||||
patterns:
|
patterns:
|
||||||
|
|
||||||
* switch
|
*** switch
|
||||||
* checkout (the switch-like half)
|
*** checkout (the switch-like half)
|
||||||
* read-tree
|
*** read-tree
|
||||||
* reset --hard
|
*** reset --hard
|
||||||
|
|
||||||
* commands that write conflicted files to the working tree, but otherwise
|
** commands that write conflicted files to the working tree, but otherwise
|
||||||
will omit writing files to the working tree that do not match the
|
will omit writing files to the working tree that do not match the
|
||||||
sparsity patterns:
|
sparsity patterns:
|
||||||
|
|
||||||
* merge
|
*** merge
|
||||||
* rebase
|
*** rebase
|
||||||
* cherry-pick
|
*** cherry-pick
|
||||||
* revert
|
*** revert
|
||||||
|
|
||||||
* `am` and `apply --cached` should probably be in this section but
|
*** `am` and `apply --cached` should probably be in this section but
|
||||||
are buggy (see the "Known bugs" section below)
|
are buggy (see the "Known bugs" section below)
|
||||||
|
|
||||||
The behavior for these commands somewhat depends upon the merge
|
The behavior for these commands somewhat depends upon the merge
|
||||||
strategy being used:
|
strategy being used:
|
||||||
* `ort` behaves as described above
|
|
||||||
* `octopus` and `resolve` will always vivify any file changed in the merge
|
*** `ort` behaves as described above
|
||||||
|
*** `octopus` and `resolve` will always vivify any file changed in the merge
|
||||||
relative to the first parent, which is rather suboptimal.
|
relative to the first parent, which is rather suboptimal.
|
||||||
|
|
||||||
It is also important to note that these commands WILL update the index
|
It is also important to note that these commands WILL update the index
|
||||||
|
|
@ -372,21 +376,21 @@ understanding these differences can be beneficial.
|
||||||
specification and the sparsity patterns (much like the commands in the
|
specification and the sparsity patterns (much like the commands in the
|
||||||
previous section).
|
previous section).
|
||||||
|
|
||||||
* commands that always ignore sparsity since commits must be full-tree
|
** commands that always ignore sparsity since commits must be full-tree
|
||||||
|
|
||||||
* archive
|
*** archive
|
||||||
* bundle
|
*** bundle
|
||||||
* commit
|
*** commit
|
||||||
* format-patch
|
*** format-patch
|
||||||
* fast-export
|
*** fast-export
|
||||||
* fast-import
|
*** fast-import
|
||||||
* commit-tree
|
*** commit-tree
|
||||||
|
|
||||||
* commands that write any modified file to the working tree (conflicted
|
** commands that write any modified file to the working tree (conflicted
|
||||||
or not, and whether those paths match sparsity patterns or not):
|
or not, and whether those paths match sparsity patterns or not):
|
||||||
|
|
||||||
* stash
|
*** stash
|
||||||
* apply (without `--index` or `--cached`)
|
*** apply (without `--index` or `--cached`)
|
||||||
|
|
||||||
* Commands that may slightly differ for behavior A vs. behavior B:
|
* Commands that may slightly differ for behavior A vs. behavior B:
|
||||||
|
|
||||||
|
|
@ -394,19 +398,20 @@ understanding these differences can be beneficial.
|
||||||
behaviors, but may differ in verbosity and types of warning and error
|
behaviors, but may differ in verbosity and types of warning and error
|
||||||
messages.
|
messages.
|
||||||
|
|
||||||
* commands that make modifications to which files are tracked:
|
** commands that make modifications to which files are tracked:
|
||||||
* add
|
|
||||||
* rm
|
*** add
|
||||||
* mv
|
*** rm
|
||||||
* update-index
|
*** mv
|
||||||
|
*** update-index
|
||||||
|
|
||||||
The fact that files can move between the 'tracked' and 'untracked'
|
The fact that files can move between the 'tracked' and 'untracked'
|
||||||
categories means some commands will have to treat untracked files
|
categories means some commands will have to treat untracked files
|
||||||
differently. But if we have to treat untracked files differently,
|
differently. But if we have to treat untracked files differently,
|
||||||
then additional commands may also need changes:
|
then additional commands may also need changes:
|
||||||
|
|
||||||
* status
|
*** status
|
||||||
* clean
|
*** clean
|
||||||
|
|
||||||
In particular, `status` may need to report any untracked files outside
|
In particular, `status` may need to report any untracked files outside
|
||||||
the sparsity specification as an erroneous condition (especially to
|
the sparsity specification as an erroneous condition (especially to
|
||||||
|
|
@ -420,9 +425,10 @@ understanding these differences can be beneficial.
|
||||||
may need to ignore the sparse specification by its nature. Also, its
|
may need to ignore the sparse specification by its nature. Also, its
|
||||||
current --[no-]ignore-skip-worktree-entries default is totally bogus.
|
current --[no-]ignore-skip-worktree-entries default is totally bogus.
|
||||||
|
|
||||||
* commands for manually tweaking paths in both the index and the working tree
|
** commands for manually tweaking paths in both the index and the working tree
|
||||||
* `restore`
|
|
||||||
* the restore-like half of `checkout`
|
*** `restore`
|
||||||
|
*** the restore-like half of `checkout`
|
||||||
|
|
||||||
These commands should be similar to add/rm/mv in that they should
|
These commands should be similar to add/rm/mv in that they should
|
||||||
only operate on the sparse specification by default, and require a
|
only operate on the sparse specification by default, and require a
|
||||||
|
|
@ -433,18 +439,19 @@ understanding these differences can be beneficial.
|
||||||
|
|
||||||
* Commands that significantly differ for behavior A vs. behavior B:
|
* Commands that significantly differ for behavior A vs. behavior B:
|
||||||
|
|
||||||
* commands that query history
|
** commands that query history
|
||||||
* diff (with --cached or REVISION arguments)
|
|
||||||
* grep (with --cached or REVISION arguments)
|
*** diff (with --cached or REVISION arguments)
|
||||||
* show (when given commit arguments)
|
*** grep (with --cached or REVISION arguments)
|
||||||
* blame (only matters when one or more -C flags are passed)
|
*** show (when given commit arguments)
|
||||||
* and annotate
|
*** blame (only matters when one or more -C flags are passed)
|
||||||
* log
|
**** and annotate
|
||||||
* whatchanged (may not exist anymore)
|
*** log
|
||||||
* ls-files
|
*** whatchanged (may not exist anymore)
|
||||||
* diff-index
|
*** ls-files
|
||||||
* diff-tree
|
*** diff-index
|
||||||
* ls-tree
|
*** diff-tree
|
||||||
|
*** ls-tree
|
||||||
|
|
||||||
Note: for log and whatchanged, revision walking logic is unaffected
|
Note: for log and whatchanged, revision walking logic is unaffected
|
||||||
but displaying of patches is affected by scoping the command to the
|
but displaying of patches is affected by scoping the command to the
|
||||||
|
|
@ -458,91 +465,91 @@ understanding these differences can be beneficial.
|
||||||
|
|
||||||
* Commands I don't know how to classify
|
* Commands I don't know how to classify
|
||||||
|
|
||||||
* range-diff
|
** range-diff
|
||||||
|
|
||||||
Is this like `log` or `format-patch`?
|
Is this like `log` or `format-patch`?
|
||||||
|
|
||||||
* cherry
|
** cherry
|
||||||
|
|
||||||
See range-diff
|
See range-diff
|
||||||
|
|
||||||
* Commands unaffected by sparse-checkouts
|
* Commands unaffected by sparse-checkouts
|
||||||
|
|
||||||
* shortlog
|
** shortlog
|
||||||
* show-branch
|
** show-branch
|
||||||
* rev-list
|
** rev-list
|
||||||
* bisect
|
** bisect
|
||||||
|
|
||||||
* branch
|
** branch
|
||||||
* describe
|
** describe
|
||||||
* fetch
|
** fetch
|
||||||
* gc
|
** gc
|
||||||
* init
|
** init
|
||||||
* maintenance
|
** maintenance
|
||||||
* notes
|
** notes
|
||||||
* pull (merge & rebase have the necessary changes)
|
** pull (merge & rebase have the necessary changes)
|
||||||
* push
|
** push
|
||||||
* submodule
|
** submodule
|
||||||
* tag
|
** tag
|
||||||
|
|
||||||
* config
|
** config
|
||||||
* filter-branch (works in separate checkout without sparse-checkout setup)
|
** filter-branch (works in separate checkout without sparse-checkout setup)
|
||||||
* pack-refs
|
** pack-refs
|
||||||
* prune
|
** prune
|
||||||
* remote
|
** remote
|
||||||
* repack
|
** repack
|
||||||
* replace
|
** replace
|
||||||
|
|
||||||
* bugreport
|
** bugreport
|
||||||
* count-objects
|
** count-objects
|
||||||
* fsck
|
** fsck
|
||||||
* gitweb
|
** gitweb
|
||||||
* help
|
** help
|
||||||
* instaweb
|
** instaweb
|
||||||
* merge-tree (doesn't touch worktree or index, and merges always compute full-tree)
|
** merge-tree (doesn't touch worktree or index, and merges always compute full-tree)
|
||||||
* rerere
|
** rerere
|
||||||
* verify-commit
|
** verify-commit
|
||||||
* verify-tag
|
** verify-tag
|
||||||
|
|
||||||
* commit-graph
|
** commit-graph
|
||||||
* hash-object
|
** hash-object
|
||||||
* index-pack
|
** index-pack
|
||||||
* mktag
|
** mktag
|
||||||
* mktree
|
** mktree
|
||||||
* multi-pack-index
|
** multi-pack-index
|
||||||
* pack-objects
|
** pack-objects
|
||||||
* prune-packed
|
** prune-packed
|
||||||
* symbolic-ref
|
** symbolic-ref
|
||||||
* unpack-objects
|
** unpack-objects
|
||||||
* update-ref
|
** update-ref
|
||||||
* write-tree (operates on index, possibly optimized to use sparse dir entries)
|
** write-tree (operates on index, possibly optimized to use sparse dir entries)
|
||||||
|
|
||||||
* for-each-ref
|
** for-each-ref
|
||||||
* get-tar-commit-id
|
** get-tar-commit-id
|
||||||
* ls-remote
|
** ls-remote
|
||||||
* merge-base (merges are computed full tree, so merge base should be too)
|
** merge-base (merges are computed full tree, so merge base should be too)
|
||||||
* name-rev
|
** name-rev
|
||||||
* pack-redundant
|
** pack-redundant
|
||||||
* rev-parse
|
** rev-parse
|
||||||
* show-index
|
** show-index
|
||||||
* show-ref
|
** show-ref
|
||||||
* unpack-file
|
** unpack-file
|
||||||
* var
|
** var
|
||||||
* verify-pack
|
** verify-pack
|
||||||
|
|
||||||
* <Everything under 'Interacting with Others' in 'git help --all'>
|
** <Everything under 'Interacting with Others' in 'git help --all'>
|
||||||
* <Everything under 'Low-level...Syncing' in 'git help --all'>
|
** <Everything under 'Low-level...Syncing' in 'git help --all'>
|
||||||
* <Everything under 'Low-level...Internal Helpers' in 'git help --all'>
|
** <Everything under 'Low-level...Internal Helpers' in 'git help --all'>
|
||||||
* <Everything under 'External commands' in 'git help --all'>
|
** <Everything under 'External commands' in 'git help --all'>
|
||||||
|
|
||||||
* Commands that might be affected, but who cares?
|
* Commands that might be affected, but who cares?
|
||||||
|
|
||||||
* merge-file
|
** merge-file
|
||||||
* merge-index
|
** merge-index
|
||||||
* gitk?
|
** gitk?
|
||||||
|
|
||||||
|
|
||||||
=== Behavior classes ===
|
== Behavior classes ==
|
||||||
|
|
||||||
From the above there are a few classes of behavior:
|
From the above there are a few classes of behavior:
|
||||||
|
|
||||||
|
|
@ -573,6 +580,7 @@ From the above there are a few classes of behavior:
|
||||||
|
|
||||||
Commands in this class generally behave like the "restrict" class,
|
Commands in this class generally behave like the "restrict" class,
|
||||||
except that:
|
except that:
|
||||||
|
|
||||||
(1) they will ignore the sparse specification and write files with
|
(1) they will ignore the sparse specification and write files with
|
||||||
conflicts to the working tree (thus temporarily expanding the
|
conflicts to the working tree (thus temporarily expanding the
|
||||||
sparse specification to include such files.)
|
sparse specification to include such files.)
|
||||||
|
|
@ -609,37 +617,39 @@ From the above there are a few classes of behavior:
|
||||||
specification.
|
specification.
|
||||||
|
|
||||||
|
|
||||||
=== Subcommand-dependent defaults ===
|
== Subcommand-dependent defaults ==
|
||||||
|
|
||||||
Note that we have different defaults depending on the command for the
|
Note that we have different defaults depending on the command for the
|
||||||
desired behavior:
|
desired behavior:
|
||||||
|
|
||||||
* Commands defaulting to "restrict":
|
* Commands defaulting to "restrict":
|
||||||
* diff-files
|
|
||||||
* diff (without --cached or REVISION arguments)
|
|
||||||
* grep (without --cached or REVISION arguments)
|
|
||||||
* switch
|
|
||||||
* checkout (the switch-like half)
|
|
||||||
* reset (<commit>)
|
|
||||||
|
|
||||||
* restore
|
** diff-files
|
||||||
* checkout (the restore-like half)
|
** diff (without --cached or REVISION arguments)
|
||||||
* checkout-index
|
** grep (without --cached or REVISION arguments)
|
||||||
* reset (with pathspec)
|
** switch
|
||||||
|
** checkout (the switch-like half)
|
||||||
|
** reset (<commit>)
|
||||||
|
|
||||||
|
** restore
|
||||||
|
** checkout (the restore-like half)
|
||||||
|
** checkout-index
|
||||||
|
** reset (with pathspec)
|
||||||
|
|
||||||
This behavior makes sense; these interact with the working tree.
|
This behavior makes sense; these interact with the working tree.
|
||||||
|
|
||||||
* Commands defaulting to "restrict modulo conflicts":
|
* Commands defaulting to "restrict modulo conflicts":
|
||||||
* merge
|
|
||||||
* rebase
|
|
||||||
* cherry-pick
|
|
||||||
* revert
|
|
||||||
|
|
||||||
* am
|
** merge
|
||||||
* apply --index (which is kind of like an `am --no-commit`)
|
** rebase
|
||||||
|
** cherry-pick
|
||||||
|
** revert
|
||||||
|
|
||||||
* read-tree (especially with -m or -u; is kind of like a --no-commit merge)
|
** am
|
||||||
* reset (<tree-ish>, due to similarity to read-tree)
|
** apply --index (which is kind of like an `am --no-commit`)
|
||||||
|
|
||||||
|
** read-tree (especially with -m or -u; is kind of like a --no-commit merge)
|
||||||
|
** reset (<tree-ish>, due to similarity to read-tree)
|
||||||
|
|
||||||
These also interact with the working tree, but require slightly
|
These also interact with the working tree, but require slightly
|
||||||
different behavior either so that (a) conflicts can be resolved or (b)
|
different behavior either so that (a) conflicts can be resolved or (b)
|
||||||
|
|
@ -648,16 +658,17 @@ desired behavior :
|
||||||
(See also the "Known bugs" section below regarding `am` and `apply`)
|
(See also the "Known bugs" section below regarding `am` and `apply`)
|
||||||
|
|
||||||
* Commands defaulting to "no restrict":
|
* Commands defaulting to "no restrict":
|
||||||
* archive
|
|
||||||
* bundle
|
|
||||||
* commit
|
|
||||||
* format-patch
|
|
||||||
* fast-export
|
|
||||||
* fast-import
|
|
||||||
* commit-tree
|
|
||||||
|
|
||||||
* stash
|
** archive
|
||||||
* apply (without `--index`)
|
** bundle
|
||||||
|
** commit
|
||||||
|
** format-patch
|
||||||
|
** fast-export
|
||||||
|
** fast-import
|
||||||
|
** commit-tree
|
||||||
|
|
||||||
|
** stash
|
||||||
|
** apply (without `--index`)
|
||||||
|
|
||||||
These have completely different defaults and perhaps deserve the most
|
These have completely different defaults and perhaps deserve the most
|
||||||
detailed explanation:
|
detailed explanation:
|
||||||
|
|
@ -679,15 +690,18 @@ desired behavior :
|
||||||
sparse specification then we'll lose changes from the user.
|
sparse specification then we'll lose changes from the user.
|
||||||
|
|
||||||
* Commands defaulting to "restrict also specially applied to untracked files":
|
* Commands defaulting to "restrict also specially applied to untracked files":
|
||||||
* add
|
|
||||||
* rm
|
|
||||||
* mv
|
|
||||||
* update-index
|
|
||||||
* status
|
|
||||||
* clean (?)
|
|
||||||
|
|
||||||
|
** add
|
||||||
|
** rm
|
||||||
|
** mv
|
||||||
|
** update-index
|
||||||
|
** status
|
||||||
|
** clean (?)
|
||||||
|
|
||||||
|
....
|
||||||
Our original implementation for the first three of these commands was
|
Our original implementation for the first three of these commands was
|
||||||
"no restrict", but it had some severe usability issues:
|
"no restrict", but it had some severe usability issues:
|
||||||
|
|
||||||
* `git add <somefile>` if honored and outside the sparse
|
* `git add <somefile>` if honored and outside the sparse
|
||||||
specification, can result in the file randomly disappearing later
|
specification, can result in the file randomly disappearing later
|
||||||
when some subsequent command is run (since various commands
|
when some subsequent command is run (since various commands
|
||||||
|
|
@ -701,8 +715,10 @@ desired behavior :
|
||||||
So, we switched `add` and `rm` to default to "restrict", which made
|
So, we switched `add` and `rm` to default to "restrict", which made
|
||||||
usability problems much less severe and less frequent, but we still got
|
usability problems much less severe and less frequent, but we still got
|
||||||
complaints because commands like:
|
complaints because commands like:
|
||||||
|
|
||||||
git add <file-outside-sparse-specification>
|
git add <file-outside-sparse-specification>
|
||||||
git rm <file-outside-sparse-specification>
|
git rm <file-outside-sparse-specification>
|
||||||
|
|
||||||
would silently do nothing. We should instead print an error in those
|
would silently do nothing. We should instead print an error in those
|
||||||
cases to get usability right.
|
cases to get usability right.
|
||||||
|
|
||||||
|
|
@ -711,21 +727,22 @@ desired behavior :
|
||||||
|
|
||||||
There may be a difference in here between behavior A and behavior B in
|
There may be a difference in here between behavior A and behavior B in
|
||||||
terms of verboseness of errors or additional warnings.
|
terms of verboseness of errors or additional warnings.
|
||||||
|
....
|
||||||
|
|
||||||
* Commands falling under "restrict or no restrict dependent upon behavior
|
* Commands falling under "restrict or no restrict dependent upon behavior
|
||||||
A vs. behavior B"
|
A vs. behavior B"
|
||||||
|
|
||||||
* diff (with --cached or REVISION arguments)
|
** diff (with --cached or REVISION arguments)
|
||||||
* grep (with --cached or REVISION arguments)
|
** grep (with --cached or REVISION arguments)
|
||||||
* show (when given commit arguments)
|
** show (when given commit arguments)
|
||||||
* blame (only matters when one or more -C flags passed)
|
** blame (only matters when one or more -C flags passed)
|
||||||
* and annotate
|
*** and annotate
|
||||||
* log
|
** log
|
||||||
* and variants: shortlog, gitk, show-branch, whatchanged, rev-list
|
*** and variants: shortlog, gitk, show-branch, whatchanged, rev-list
|
||||||
* ls-files
|
** ls-files
|
||||||
* diff-index
|
** diff-index
|
||||||
* diff-tree
|
** diff-tree
|
||||||
* ls-tree
|
** ls-tree
|
||||||
|
|
||||||
For now, we default to behavior B for these, which want a default of
|
For now, we default to behavior B for these, which want a default of
|
||||||
"no restrict".
|
"no restrict".
|
||||||
|
|
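Because these history queries currently default to "no restrict", one way to approximate the Behavior A view today is to pass the cone directories as pathspecs. This is only a sketch of a workaround, not the proposed `--scope=sparse` mechanism: it assumes cone mode and a made-up `src/` and `docs/` layout, and it ignores the top-level files that cone mode also includes.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name Example
git config user.email example@example.com
mkdir src docs
echo code > src/main.c
echo text > docs/guide.txt
git add .
git commit -q -m "add src and docs"
echo more >> docs/guide.txt
git commit -q -am "touch docs only"

git sparse-checkout init --cone
git sparse-checkout set src

# Default: every commit is listed, including the docs-only one.
git log --oneline

# Approximation of a sparse-scoped query: limit to the cone directories.
# The unquoted substitution is deliberate so that each directory becomes
# one pathspec; it would break on directory names containing whitespace.
git log --oneline -- $(git sparse-checkout list)
```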
@ -749,7 +766,7 @@ desired behavior :
|
||||||
implemented.
|
implemented.
|
||||||
|
|
||||||
|
|
||||||
=== Sparse specification vs. sparsity patterns ===
|
== Sparse specification vs. sparsity patterns ==
|
||||||
|
|
||||||
In a well-behaved situation, the sparse specification is given directly
|
In a well-behaved situation, the sparse specification is given directly
|
||||||
by the $GIT_DIR/info/sparse-checkout file. However, it can transiently
|
by the $GIT_DIR/info/sparse-checkout file. However, it can transiently
|
||||||
|
|
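In the well-behaved case, the patterns that `git sparse-checkout` writes to `$GIT_DIR/info/sparse-checkout` can be inspected directly. A minimal sketch, assuming cone mode and a made-up `src/` and `docs/` layout:

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir src docs
echo code > src/main.c
echo text > docs/guide.txt
git add .
git -c user.name=Example -c user.email=example@example.com commit -q -m "initial"

git sparse-checkout init --cone
git sparse-checkout set src

# The directories the user asked for.
git sparse-checkout list

# The patterns Git actually stored; --git-path resolves $GIT_DIR for us.
cat "$(git rev-parse --git-path info/sparse-checkout)"
```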
@ -821,45 +838,48 @@ under behavior B index operations are lumped with history and tend to
|
||||||
operate full-tree.
|
operate full-tree.
|
||||||
|
|
||||||
|
|
||||||
=== Implementation Questions ===
|
== Implementation Questions ==
|
||||||
|
|
||||||
* Do the options --scope={sparse,all} sound good to others? Are there better
|
* Do the options --scope={sparse,all} sound good to others? Are there better options?
|
||||||
options?
|
|
||||||
* Names in use, or appearing in patches, or previously suggested:
|
** Names in use, or appearing in patches, or previously suggested:
|
||||||
* --sparse/--dense
|
|
||||||
* --ignore-skip-worktree-bits
|
*** --sparse/--dense
|
||||||
* --ignore-skip-worktree-entries
|
*** --ignore-skip-worktree-bits
|
||||||
* --ignore-sparsity
|
*** --ignore-skip-worktree-entries
|
||||||
* --[no-]restrict-to-sparse-paths
|
*** --ignore-sparsity
|
||||||
* --full-tree/--sparse-tree
|
*** --[no-]restrict-to-sparse-paths
|
||||||
* --[no-]restrict
|
*** --full-tree/--sparse-tree
|
||||||
* --scope={sparse,all}
|
*** --[no-]restrict
|
||||||
* --focus/--unfocus
|
*** --scope={sparse,all}
|
||||||
* --limit/--unlimited
|
*** --focus/--unfocus
|
||||||
* Rationale making me lean slightly towards --scope={sparse,all}:
|
*** --limit/--unlimited
|
||||||
* We want a name that works for many commands, so we need a name that
|
|
||||||
|
** Rationale making me lean slightly towards --scope={sparse,all}:
|
||||||
|
|
||||||
|
*** We want a name that works for many commands, so we need a name that
|
||||||
does not conflict
|
does not conflict
|
||||||
* We know that we have more than two possible usecases, so it is best
|
*** We know that we have more than two possible usecases, so it is best
|
||||||
to avoid a flag that appears to be binary.
|
to avoid a flag that appears to be binary.
|
||||||
* --scope={sparse,all} isn't overly long and seems relatively
|
*** --scope={sparse,all} isn't overly long and seems relatively
|
||||||
explanatory
|
explanatory
|
||||||
* `--sparse`, as used in add/rm/mv, is totally backwards for
|
*** `--sparse`, as used in add/rm/mv, is totally backwards for
|
||||||
grep/log/etc. Changing the meaning of `--sparse` for these
|
grep/log/etc. Changing the meaning of `--sparse` for these
|
||||||
commands would fix the backwardness, but possibly break existing
|
commands would fix the backwardness, but possibly break existing
|
||||||
scripts. Using a new name pairing would allow us to treat
|
scripts. Using a new name pairing would allow us to treat
|
||||||
`--sparse` in these commands as a deprecated alias.
|
`--sparse` in these commands as a deprecated alias.
|
||||||
* There is a different `--sparse`/`--dense` pair for commands using
|
*** There is a different `--sparse`/`--dense` pair for commands using
|
||||||
revision machinery, so using that naming might cause confusion
|
revision machinery, so using that naming might cause confusion
|
||||||
* There is also a `--sparse` in both pack-objects and show-branch, which
|
*** There is also a `--sparse` in both pack-objects and show-branch, which
|
||||||
don't conflict but do suggest that `--sparse` is overloaded
|
don't conflict but do suggest that `--sparse` is overloaded
|
||||||
* The name --ignore-skip-worktree-bits is a double negative, is
|
*** The name --ignore-skip-worktree-bits is a double negative, is
|
||||||
quite a mouthful, refers to an implementation detail that many
|
quite a mouthful, refers to an implementation detail that many
|
||||||
users may not be familiar with, and we'd need a negation for it
|
users may not be familiar with, and we'd need a negation for it
|
||||||
which would probably be even more ridiculously long. (But we
|
which would probably be even more ridiculously long. (But we
|
||||||
can make --ignore-skip-worktree-bits a deprecated alias for
|
can make --ignore-skip-worktree-bits a deprecated alias for
|
||||||
--no-restrict.)
|
--no-restrict.)
|
||||||
|
|
||||||
* If a config option is added (sparse.scope?) what should the values and
|
** If a config option is added (sparse.scope?) what should the values and
|
||||||
description be? "sparse" (behavior A), "worktree-sparse-history-dense"
|
description be? "sparse" (behavior A), "worktree-sparse-history-dense"
|
||||||
(behavior B), "dense" (behavior C)? There's a risk of confusion,
|
(behavior B), "dense" (behavior C)? There's a risk of confusion,
|
||||||
because even for Behaviors A and B we want some commands to be
|
because even for Behaviors A and B we want some commands to be
|
||||||
|
|
@ -868,19 +888,20 @@ operate full-tree.
|
||||||
the primary difference we are focusing on is just the history-querying
|
the primary difference we are focusing on is just the history-querying
|
||||||
commands (log/diff/grep). Previous config suggestion here: [13]
|
commands (log/diff/grep). Previous config suggestion here: [13]
|
||||||
|
|
||||||
* Is `--no-expand` a good alias for ls-files's `--sparse` option?
|
** Is `--no-expand` a good alias for ls-files's `--sparse` option?
|
||||||
(`--sparse` does not map to either `--scope=sparse` or `--scope=all`,
|
(`--sparse` does not map to either `--scope=sparse` or `--scope=all`,
|
||||||
because in non-cone mode it does nothing and in cone-mode it shows the
|
because in non-cone mode it does nothing and in cone-mode it shows the
|
||||||
sparse directory entries which are technically outside the sparse
|
sparse directory entries which are technically outside the sparse
|
||||||
specification)
|
specification)
|
||||||
|
|
||||||
* Under Behavior A:
|
** Under Behavior A:
|
||||||
* Does ls-files' `--no-expand` override the default `--scope=all`, or
|
|
||||||
does it need an extra flag?
|
|
||||||
* Does ls-files' `-t` option imply `--scope=all`?
|
|
||||||
* Does update-index's `--[no-]skip-worktree` option imply `--scope=all`?
|
|
||||||
|
|
||||||
* sparse-checkout: once behavior A is fully implemented, should we take
|
*** Does ls-files' `--no-expand` override the default `--scope=all`, or
|
||||||
|
does it need an extra flag?
|
||||||
|
*** Does ls-files' `-t` option imply `--scope=all`?
|
||||||
|
*** Does update-index's `--[no-]skip-worktree` option imply `--scope=all`?
|
||||||
|
|
||||||
|
** sparse-checkout: once behavior A is fully implemented, should we take
|
||||||
an interim measure to ease people into switching the default? Namely,
|
an interim measure to ease people into switching the default? Namely,
|
||||||
if folks are not already in a sparse checkout, then require
|
if folks are not already in a sparse checkout, then require
|
||||||
`sparse-checkout init/set` to take a
|
`sparse-checkout init/set` to take a
|
||||||
|
|
@ -892,7 +913,7 @@ operate full-tree.
|
||||||
is seamless for them.
|
is seamless for them.
|
||||||
|
|
||||||
|
|
||||||
=== Implementation Goals/Plans ===
|
== Implementation Goals/Plans ==
|
||||||
|
|
||||||
* Get buy-in on this document in general.
|
* Get buy-in on this document in general.
|
||||||
|
|
||||||
|
|
@ -910,25 +931,26 @@ operate full-tree.
|
||||||
request that they not trigger this bug." flag
|
request that they not trigger this bug." flag
|
||||||
|
|
||||||
* Flags & Config
|
* Flags & Config
|
||||||
* Make `--sparse` in add/rm/mv a deprecated alias for `--scope=all`
|
|
||||||
* Make `--ignore-skip-worktree-bits` in checkout-index/checkout/restore
|
** Make `--sparse` in add/rm/mv a deprecated alias for `--scope=all`
|
||||||
|
** Make `--ignore-skip-worktree-bits` in checkout-index/checkout/restore
|
||||||
a deprecated aliases for `--scope=all`
|
a deprecated aliases for `--scope=all`
|
||||||
* Create config option (sparse.scope?), tie it to the "Cliff notes"
|
** Create config option (sparse.scope?), tie it to the "Cliff notes"
|
||||||
overview
|
overview
|
||||||
|
|
||||||
* Add --scope=sparse (and --scope=all) flag to each of the history querying
|
** Add --scope=sparse (and --scope=all) flag to each of the history querying
|
||||||
commands. IMPORTANT: make sure diff machinery changes don't mess with
|
commands. IMPORTANT: make sure diff machinery changes don't mess with
|
||||||
format-patch, fast-export, etc.
|
format-patch, fast-export, etc.
|
||||||
|
|
||||||
=== Known bugs ===
|
== Known bugs ==
|
||||||
|
|
||||||
This list used to be a lot longer (see e.g. [1,2,3,4,5,6,7,8,9]), but we've
|
This list used to be a lot longer (see e.g. [1,2,3,4,5,6,7,8,9]), but we've
|
||||||
been working on it.
|
been working on it.
|
||||||
|
|
||||||
0. Behavior A is not well supported in Git. (Behavior B didn't used to
|
1. Behavior A is not well supported in Git. (Behavior B didn't used to
|
||||||
be either, but was the easier of the two to implement.)
|
be either, but was the easier of the two to implement.)
|
||||||
|
|
||||||
1. am and apply:
|
2. am and apply:
|
||||||
|
|
||||||
apply, without `--index` or `--cached`, relies on files being present
|
apply, without `--index` or `--cached`, relies on files being present
|
||||||
in the working copy, and also writes to them unconditionally. As
|
in the working copy, and also writes to them unconditionally. As
|
||||||
|
|
@ -948,7 +970,7 @@ been working on it.
|
||||||
files and then complain that those vivified files would be
|
files and then complain that those vivified files would be
|
||||||
overwritten by merge.
|
overwritten by merge.
|
||||||
|
|
||||||
2. reset --hard:
|
3. reset --hard:
|
||||||
|
|
||||||
reset --hard provides confusing error message (works correctly, but
|
reset --hard provides confusing error message (works correctly, but
|
||||||
misleads the user into believing it didn't):
|
misleads the user into believing it didn't):
|
||||||
|
|
@ -971,13 +993,13 @@ been working on it.
|
||||||
`git reset --hard` DID remove addme from the index and the working tree, contrary
|
`git reset --hard` DID remove addme from the index and the working tree, contrary
|
||||||
to the error message, but in line with how reset --hard should behave.
|
to the error message, but in line with how reset --hard should behave.
|
||||||
|
|
||||||
3. read-tree
|
4. read-tree
|
||||||
|
|
||||||
`read-tree` doesn't apply the 'SKIP_WORKTREE' bit to *any* of the
|
`read-tree` doesn't apply the 'SKIP_WORKTREE' bit to *any* of the
|
||||||
entries it reads into the index, resulting in all your files suddenly
|
entries it reads into the index, resulting in all your files suddenly
|
||||||
appearing to be "deleted".
|
appearing to be "deleted".
|
||||||
|
|
||||||
4. Checkout, restore:
|
5. Checkout, restore:
|
||||||
|
|
||||||
These command do not handle path & revision arguments appropriately:
|
These command do not handle path & revision arguments appropriately:
|
||||||
|
|
||||||
|
|
@ -1030,7 +1052,7 @@ been working on it.
|
||||||
S tracked
|
S tracked
|
||||||
H tracked-but-maybe-skipped
|
H tracked-but-maybe-skipped
|
||||||
|
|
||||||
5. checkout and restore --staged, continued:
|
6. checkout and restore --staged, continued:
|
||||||
|
|
||||||
These commands do not correctly scope operations to the sparse
|
These commands do not correctly scope operations to the sparse
|
||||||
specification, and make it worse by not setting important SKIP_WORKTREE
|
specification, and make it worse by not setting important SKIP_WORKTREE
|
||||||
|
|
@ -1046,56 +1068,82 @@ been working on it.
|
||||||
the sparse specification, but then it will be important to set the
|
the sparse specification, but then it will be important to set the
|
||||||
SKIP_WORKTREE bits appropriately.
|
SKIP_WORKTREE bits appropriately.
|
||||||
|
|
||||||
6. Performance issues; see:
|
7. Performance issues; see:
|
||||||
|
|
||||||
https://lore.kernel.org/git/CABPp-BEkJQoKZsQGCYioyga_uoDQ6iBeW+FKr8JhyuuTMK1RDw@mail.gmail.com/
|
https://lore.kernel.org/git/CABPp-BEkJQoKZsQGCYioyga_uoDQ6iBeW+FKr8JhyuuTMK1RDw@mail.gmail.com/
|
||||||
|
|
||||||
|
|
||||||
=== Reference Emails ===
|
== Reference Emails ==
|
||||||
|
|
||||||
Emails that detail various bugs we've had in sparse-checkout:
|
Emails that detail various bugs we've had in sparse-checkout:
|
||||||
|
|
||||||
[1] (Original descriptions of behavior A & behavior B)
|
[1] (Original descriptions of behavior A & behavior B):
|
||||||
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
|
|
||||||
[2] (Fix stash applications in sparse checkouts; bugs from behavioral differences)
|
|
||||||
https://lore.kernel.org/git/ccfedc7140dbf63ba26a15f93bd3885180b26517.1606861519.git.gitgitgadget@gmail.com/
|
|
||||||
[3] (Present-despite-skipped entries)
|
|
||||||
https://lore.kernel.org/git/11d46a399d26c913787b704d2b7169cafc28d639.1642175983.git.gitgitgadget@gmail.com/
|
|
||||||
[4] (Clone --no-checkout interaction)
|
|
||||||
https://lore.kernel.org/git/pull.801.v2.git.git.1591324899170.gitgitgadget@gmail.com/ (clone --no-checkout)
|
|
||||||
[5] (The need for update_sparsity() and avoiding `read-tree -mu HEAD`)
|
|
||||||
https://lore.kernel.org/git/3a1f084641eb47515b5a41ed4409a36128913309.1585270142.git.gitgitgadget@gmail.com/
|
|
||||||
[6] (SKIP_WORKTREE is advisory, not mandatory)
|
|
||||||
https://lore.kernel.org/git/844306c3e86ef67591cc086decb2b760e7d710a3.1585270142.git.gitgitgadget@gmail.com/
|
|
||||||
[7] (`worktree add` should copy sparsity settings from current worktree)
|
|
||||||
https://lore.kernel.org/git/c51cb3714e7b1d2f8c9370fe87eca9984ff4859f.1644269584.git.gitgitgadget@gmail.com/
|
|
||||||
[8] (Avoid negative surprises in add, rm, and mv)
|
|
||||||
https://lore.kernel.org/git/cover.1617914011.git.matheus.bernardino@usp.br/
|
|
||||||
https://lore.kernel.org/git/pull.1018.v4.git.1632497954.gitgitgadget@gmail.com/
|
|
||||||
[9] (Move from out-of-cone to in-cone)
|
|
||||||
https://lore.kernel.org/git/20220630023737.473690-6-shaoxuan.yuan02@gmail.com/
|
|
||||||
https://lore.kernel.org/git/20220630023737.473690-4-shaoxuan.yuan02@gmail.com/
|
|
||||||
[10] (Unnecessarily downloading objects outside sparse specification)
|
|
||||||
https://lore.kernel.org/git/CAOLTT8QfwOi9yx_qZZgyGa8iL8kHWutEED7ok_jxwTcYT_hf9Q@mail.gmail.com/
|
|
||||||
|
|
||||||
[11] (Stolee's comments on high-level usecases)
|
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
|
||||||
https://lore.kernel.org/git/1a1e33f6-3514-9afc-0a28-5a6b85bd8014@gmail.com/
|
|
||||||
|
[2] (Fix stash applications in sparse checkouts; bugs from behavioral differences):
|
||||||
|
|
||||||
|
https://lore.kernel.org/git/ccfedc7140dbf63ba26a15f93bd3885180b26517.1606861519.git.gitgitgadget@gmail.com/
|
||||||
|
|
||||||
|
[3] (Present-despite-skipped entries):
|
||||||
|
|
||||||
|
https://lore.kernel.org/git/11d46a399d26c913787b704d2b7169cafc28d639.1642175983.git.gitgitgadget@gmail.com/
|
||||||
|
|
||||||
|
[4] (Clone --no-checkout interaction):
|
||||||
|
|
||||||
|
https://lore.kernel.org/git/pull.801.v2.git.git.1591324899170.gitgitgadget@gmail.com/ (clone --no-checkout)
|
||||||
|
|
||||||
|
[5] (The need for update_sparsity() and avoiding `read-tree -mu HEAD`):
|
||||||
|
|
||||||
|
https://lore.kernel.org/git/3a1f084641eb47515b5a41ed4409a36128913309.1585270142.git.gitgitgadget@gmail.com/
|
||||||
|
|
||||||
|
[6] (SKIP_WORKTREE is advisory, not mandatory):
|
||||||
|
|
||||||
|
https://lore.kernel.org/git/844306c3e86ef67591cc086decb2b760e7d710a3.1585270142.git.gitgitgadget@gmail.com/
|
||||||
|
|
||||||
|
[7] (`worktree add` should copy sparsity settings from current worktree):
|
||||||
|
|
||||||
|
https://lore.kernel.org/git/c51cb3714e7b1d2f8c9370fe87eca9984ff4859f.1644269584.git.gitgitgadget@gmail.com/
|
||||||
|
|
||||||
|
[8] (Avoid negative surprises in add, rm, and mv):
|
||||||
|
|
||||||
|
* https://lore.kernel.org/git/cover.1617914011.git.matheus.bernardino@usp.br/
|
||||||
|
* https://lore.kernel.org/git/pull.1018.v4.git.1632497954.gitgitgadget@gmail.com/
|
||||||
|
|
||||||
|
[9] (Move from out-of-cone to in-cone):
|
||||||
|
|
||||||
|
* https://lore.kernel.org/git/20220630023737.473690-6-shaoxuan.yuan02@gmail.com/
|
||||||
|
* https://lore.kernel.org/git/20220630023737.473690-4-shaoxuan.yuan02@gmail.com/
|
||||||
|
|
||||||
|
[10] (Unnecessarily downloading objects outside sparse specification):
|
||||||
|
|
||||||
|
https://lore.kernel.org/git/CAOLTT8QfwOi9yx_qZZgyGa8iL8kHWutEED7ok_jxwTcYT_hf9Q@mail.gmail.com/
|
||||||
|
|
||||||
|
[11] (Stolee's comments on high-level usecases):
|
||||||
|
|
||||||
|
https://lore.kernel.org/git/1a1e33f6-3514-9afc-0a28-5a6b85bd8014@gmail.com/
|
||||||
|
|
||||||
[12] Others commenting on eventually switching default to behavior A:
|
[12] Others commenting on eventually switching default to behavior A:
|
||||||
|
|
||||||
* https://lore.kernel.org/git/xmqqh719pcoo.fsf@gitster.g/
|
* https://lore.kernel.org/git/xmqqh719pcoo.fsf@gitster.g/
|
||||||
* https://lore.kernel.org/git/xmqqzgeqw0sy.fsf@gitster.g/
|
* https://lore.kernel.org/git/xmqqzgeqw0sy.fsf@gitster.g/
|
||||||
* https://lore.kernel.org/git/a86af661-cf58-a4e5-0214-a67d3a794d7e@github.com/
|
* https://lore.kernel.org/git/a86af661-cf58-a4e5-0214-a67d3a794d7e@github.com/
|
||||||
|
|
||||||
[13] Previous config name suggestion and description
|
[13] Previous config name suggestion and description:
|
||||||
* https://lore.kernel.org/git/CABPp-BE6zW0nJSStcVU=_DoDBnPgLqOR8pkTXK3dW11=T01OhA@mail.gmail.com/
|
|
||||||
|
https://lore.kernel.org/git/CABPp-BE6zW0nJSStcVU=_DoDBnPgLqOR8pkTXK3dW11=T01OhA@mail.gmail.com/
|
||||||
|
|
||||||
[14] Tangential issue: switch to cone mode as default sparse specification mechanism:
|
[14] Tangential issue: switch to cone mode as default sparse specification mechanism:
|
||||||
https://lore.kernel.org/git/a1b68fd6126eb341ef3637bb93fedad4309b36d0.1650594746.git.gitgitgadget@gmail.com/
|
|
||||||
|
https://lore.kernel.org/git/a1b68fd6126eb341ef3637bb93fedad4309b36d0.1650594746.git.gitgitgadget@gmail.com/
|
||||||
|
|
||||||
[15] Lengthy email on grep behavior, covering what should be searched:
|
[15] Lengthy email on grep behavior, covering what should be searched:
|
||||||
* https://lore.kernel.org/git/CABPp-BGVO3QdbfE84uF_3QDF0-y2iHHh6G5FAFzNRfeRitkuHw@mail.gmail.com/
|
|
||||||
|
https://lore.kernel.org/git/CABPp-BGVO3QdbfE84uF_3QDF0-y2iHHh6G5FAFzNRfeRitkuHw@mail.gmail.com/
|
||||||
|
|
||||||
[16] Email explaining sparsity patterns vs. SKIP_WORKTREE and history operations,
|
[16] Email explaining sparsity patterns vs. SKIP_WORKTREE and history operations,
|
||||||
search for the parenthetical comment starting "We do not check".
|
search for the parenthetical comment starting "We do not check".
|
||||||
https://lore.kernel.org/git/CABPp-BFsCPPNOZ92JQRJeGyNd0e-TCW-LcLyr0i_+VSQJP+GCg@mail.gmail.com/
|
|
||||||
|
https://lore.kernel.org/git/CABPp-BFsCPPNOZ92JQRJeGyNd0e-TCW-LcLyr0i_+VSQJP+GCg@mail.gmail.com/
|
||||||
|
|
||||||
[17] https://lore.kernel.org/git/20220207190320.2960362-1-jonathantanmy@google.com/
|
[17] https://lore.kernel.org/git/20220207190320.2960362-1-jonathantanmy@google.com/
|
||||||
|
|
|
||||||