Merge branch 'rj/doc-technical-fixes'
Documentation mark-up fixes.

* rj/doc-technical-fixes:
  doc: add large-object-promisors.adoc to the docs build
  doc: commit-graph.adoc: fix up some formatting
  doc: sparse-checkout.adoc: fix asciidoc warnings
  doc: remembering-renames.adoc: fix asciidoc warnings

main
commit
411903ce4c
|
|
@@ -123,6 +123,7 @@ TECH_DOCS += technical/bundle-uri
|
||||||
TECH_DOCS += technical/commit-graph
|
TECH_DOCS += technical/commit-graph
|
||||||
TECH_DOCS += technical/directory-rename-detection
|
TECH_DOCS += technical/directory-rename-detection
|
||||||
TECH_DOCS += technical/hash-function-transition
|
TECH_DOCS += technical/hash-function-transition
|
||||||
|
TECH_DOCS += technical/large-object-promisors
|
||||||
TECH_DOCS += technical/long-running-process-protocol
|
TECH_DOCS += technical/long-running-process-protocol
|
||||||
TECH_DOCS += technical/multi-pack-index
|
TECH_DOCS += technical/multi-pack-index
|
||||||
TECH_DOCS += technical/packfile-uri
|
TECH_DOCS += technical/packfile-uri
|
||||||
|
|
|
||||||
|
|
@@ -39,6 +39,7 @@ A consumer may load the following info for a commit from the graph:
|
||||||
Values 1-4 satisfy the requirements of parse_commit_gently().
|
Values 1-4 satisfy the requirements of parse_commit_gently().
|
||||||
|
|
||||||
There are two definitions of generation number:
|
There are two definitions of generation number:
|
||||||
|
|
||||||
1. Corrected committer dates (generation number v2)
|
1. Corrected committer dates (generation number v2)
|
||||||
2. Topological levels (generation number v1)
|
2. Topological levels (generation number v1)
|
||||||
|
|
||||||
|
|
@@ -158,7 +159,8 @@ number of commits in the full history. By creating a "chain" of commit-graphs,
|
||||||
we enable fast writes of new commit data without rewriting the entire commit
|
we enable fast writes of new commit data without rewriting the entire commit
|
||||||
history -- at least, most of the time.
|
history -- at least, most of the time.
|
||||||
|
|
||||||
## File Layout
|
File Layout
|
||||||
|
~~~~~~~~~~~
|
||||||
|
|
||||||
A commit-graph chain uses multiple files, and we use a fixed naming convention
|
A commit-graph chain uses multiple files, and we use a fixed naming convention
|
||||||
to organize these files. Each commit-graph file has a name
|
to organize these files. Each commit-graph file has a name
|
||||||
|
|
@@ -170,11 +172,11 @@ hashes for the files in order from "lowest" to "highest".
|
||||||
|
|
||||||
For example, if the `commit-graph-chain` file contains the lines
|
For example, if the `commit-graph-chain` file contains the lines
|
||||||
|
|
||||||
```
|
----
|
||||||
{hash0}
|
{hash0}
|
||||||
{hash1}
|
{hash1}
|
||||||
{hash2}
|
{hash2}
|
||||||
```
|
----
|
||||||
|
|
||||||
then the commit-graph chain looks like the following diagram:
|
then the commit-graph chain looks like the following diagram:
|
||||||
|
|
||||||
|
|
@@ -213,7 +215,8 @@ specifying the hashes of all files in the lower layers. In the above example,
|
||||||
`graph-{hash1}.graph` contains `{hash0}` while `graph-{hash2}.graph` contains
|
`graph-{hash1}.graph` contains `{hash0}` while `graph-{hash2}.graph` contains
|
||||||
`{hash0}` and `{hash1}`.
|
`{hash0}` and `{hash1}`.
|
||||||
|
|
||||||
## Merging commit-graph files
|
Merging commit-graph files
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
If we only added a new commit-graph file on every write, we would run into a
|
If we only added a new commit-graph file on every write, we would run into a
|
||||||
linear search problem through many commit-graph files. Instead, we use a merge
|
linear search problem through many commit-graph files. Instead, we use a merge
|
||||||
|
|
@@ -225,6 +228,7 @@ is determined by the merge strategy that the files should collapse to
|
||||||
the commits in `graph-{hash1}` should be combined into a new `graph-{hash3}`
|
the commits in `graph-{hash1}` should be combined into a new `graph-{hash3}`
|
||||||
file.
|
file.
|
||||||
|
|
||||||
|
....
|
||||||
+---------------------+
|
+---------------------+
|
||||||
| |
|
| |
|
||||||
| (new commits) |
|
| (new commits) |
|
||||||
|
|
@@ -250,6 +254,7 @@ file.
|
||||||
| |
|
| |
|
||||||
| |
|
| |
|
||||||
+-----------------------+
|
+-----------------------+
|
||||||
|
....
|
||||||
|
|
||||||
During this process, the commits to write are combined, sorted and we write the
|
During this process, the commits to write are combined, sorted and we write the
|
||||||
contents to a temporary file, all while holding a `commit-graph-chain.lock`
|
contents to a temporary file, all while holding a `commit-graph-chain.lock`
|
||||||
|
|
@@ -257,14 +262,15 @@ lock-file. When the file is flushed, we rename it to `graph-{hash3}`
|
||||||
according to the computed `{hash3}`. Finally, we write the new chain data to
|
according to the computed `{hash3}`. Finally, we write the new chain data to
|
||||||
`commit-graph-chain.lock`:
|
`commit-graph-chain.lock`:
|
||||||
|
|
||||||
```
|
----
|
||||||
{hash3}
|
{hash3}
|
||||||
{hash0}
|
{hash0}
|
||||||
```
|
----
|
||||||
|
|
||||||
We then close the lock-file.
|
We then close the lock-file.
|
||||||
|
|
||||||
## Merge Strategy
|
Merge Strategy
|
||||||
|
~~~~~~~~~~~~~~
|
||||||
|
|
||||||
When writing a set of commits that do not exist in the commit-graph stack of
|
When writing a set of commits that do not exist in the commit-graph stack of
|
||||||
height N, we default to creating a new file at level N + 1. We then decide to
|
height N, we default to creating a new file at level N + 1. We then decide to
|
||||||
|
|
@@ -289,7 +295,8 @@ The merge strategy values (2 for the size multiple, 64,000 for the maximum
|
||||||
number of commits) could be extracted into config settings for full
|
number of commits) could be extracted into config settings for full
|
||||||
flexibility.
|
flexibility.
|
||||||
|
|
||||||
## Handling Mixed Generation Number Chains
|
Handling Mixed Generation Number Chains
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
With the introduction of generation number v2 and generation data chunk, the
|
With the introduction of generation number v2 and generation data chunk, the
|
||||||
following scenario is possible:
|
following scenario is possible:
|
||||||
|
|
@@ -318,7 +325,8 @@ have corrected commit dates when written by compatible versions of Git. Thus,
|
||||||
rewriting split commit-graph as a single file (`--split=replace`) creates a
|
rewriting split commit-graph as a single file (`--split=replace`) creates a
|
||||||
single layer with corrected commit dates.
|
single layer with corrected commit dates.
|
||||||
|
|
||||||
## Deleting graph-{hash} files
|
Deleting graph-\{hash\} files
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
After a new tip file is written, some `graph-{hash}` files may no longer
|
After a new tip file is written, some `graph-{hash}` files may no longer
|
||||||
be part of a chain. It is important to remove these files from disk, eventually.
|
be part of a chain. It is important to remove these files from disk, eventually.
|
||||||
|
|
@@ -333,7 +341,8 @@ files whose modified times are older than a given expiry window. This window
|
||||||
defaults to zero, but can be changed using command-line arguments or a config
|
defaults to zero, but can be changed using command-line arguments or a config
|
||||||
setting.
|
setting.
|
||||||
|
|
||||||
## Chains across multiple object directories
|
Chains across multiple object directories
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
In a repo with alternates, we look for the `commit-graph-chain` file starting
|
In a repo with alternates, we look for the `commit-graph-chain` file starting
|
||||||
in the local object directory and then in each alternate. The first file that
|
in the local object directory and then in each alternate. The first file that
|
||||||
|
|
|
||||||
|
|
@@ -34,8 +34,8 @@ a new object representation for large blobs as discussed in:
|
||||||
|
|
||||||
https://lore.kernel.org/git/xmqqbkdometi.fsf@gitster.g/
|
https://lore.kernel.org/git/xmqqbkdometi.fsf@gitster.g/
|
||||||
|
|
||||||
0) Non goals
|
Non goals
|
||||||
------------
|
---------
|
||||||
|
|
||||||
- We will not discuss those client side improvements here, as they
|
- We will not discuss those client side improvements here, as they
|
||||||
would require changes in different parts of Git than this effort.
|
would require changes in different parts of Git than this effort.
|
||||||
|
|
@@ -90,8 +90,8 @@ later in this document:
|
||||||
even more to host content with larger blobs or more large blobs
|
even more to host content with larger blobs or more large blobs
|
||||||
than currently.
|
than currently.
|
||||||
|
|
||||||
I) Issues with the current situation
|
I Issues with the current situation
|
||||||
------------------------------------
|
-----------------------------------
|
||||||
|
|
||||||
- Some statistics made on GitLab repos have shown that more than 75%
|
- Some statistics made on GitLab repos have shown that more than 75%
|
||||||
of the disk space is used by blobs that are larger than 1MB and
|
of the disk space is used by blobs that are larger than 1MB and
|
||||||
|
|
@@ -138,8 +138,8 @@ I) Issues with the current situation
|
||||||
complaining that these tools require significant effort to set up,
|
complaining that these tools require significant effort to set up,
|
||||||
learn and use correctly.
|
learn and use correctly.
|
||||||
|
|
||||||
II) Main features of the "Large Object Promisors" solution
|
II Main features of the "Large Object Promisors" solution
|
||||||
----------------------------------------------------------
|
---------------------------------------------------------
|
||||||
|
|
||||||
The main features below should give a rough overview of how the
|
The main features below should give a rough overview of how the
|
||||||
solution may work. Details about needed elements can be found in
|
solution may work. Details about needed elements can be found in
|
||||||
|
|
@@ -166,7 +166,7 @@ format. They should be used along with main remotes that contain the
|
||||||
other objects.
|
other objects.
|
||||||
|
|
||||||
Note 1
|
Note 1
|
||||||
++++++
|
^^^^^^
|
||||||
|
|
||||||
To clarify, a LOP is a normal promisor remote, except that:
|
To clarify, a LOP is a normal promisor remote, except that:
|
||||||
|
|
||||||
|
|
@@ -178,7 +178,7 @@ To clarify, a LOP is a normal promisor remote, except that:
|
||||||
itself.
|
itself.
|
||||||
|
|
||||||
Note 2
|
Note 2
|
||||||
++++++
|
^^^^^^
|
||||||
|
|
||||||
Git already makes it possible for a main remote to also be a promisor
|
Git already makes it possible for a main remote to also be a promisor
|
||||||
remote storing both regular objects and large blobs for a client that
|
remote storing both regular objects and large blobs for a client that
|
||||||
|
|
@@ -186,13 +186,13 @@ clones from it with a filter on blob size. But here we explicitly want
|
||||||
to avoid that.
|
to avoid that.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
LOPs aim to be good at handling large blobs while main remotes are
|
LOPs aim to be good at handling large blobs while main remotes are
|
||||||
already good at handling other objects.
|
already good at handling other objects.
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
++++++++++++++
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
Git already has support for multiple promisor remotes, see
|
Git already has support for multiple promisor remotes, see
|
||||||
link:partial-clone.html#using-many-promisor-remotes[the partial clone documentation].
|
link:partial-clone.html#using-many-promisor-remotes[the partial clone documentation].
|
||||||
|
|
@@ -213,19 +213,19 @@ remote helper (see linkgit:gitremote-helpers[7]) which makes the
|
||||||
underlying object storage appear like a remote to Git.
|
underlying object storage appear like a remote to Git.
|
||||||
|
|
||||||
Note
|
Note
|
||||||
++++
|
^^^^
|
||||||
|
|
||||||
A LOP can be a promisor remote accessed using a remote helper by
|
A LOP can be a promisor remote accessed using a remote helper by
|
||||||
both some clients and the main remote.
|
both some clients and the main remote.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
This looks like the simplest way to create LOPs that can cheaply
|
This looks like the simplest way to create LOPs that can cheaply
|
||||||
handle many large blobs.
|
handle many large blobs.
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
++++++++++++++
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
Remote helpers are quite easy to write as shell scripts, but it might
|
Remote helpers are quite easy to write as shell scripts, but it might
|
||||||
be more efficient and maintainable to write them using other languages
|
be more efficient and maintainable to write them using other languages
|
||||||
|
|
@@ -247,7 +247,7 @@ The underlying object storage that a LOP uses could also serve as
|
||||||
storage for large files handled by Git LFS.
|
storage for large files handled by Git LFS.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
This would simplify the server side if it wants to both use a LOP and
|
This would simplify the server side if it wants to both use a LOP and
|
||||||
act as a Git LFS server.
|
act as a Git LFS server.
|
||||||
|
|
@@ -259,7 +259,7 @@ On the server side, a main remote should have a way to offload to a
|
||||||
LOP all its blobs with a size over a configurable threshold.
|
LOP all its blobs with a size over a configurable threshold.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
This makes it easy to set things up and to clean things up. For
|
This makes it easy to set things up and to clean things up. For
|
||||||
example, an admin could use this to manually convert a repo not using
|
example, an admin could use this to manually convert a repo not using
|
||||||
|
|
@@ -268,7 +268,7 @@ some users would sometimes push large blobs, a cron job could use this
|
||||||
to regularly make sure the large blobs are moved to the LOP.
|
to regularly make sure the large blobs are moved to the LOP.
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
++++++++++++++
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
Using something based on `git repack --filter=...` to separate the
|
Using something based on `git repack --filter=...` to separate the
|
||||||
blobs we want to offload from the other Git objects could be a good
|
blobs we want to offload from the other Git objects could be a good
|
||||||
|
|
@@ -284,13 +284,13 @@ should have ways to prevent oversize blobs to be fetched, and also
|
||||||
perhaps pushed, into it.
|
perhaps pushed, into it.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
A main remote containing many oversize blobs would defeat the purpose
|
A main remote containing many oversize blobs would defeat the purpose
|
||||||
of LOPs.
|
of LOPs.
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
++++++++++++++
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
The way to offload to a LOP discussed in 4) above can be used to
|
The way to offload to a LOP discussed in 4) above can be used to
|
||||||
regularly offload oversize blobs. About preventing oversize blobs from
|
regularly offload oversize blobs. About preventing oversize blobs from
|
||||||
|
|
@@ -326,18 +326,18 @@ large blobs directly from the LOP and the server would not need to
|
||||||
fetch those blobs from the LOP to be able to serve the client.
|
fetch those blobs from the LOP to be able to serve the client.
|
||||||
|
|
||||||
Note
|
Note
|
||||||
++++
|
^^^^
|
||||||
|
|
||||||
For fetches instead of clones, a protocol negotiation might not always
|
For fetches instead of clones, a protocol negotiation might not always
|
||||||
happen, see the "What about fetches?" FAQ entry below for details.
|
happen, see the "What about fetches?" FAQ entry below for details.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
Security, configurability and efficiency of setting things up.
|
Security, configurability and efficiency of setting things up.
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
++++++++++++++
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
A "promisor-remote" protocol v2 capability looks like a good way to
|
A "promisor-remote" protocol v2 capability looks like a good way to
|
||||||
implement this. The way the client and server use this capability
|
implement this. The way the client and server use this capability
|
||||||
|
|
@@ -356,7 +356,7 @@ the client should be able to offload some large blobs it has fetched,
|
||||||
but might not need anymore, to the LOP.
|
but might not need anymore, to the LOP.
|
||||||
|
|
||||||
Note
|
Note
|
||||||
++++
|
^^^^
|
||||||
|
|
||||||
It might depend on the context if it should be OK or not for clients
|
It might depend on the context if it should be OK or not for clients
|
||||||
to offload large blobs they have created, instead of fetched, directly
|
to offload large blobs they have created, instead of fetched, directly
|
||||||
|
|
@@ -367,13 +367,13 @@ This should be discussed and refined when we get closer to
|
||||||
implementing this feature.
|
implementing this feature.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
+++++++++
|
^^^^^^^^^
|
||||||
|
|
||||||
On the client, the easiest way to deal with unneeded large blobs is to
|
On the client, the easiest way to deal with unneeded large blobs is to
|
||||||
offload them.
|
offload them.
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
++++++++++++++
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
This is very similar to what 4) above is about, except on the client
|
This is very similar to what 4) above is about, except on the client
|
||||||
side instead of the server side. So a good solution to 4) could likely
|
side instead of the server side. So a good solution to 4) could likely
|
||||||
|
|
@@ -385,8 +385,8 @@ when cloning (see 6) above). Also if the large blobs were fetched from
|
||||||
a LOP, it is likely, and can easily be confirmed, that the LOP still
|
a LOP, it is likely, and can easily be confirmed, that the LOP still
|
||||||
has them, so that they can just be removed from the client.
|
has them, so that they can just be removed from the client.
|
||||||
|
|
||||||
III) Benefits of using LOPs
|
III Benefits of using LOPs
|
||||||
---------------------------
|
--------------------------
|
||||||
|
|
||||||
Many benefits are related to the issues discussed in "I) Issues with
|
Many benefits are related to the issues discussed in "I) Issues with
|
||||||
the current situation" above:
|
the current situation" above:
|
||||||
|
|
@@ -406,8 +406,8 @@ the current situation" above:
|
||||||
|
|
||||||
- Reduced storage needs on the client side.
|
- Reduced storage needs on the client side.
|
||||||
|
|
||||||
IV) FAQ
|
IV FAQ
|
||||||
-------
|
------
|
||||||
|
|
||||||
What about using multiple LOPs on the server and client side?
|
What about using multiple LOPs on the server and client side?
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
@@ -533,7 +533,7 @@ some objects it already knows about but doesn't have because they are
|
||||||
on a promisor remote.
|
on a promisor remote.
|
||||||
|
|
||||||
Regular fetch
|
Regular fetch
|
||||||
+++++++++++++
|
^^^^^^^^^^^^^
|
||||||
|
|
||||||
In a regular fetch, the client will contact the main remote and a
|
In a regular fetch, the client will contact the main remote and a
|
||||||
protocol negotiation will happen between them. It's a good thing that
|
protocol negotiation will happen between them. It's a good thing that
|
||||||
|
|
@@ -551,7 +551,7 @@ new fetch will happen in the same way as the previous clone or fetch,
|
||||||
using, or not using, the same LOP(s) as last time.
|
using, or not using, the same LOP(s) as last time.
|
||||||
|
|
||||||
"Backfill" or "lazy" fetch
|
"Backfill" or "lazy" fetch
|
||||||
++++++++++++++++++++++++++
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
When there is a backfill fetch, the client doesn't necessarily contact
|
When there is a backfill fetch, the client doesn't necessarily contact
|
||||||
the main remote first. It will try to fetch from its promisor remotes
|
the main remote first. It will try to fetch from its promisor remotes
|
||||||
|
|
@@ -576,8 +576,8 @@ from the client when it fetches from them. The client could get the
|
||||||
token when performing a protocol negotiation with the main remote (see
|
token when performing a protocol negotiation with the main remote (see
|
||||||
section II.6 above).
|
section II.6 above).
|
||||||
|
|
||||||
V) Future improvements
|
V Future improvements
|
||||||
----------------------
|
---------------------
|
||||||
|
|
||||||
It is expected that at the beginning using LOPs will be mostly worth
|
It is expected that at the beginning using LOPs will be mostly worth
|
||||||
it either in a corporate context where the Git version that clients
|
it either in a corporate context where the Git version that clients
|
||||||
|
|
|
||||||
|
|
@@ -13,6 +13,7 @@ articles = [
|
||||||
'commit-graph.adoc',
|
'commit-graph.adoc',
|
||||||
'directory-rename-detection.adoc',
|
'directory-rename-detection.adoc',
|
||||||
'hash-function-transition.adoc',
|
'hash-function-transition.adoc',
|
||||||
|
'large-object-promisors.adoc',
|
||||||
'long-running-process-protocol.adoc',
|
'long-running-process-protocol.adoc',
|
||||||
'multi-pack-index.adoc',
|
'multi-pack-index.adoc',
|
||||||
'packfile-uri.adoc',
|
'packfile-uri.adoc',
|
||||||
|
|
|
||||||
|
|
@@ -10,32 +10,32 @@ history as an optimization, assuming all merges are automatic and clean
|
||||||
|
|
||||||
Outline:
|
Outline:
|
||||||
|
|
||||||
0. Assumptions
|
1. Assumptions
|
||||||
|
|
||||||
1. How rebasing and cherry-picking work
|
2. How rebasing and cherry-picking work
|
||||||
|
|
||||||
2. Why the renames on MERGE_SIDE1 in any given pick are *always* a
|
3. Why the renames on MERGE_SIDE1 in any given pick are *always* a
|
||||||
superset of the renames on MERGE_SIDE1 for the next pick.
|
superset of the renames on MERGE_SIDE1 for the next pick.
|
||||||
|
|
||||||
3. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also
|
4. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also
|
||||||
a rename on MERGE_SIDE1 for the next pick
|
a rename on MERGE_SIDE1 for the next pick
|
||||||
|
|
||||||
4. A detailed description of the counter-examples to #3.
|
5. A detailed description of the counter-examples to #4.
|
||||||
|
|
||||||
5. Why the special cases in #4 are still fully reasonable to use to pair
|
6. Why the special cases in #5 are still fully reasonable to use to pair
|
||||||
up files for three-way content merging in the merge machinery, and why
|
up files for three-way content merging in the merge machinery, and why
|
||||||
they do not affect the correctness of the merge.
|
they do not affect the correctness of the merge.
|
||||||
|
|
||||||
6. Interaction with skipping of "irrelevant" renames
|
7. Interaction with skipping of "irrelevant" renames
|
||||||
|
|
||||||
7. Additional items that need to be cached
|
8. Additional items that need to be cached
|
||||||
|
|
||||||
8. How directory rename detection interacts with the above and why this
|
9. How directory rename detection interacts with the above and why this
|
||||||
optimization is still safe even if merge.directoryRenames is set to
|
optimization is still safe even if merge.directoryRenames is set to
|
||||||
"true".
|
"true".
|
||||||
|
|
||||||
|
|
||||||
=== 0. Assumptions ===
|
== 1. Assumptions ==
|
||||||
|
|
||||||
There are two assumptions that will hold throughout this document:
|
There are two assumptions that will hold throughout this document:
|
||||||
|
|
||||||
|
|
@@ -44,8 +44,8 @@ There are two assumptions that will hold throughout this document:
|
||||||
|
|
||||||
* All merges are fully automatic
|
* All merges are fully automatic
|
||||||
|
|
||||||
and a third that will hold in sections 2-5 for simplicity, that I'll later
|
and a third that will hold in sections 3-6 for simplicity, that I'll later
|
||||||
address in section 8:
|
address in section 9:
|
||||||
|
|
||||||
* No directory renames occur
|
* No directory renames occur
|
||||||
|
|
||||||
|
|
@@ -77,9 +77,9 @@ conflicts that the user needs to resolve), the cache of renames is not
|
||||||
stored on disk, and thus is thrown away as soon as the rebase or cherry
|
stored on disk, and thus is thrown away as soon as the rebase or cherry
|
||||||
pick stops for the user to resolve the operation.
|
pick stops for the user to resolve the operation.
|
||||||
|
|
||||||
The third assumption makes sections 2-5 simpler, and allows people to
|
The third assumption makes sections 3-6 simpler, and allows people to
|
||||||
understand the basics of why this optimization is safe and effective, and
|
understand the basics of why this optimization is safe and effective, and
|
||||||
then I can go back and address the specifics in section 8. It is probably
|
then I can go back and address the specifics in section 9. It is probably
|
||||||
also worth noting that if directory renames do occur, then the default of
|
also worth noting that if directory renames do occur, then the default of
|
||||||
merge.directoryRenames being set to "conflict" means that the operation
|
merge.directoryRenames being set to "conflict" means that the operation
|
||||||
will stop for users to resolve the conflicts and the cache will be thrown
|
will stop for users to resolve the conflicts and the cache will be thrown
|
||||||
|
|
@@ -88,22 +88,26 @@ reason we need to address directory renames specifically, is that some
|
||||||
users will have set merge.directoryRenames to "true" to allow the merges to
|
users will have set merge.directoryRenames to "true" to allow the merges to
|
||||||
continue to proceed automatically. The optimization is still safe with
|
continue to proceed automatically. The optimization is still safe with
|
||||||
this config setting, but we have to discuss a few more cases to show why;
|
this config setting, but we have to discuss a few more cases to show why;
|
||||||
this discussion is deferred until section 8.
|
this discussion is deferred until section 9.
|
||||||
|
|
||||||
|
|
||||||
=== 1. How rebasing and cherry-picking work ===
|
== 2. How rebasing and cherry-picking work ==
|
||||||
|
|
||||||
Consider the following setup (from the git-rebase manpage):
|
Consider the following setup (from the git-rebase manpage):
|
||||||
|
|
||||||
|
------------
|
||||||
A---B---C topic
|
A---B---C topic
|
||||||
/
|
/
|
||||||
D---E---F---G main
|
D---E---F---G main
|
||||||
|
------------
|
||||||
|
|
||||||
After rebasing or cherry-picking topic onto main, this will appear as:
|
After rebasing or cherry-picking topic onto main, this will appear as:
|
||||||
|
|
||||||
|
------------
|
||||||
A'--B'--C' topic
|
A'--B'--C' topic
|
||||||
/
|
/
|
||||||
D---E---F---G main
|
D---E---F---G main
|
||||||
|
------------
|
||||||
|
|
||||||
The way the commits A', B', and C' are created is through a series of
|
The way the commits A', B', and C' are created is through a series of
|
||||||
merges, where rebase or cherry-pick sequentially uses each of the three
|
merges, where rebase or cherry-pick sequentially uses each of the three
|
||||||
|
|
@@ -111,6 +115,7 @@ A-B-C commits in a special merge operation. Let's label the three commits
|
||||||
in the merge operation as MERGE_BASE, MERGE_SIDE1, and MERGE_SIDE2. For
|
in the merge operation as MERGE_BASE, MERGE_SIDE1, and MERGE_SIDE2. For
|
||||||
this picture, the three commits for each of the three merges would be:
|
this picture, the three commits for each of the three merges would be:
|
||||||
|
|
||||||
|
....
|
||||||
To create A':
|
To create A':
|
||||||
MERGE_BASE: E
|
MERGE_BASE: E
|
||||||
MERGE_SIDE1: G
|
MERGE_SIDE1: G
|
||||||
|
|
@@ -125,6 +130,7 @@ To create C':
|
||||||
MERGE_BASE: B
|
MERGE_BASE: B
|
||||||
MERGE_SIDE1: B'
|
MERGE_SIDE1: B'
|
||||||
MERGE_SIDE2: C
|
MERGE_SIDE2: C
|
||||||
|
....
|
||||||
|
|
||||||
Sometimes, folks are surprised that these three-way merges are done. It
|
Sometimes, folks are surprised that these three-way merges are done. It
|
||||||
can be useful in understanding these three-way merges to view them in a
|
can be useful in understanding these three-way merges to view them in a
|
||||||
|
|
@@ -138,8 +144,7 @@ Conceptually the two statements above are the same as a three-way merge of
|
||||||
B, B', and C, at least the parts before you decide to record a commit.
|
B, B', and C, at least the parts before you decide to record a commit.
|
||||||
|
|
||||||
|
|
||||||
=== 2. Why the renames on MERGE_SIDE1 in any given pick are always a ===
|
== 3. Why the renames on MERGE_SIDE1 in any given pick are always a superset of the renames on MERGE_SIDE1 for the next pick. ==
|
||||||
=== superset of the renames on MERGE_SIDE1 for the next pick. ===
|
|
||||||
|
|
||||||
The merge machinery uses the filenames it is fed from MERGE_BASE,
|
The merge machinery uses the filenames it is fed from MERGE_BASE,
|
||||||
MERGE_SIDE1, and MERGE_SIDE2. It will only move content to a different
|
MERGE_SIDE1, and MERGE_SIDE2. It will only move content to a different
|
||||||
|
|
@@ -156,6 +161,7 @@ filename under one of three conditions:
|
||||||
First, let's remember what commits are involved in the first and second
|
First, let's remember what commits are involved in the first and second
|
||||||
picks of the cherry-pick or rebase sequence:
|
picks of the cherry-pick or rebase sequence:
|
||||||
|
|
||||||
|
....
|
||||||
To create A':
|
To create A':
|
||||||
MERGE_BASE: E
|
MERGE_BASE: E
|
||||||
MERGE_SIDE1: G
|
MERGE_SIDE1: G
|
||||||
|
|
@@ -165,6 +171,7 @@ To create B':
|
||||||
MERGE_BASE: A
|
MERGE_BASE: A
|
||||||
MERGE_SIDE1: A'
|
MERGE_SIDE1: A'
|
||||||
MERGE_SIDE2: B
|
MERGE_SIDE2: B
|
||||||
|
....
|
||||||
|
|
||||||
So, in particular, we need to show that the renames between E and G are a
|
So, in particular, we need to show that the renames between E and G are a
|
||||||
superset of those between A and A'.
|
superset of those between A and A'.
|
||||||
|
|
@@ -181,11 +188,11 @@ are a subset of those between E and G. Equivalently, all renames between E
|
||||||
and G are a superset of those between A and A'.
|
and G are a superset of those between A and A'.
|
||||||
|
|
||||||
|
|
||||||
=== 3. Why any rename on MERGE_SIDE1 in any given pick is _almost_ ===
|
== 4. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also a rename on MERGE_SIDE1 for the next pick. ==
|
||||||
=== always also a rename on MERGE_SIDE1 for the next pick. ===
|
|
||||||
|
|
||||||
Let's again look at the first two picks:
|
Let's again look at the first two picks:
|
||||||
|
|
||||||
|
....
|
||||||
To create A':
|
To create A':
|
||||||
MERGE_BASE: E
|
MERGE_BASE: E
|
||||||
MERGE_SIDE1: G
|
MERGE_SIDE1: G
|
||||||
|
|
@@ -195,17 +202,25 @@ To create B':
|
||||||
MERGE_BASE: A
|
MERGE_BASE: A
|
||||||
MERGE_SIDE1: A'
|
MERGE_SIDE1: A'
|
||||||
MERGE_SIDE2: B
|
MERGE_SIDE2: B
|
||||||
|
....
|
||||||
|
|
||||||
Now let's look at any given rename from MERGE_SIDE1 of the first pick, i.e.
|
Now let's look at any given rename from MERGE_SIDE1 of the first pick, i.e.
|
||||||
any given rename from E to G. Let's use the filenames 'oldfile' and
|
any given rename from E to G. Let's use the filenames 'oldfile' and
|
||||||
'newfile' for demonstration purposes. That first pick will function as
|
'newfile' for demonstration purposes. That first pick will function as
|
||||||
follows; when the rename is detected, the merge machinery will do a
|
follows; when the rename is detected, the merge machinery will do a
|
||||||
three-way content merge of the following:
|
three-way content merge of the following:
|
||||||
|
|
||||||
|
....
|
||||||
E:oldfile
|
E:oldfile
|
||||||
G:newfile
|
G:newfile
|
||||||
A:oldfile
|
A:oldfile
|
||||||
|
....
|
||||||
|
|
||||||
and produce a new result:
|
and produce a new result:
|
||||||
|
|
||||||
|
....
|
||||||
A':newfile
|
A':newfile
|
||||||
|
....
|
||||||
|
|
||||||
Note above that I've assumed that E->A did not rename oldfile. If that
|
Note above that I've assumed that E->A did not rename oldfile. If that
|
||||||
side did rename, then we most likely have a rename/rename(1to2) conflict
|
side did rename, then we most likely have a rename/rename(1to2) conflict
|
||||||
|
|
@ -254,19 +269,21 @@ were detected as renames, A:oldfile and A':newfile should also be
|
||||||
detectable as renames almost always.
|
detectable as renames almost always.
|
||||||
|
|
||||||
|
|
||||||
=== 4. A detailed description of the counter-examples to #3. ===
|
== 5. A detailed description of the counter-examples to #4. ==
|
||||||
|
|
||||||
We already noted in section 3 that rename/rename(1to1) (i.e. both sides
|
We already noted in section 4 that rename/rename(1to1) (i.e. both sides
|
||||||
renaming a file the same way) was one counter-example. The more
|
renaming a file the same way) was one counter-example. The more
|
||||||
interesting bit, though, is why did we need to use the "almost" qualifier
|
interesting bit, though, is why did we need to use the "almost" qualifier
|
||||||
when stating that A:oldfile and A':newfile are "almost" always detectable
|
when stating that A:oldfile and A':newfile are "almost" always detectable
|
||||||
as renames?
|
as renames?
|
||||||
|
|
||||||
Let's repeat an earlier point that section 3 made:
|
Let's repeat an earlier point that section 4 made:
|
||||||
|
|
||||||
|
....
|
||||||
A':newfile was created by applying the changes between E:oldfile and
|
A':newfile was created by applying the changes between E:oldfile and
|
||||||
G:newfile to A:oldfile. The changes between E:oldfile and G:newfile were
|
G:newfile to A:oldfile. The changes between E:oldfile and G:newfile were
|
||||||
<50% of the size of E:oldfile.
|
<50% of the size of E:oldfile.
|
||||||
|
....
|
||||||
|
|
||||||
If those changes that were <50% of the size of E:oldfile are also <50% of
|
If those changes that were <50% of the size of E:oldfile are also <50% of
|
||||||
the size of A:oldfile, then A:oldfile and A':newfile will be detectable as
|
the size of A:oldfile, then A:oldfile and A':newfile will be detectable as
|
||||||
|
|
@ -276,18 +293,21 @@ still somehow merge cleanly), then traditional rename detection would not
|
||||||
detect A:oldfile and A':newfile as renames.
|
detect A:oldfile and A':newfile as renames.
|
||||||
|
|
||||||
Here's an example where that can happen:
|
Here's an example where that can happen:
|
||||||
|
|
||||||
* E:oldfile had 20 lines
|
* E:oldfile had 20 lines
|
||||||
* G:newfile added 10 new lines at the beginning of the file
|
* G:newfile added 10 new lines at the beginning of the file
|
||||||
* A:oldfile kept the first 3 lines of the file, and deleted all the rest
|
* A:oldfile kept the first 3 lines of the file, and deleted all the rest
|
||||||
|
|
||||||
then
|
then
|
||||||
|
|
||||||
|
....
|
||||||
=> A':newfile would have 13 lines, 3 of which matches those in A:oldfile.
|
=> A':newfile would have 13 lines, 3 of which matches those in A:oldfile.
|
||||||
E:oldfile -> G:newfile would be detected as a rename, but A:oldfile and
|
E:oldfile -> G:newfile would be detected as a rename, but A:oldfile and
|
||||||
A':newfile would not be.
|
A':newfile would not be.
|
||||||
|
....
|
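The counter-example above can be checked with a small sketch. It assumes a simplified, line-based similarity measure (shared lines divided by the size of the larger file) and a 50% threshold. The names `similarity`, `is_rename`, and `THRESHOLD` are hypothetical, and the real rename detection works on file content rather than whole lines, so this only illustrates the arithmetic:

```python
from collections import Counter

THRESHOLD = 0.5  # assumed 50% similarity cut-off, as in the text


def similarity(src, dst):
    """Shared lines as a fraction of the larger file (simplified measure)."""
    common = sum((Counter(src) & Counter(dst)).values())
    return common / max(len(src), len(dst))


def is_rename(src, dst):
    """True when the pair is similar enough to count as a rename."""
    return similarity(src, dst) >= THRESHOLD


added = [f"new {i}" for i in range(10)]        # the 10 lines G put at the top
e_old = [f"line {i}" for i in range(20)]       # E:oldfile, 20 lines
g_new = added + e_old                          # G:newfile, 30 lines
a_old = e_old[:3]                              # A:oldfile keeps the first 3 lines
a_new = added + a_old                          # A':newfile, 13 lines

print(is_rename(e_old, g_new))   # 20 of 30 lines shared
print(is_rename(a_old, a_new))   # 3 of 13 lines shared
```

Under these assumptions the first pair clears the threshold (20/30) and the second does not (3/13), which matches the description above.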
||||||
|
|
||||||
|
|
||||||
=== 5. Why the special cases in #4 are still fully reasonable to use to ===
|
== 6. Why the special cases in #5 are still fully reasonable to use to pair up files for three-way content merging in the merge machinery, and why they do not affect the correctness of the merge. ==
|
||||||
=== pair up files for three-way content merging in the merge machinery, ===
|
|
||||||
=== and why they do not affect the correctness of the merge. ===
|
|
||||||
|
|
||||||
In the rename/rename(1to1) case, A:newfile and A':newfile are not renames
|
In the rename/rename(1to1) case, A:newfile and A':newfile are not renames
|
||||||
since they use the *same* filename. However, files with the same filename
|
since they use the *same* filename. However, files with the same filename
|
||||||
|
|
@ -295,14 +315,14 @@ are obviously fine to pair up for three-way content merging (the merge
|
||||||
machinery has never employed break detection). The interesting
|
machinery has never employed break detection). The interesting
|
||||||
counter-example case is thus not the rename/rename(1to1) case, but the case
|
counter-example case is thus not the rename/rename(1to1) case, but the case
|
||||||
where A did not rename oldfile. That was the case that we spent most of
|
where A did not rename oldfile. That was the case that we spent most of
|
||||||
the time discussing in sections 3 and 4. The remainder of this section
|
the time discussing in sections 4 and 5. The remainder of this section
|
||||||
will be devoted to that case as well.
|
will be devoted to that case as well.
|
||||||
|
|
||||||
So, even if A:oldfile and A':newfile aren't detectable as renames, why is
|
So, even if A:oldfile and A':newfile aren't detectable as renames, why is
|
||||||
it still reasonable to pair them up for three-way content merging in the
|
it still reasonable to pair them up for three-way content merging in the
|
||||||
merge machinery? There are multiple reasons:
|
merge machinery? There are multiple reasons:
|
||||||
|
|
||||||
* As noted in sections 3 and 4, the diff between A:oldfile and A':newfile
|
* As noted in sections 4 and 5, the diff between A:oldfile and A':newfile
|
||||||
is *exactly* the same as the diff between E:oldfile and G:newfile. The
|
is *exactly* the same as the diff between E:oldfile and G:newfile. The
|
||||||
latter pair were detected as renames, so it seems unlikely to surprise
|
latter pair were detected as renames, so it seems unlikely to surprise
|
||||||
users for us to treat A:oldfile and A':newfile as renames.
|
users for us to treat A:oldfile and A':newfile as renames.
|
||||||
|
|
@ -394,7 +414,7 @@ cases 1 and 3 seem to provide as good or better behavior with the
|
||||||
optimization than without.
|
optimization than without.
|
||||||
|
|
||||||
|
|
||||||
=== 6. Interaction with skipping of "irrelevant" renames ===
|
== 7. Interaction with skipping of "irrelevant" renames ==
|
||||||
|
|
||||||
Previous optimizations involved skipping rename detection for paths
|
Previous optimizations involved skipping rename detection for paths
|
||||||
considered to be "irrelevant". See for example the following commits:
|
considered to be "irrelevant". See for example the following commits:
|
||||||
|
|
@ -421,24 +441,27 @@ detection -- though we can limit it to the paths for which we have not
|
||||||
already detected renames.
|
already detected renames.
|
||||||
|
|
||||||
|
|
||||||
=== 7. Additional items that need to be cached ===
|
== 8. Additional items that need to be cached ==
|
||||||
|
|
||||||
It turns out we have to cache more than just renames; we also cache:
|
It turns out we have to cache more than just renames; we also cache:
|
||||||
|
|
||||||
|
....
|
||||||
A) non-renames (i.e. unpaired deletes)
|
A) non-renames (i.e. unpaired deletes)
|
||||||
B) counts of renames within directories
|
B) counts of renames within directories
|
||||||
C) sources that were marked as RELEVANT_LOCATION, but which were
|
C) sources that were marked as RELEVANT_LOCATION, but which were
|
||||||
downgraded to RELEVANT_NO_MORE
|
downgraded to RELEVANT_NO_MORE
|
||||||
D) the toplevel trees involved in the merge
|
D) the toplevel trees involved in the merge
|
||||||
|
....
|
||||||
|
|
||||||
These are all stored in struct rename_info, and respectively appear in
|
These are all stored in struct rename_info, and respectively appear in
|
||||||
|
|
||||||
* cached_pairs (along side actual renames, just with a value of NULL)
|
* cached_pairs (along side actual renames, just with a value of NULL)
|
||||||
* dir_rename_counts
|
* dir_rename_counts
|
||||||
* cached_irrelevant
|
* cached_irrelevant
|
||||||
* merge_trees
|
* merge_trees
|
||||||
|
|
||||||
The reason for (A) comes from the irrelevant renames skipping
|
The reason for `(A)` comes from the irrelevant renames skipping
|
||||||
optimization discussed in section 6. The fact that irrelevant renames
|
optimization discussed in section 7. The fact that irrelevant renames
|
||||||
are skipped means we only get a subset of the potential renames
|
are skipped means we only get a subset of the potential renames
|
||||||
detected and subsequent commits may need to run rename detection on
|
detected and subsequent commits may need to run rename detection on
|
||||||
the upstream side on a subset of the remaining renames (to get the
|
the upstream side on a subset of the remaining renames (to get the
|
||||||
|
|
@ -447,23 +470,24 @@ deletes are involved in rename detection too, we don't want to
|
||||||
repeatedly check that those paths remain unpaired on the upstream side
|
repeatedly check that those paths remain unpaired on the upstream side
|
||||||
with every commit we are transplanting.
|
with every commit we are transplanting.
|
||||||
|
|
||||||
The reason for (B) is that diffcore_rename_extended() is what
|
The reason for `(B)` is that diffcore_rename_extended() is what
|
||||||
generates the counts of renames by directory which is needed in
|
generates the counts of renames by directory which is needed in
|
||||||
directory rename detection, and if we don't run
|
directory rename detection, and if we don't run
|
||||||
diffcore_rename_extended() again then we need to have the output from
|
diffcore_rename_extended() again then we need to have the output from
|
||||||
it, including dir_rename_counts, from the previous run.
|
it, including dir_rename_counts, from the previous run.
|
||||||
|
|
||||||
The reason for (C) is that merge-ort's tree traversal will again think
|
The reason for `(C)` is that merge-ort's tree traversal will again think
|
||||||
those paths are relevant (marking them as RELEVANT_LOCATION), but the
|
those paths are relevant (marking them as RELEVANT_LOCATION), but the
|
||||||
fact that they were downgraded to RELEVANT_NO_MORE means that
|
fact that they were downgraded to RELEVANT_NO_MORE means that
|
||||||
dir_rename_counts already has the information we need for directory
|
dir_rename_counts already has the information we need for directory
|
||||||
rename detection. (A path which becomes RELEVANT_CONTENT in a
|
rename detection. (A path which becomes RELEVANT_CONTENT in a
|
||||||
subsequent commit will be removed from cached_irrelevant.)
|
subsequent commit will be removed from cached_irrelevant.)
|
||||||
|
|
||||||
The reason for (D) is that is how we determine whether the remember
|
The reason for `(D)` is that is how we determine whether the remember
|
||||||
renames optimization can be used. In particular, remembering that our
|
renames optimization can be used. In particular, remembering that our
|
||||||
sequence of merges looks like:
|
sequence of merges looks like:
|
||||||
|
|
||||||
|
....
|
||||||
Merge 1:
|
Merge 1:
|
||||||
MERGE_BASE: E
|
MERGE_BASE: E
|
||||||
MERGE_SIDE1: G
|
MERGE_SIDE1: G
|
||||||
|
|
@ -475,6 +499,7 @@ sequence of merges looks like:
|
||||||
MERGE_SIDE1: A'
|
MERGE_SIDE1: A'
|
||||||
MERGE_SIDE2: B
|
MERGE_SIDE2: B
|
||||||
=> Creates B'
|
=> Creates B'
|
||||||
|
....
|
||||||
|
|
||||||
It is the fact that the trees A and A' appear both in Merge 1 and in
|
It is the fact that the trees A and A' appear both in Merge 1 and in
|
||||||
Merge 2, with A as a parent of A' that allows this optimization. So
|
Merge 2, with A as a parent of A' that allows this optimization. So
|
||||||
|
|
@ -482,12 +507,11 @@ we store the trees to compare with what we are asked to merge next
|
||||||
time.
|
time.
|
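The tree comparison described above can be sketched as follows. This is a minimal illustration, not the merge-ort code: trees are plain strings, the function name `can_reuse_side1_renames` is hypothetical, and only the MERGE_SIDE1 case from the text is modelled.

```python
def can_reuse_side1_renames(prev_merge, prev_result, next_merge):
    """Check whether renames cached for MERGE_SIDE1 may be reused.

    prev_merge and next_merge are (base, side1, side2) tuples of tree
    names; prev_result is the tree the previous merge created.
    """
    _prev_base, _prev_side1, prev_side2 = prev_merge
    next_base, next_side1, _next_side2 = next_merge
    # The previous MERGE_SIDE2 (A) must be the new MERGE_BASE, and the
    # previous result (A') must be the new MERGE_SIDE1.
    return next_base == prev_side2 and next_side1 == prev_result


merge1 = ("E", "G", "A")     # creates A'
merge2 = ("A", "A'", "B")    # creates B'
print(can_reuse_side1_renames(merge1, "A'", merge2))
```

If the next merge is given any other base or side, the stored trees no longer match and the cached renames would have to be discarded in this sketch.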
||||||
|
|
||||||
|
|
||||||
=== 8. How directory rename detection interacts with the above and ===
|
== 9. How directory rename detection interacts with the above and why this optimization is still safe even if merge.directoryRenames is set to "true". ==
|
||||||
=== why this optimization is still safe even if ===
|
|
||||||
=== merge.directoryRenames is set to "true". ===
|
|
||||||
|
|
||||||
As noted in the assumptions section:
|
As noted in the assumptions section:
|
||||||
|
|
||||||
|
....
|
||||||
"""
|
"""
|
||||||
...if directory renames do occur, then the default of
|
...if directory renames do occur, then the default of
|
||||||
merge.directoryRenames being set to "conflict" means that the operation
|
merge.directoryRenames being set to "conflict" means that the operation
|
||||||
|
|
@ -497,11 +521,13 @@ As noted in the assumptions section:
|
||||||
is that some users will have set merge.directoryRenames to "true" to
|
is that some users will have set merge.directoryRenames to "true" to
|
||||||
allow the merges to continue to proceed automatically.
|
allow the merges to continue to proceed automatically.
|
||||||
"""
|
"""
|
||||||
|
....
|
||||||
|
|
||||||
Let's remember that we need to look at how any given pick affects the next
|
Let's remember that we need to look at how any given pick affects the next
|
||||||
one. So let's again use the first two picks from the diagram in section
|
one. So let's again use the first two picks from the diagram in section
|
||||||
one:
|
one:
|
||||||
|
|
||||||
|
....
|
||||||
First pick does this three-way merge:
|
First pick does this three-way merge:
|
||||||
MERGE_BASE: E
|
MERGE_BASE: E
|
||||||
MERGE_SIDE1: G
|
MERGE_SIDE1: G
|
||||||
|
|
@ -513,6 +539,7 @@ one:
|
||||||
MERGE_SIDE1: A'
|
MERGE_SIDE1: A'
|
||||||
MERGE_SIDE2: B
|
MERGE_SIDE2: B
|
||||||
=> creates B'
|
=> creates B'
|
||||||
|
....
|
||||||
|
|
||||||
Now, directory rename detection exists so that if one side of history
|
Now, directory rename detection exists so that if one side of history
|
||||||
renames a directory, and the other side adds a new file to the old
|
renames a directory, and the other side adds a new file to the old
|
||||||
|
|
@ -545,7 +572,7 @@ while considering all of these cases:
|
||||||
concerned; see the assumptions section). Two interesting sub-notes
|
concerned; see the assumptions section). Two interesting sub-notes
|
||||||
about these counts:
|
about these counts:
|
||||||
|
|
||||||
* If we need to perform rename-detection again on the given side (e.g.
|
** If we need to perform rename-detection again on the given side (e.g.
|
||||||
some paths are relevant for rename detection that weren't before),
|
some paths are relevant for rename detection that weren't before),
|
||||||
then we clear dir_rename_counts and recompute it, making use of
|
then we clear dir_rename_counts and recompute it, making use of
|
||||||
cached_pairs. The reason it is important to do this is optimizations
|
cached_pairs. The reason it is important to do this is optimizations
|
||||||
|
|
@ -556,7 +583,7 @@ while considering all of these cases:
|
||||||
easiest way to "fix up" dir_rename_counts in such cases is to just
|
easiest way to "fix up" dir_rename_counts in such cases is to just
|
||||||
recompute it.
|
recompute it.
|
||||||
|
|
||||||
* If we prune rename/rename(1to1) entries from the cache, then we also
|
** If we prune rename/rename(1to1) entries from the cache, then we also
|
||||||
need to update dir_rename_counts to decrement the counts for the
|
need to update dir_rename_counts to decrement the counts for the
|
||||||
involved directory and any relevant parent directories (to undo what
|
involved directory and any relevant parent directories (to undo what
|
||||||
update_dir_rename_counts() in diffcore-rename.c incremented when the
|
update_dir_rename_counts() in diffcore-rename.c incremented when the
|
||||||
|
|
@ -578,6 +605,7 @@ in order:
|
||||||
|
|
||||||
Case 1: MERGE_SIDE1 renames old dir, MERGE_SIDE2 adds new file to old dir
|
Case 1: MERGE_SIDE1 renames old dir, MERGE_SIDE2 adds new file to old dir
|
||||||
|
|
||||||
|
....
|
||||||
This case looks like this:
|
This case looks like this:
|
||||||
|
|
||||||
MERGE_BASE: E, Has olddir/
|
MERGE_BASE: E, Has olddir/
|
||||||
|
|
@ -595,10 +623,13 @@ Case 1: MERGE_SIDE1 renames old dir, MERGE_SIDE2 adds new file to old dir
|
||||||
* MERGE_SIDE1 has cached olddir/newfile -> newdir/newfile
|
* MERGE_SIDE1 has cached olddir/newfile -> newdir/newfile
|
||||||
Given the cached rename noted above, the second merge can proceed as
|
Given the cached rename noted above, the second merge can proceed as
|
||||||
expected without needing to perform rename detection from A -> A'.
|
expected without needing to perform rename detection from A -> A'.
|
||||||
|
....
|
||||||
|
|
||||||
Case 2: MERGE_SIDE1 renames old dir, MERGE_SIDE2 renames file into old dir
|
Case 2: MERGE_SIDE1 renames old dir, MERGE_SIDE2 renames file into old dir
|
||||||
|
|
||||||
|
....
|
||||||
This case looks like this:
|
This case looks like this:
|
||||||
|
|
||||||
MERGE_BASE: E oldfile, olddir/
|
MERGE_BASE: E oldfile, olddir/
|
||||||
MERGE_SIDE1: G oldfile, olddir/ -> newdir/
|
MERGE_SIDE1: G oldfile, olddir/ -> newdir/
|
||||||
MERGE_SIDE2: A oldfile -> olddir/newfile
|
MERGE_SIDE2: A oldfile -> olddir/newfile
|
||||||
|
|
@ -617,9 +648,11 @@ Case 2: MERGE_SIDE1 renames old dir, MERGE_SIDE2 renames file into old dir
|
||||||
|
|
||||||
Given the cached rename noted above, the second merge can proceed as
|
Given the cached rename noted above, the second merge can proceed as
|
||||||
expected without needing to perform rename detection from A -> A'.
|
expected without needing to perform rename detection from A -> A'.
|
||||||
|
....
|
||||||
|
|
||||||
Case 3: MERGE_SIDE1 adds new file to old dir, MERGE_SIDE2 renames old dir
|
Case 3: MERGE_SIDE1 adds new file to old dir, MERGE_SIDE2 renames old dir
|
||||||
|
|
||||||
|
....
|
||||||
This case looks like this:
|
This case looks like this:
|
||||||
|
|
||||||
MERGE_BASE: E, Has olddir/
|
MERGE_BASE: E, Has olddir/
|
||||||
|
|
@ -635,9 +668,11 @@ Case 3: MERGE_SIDE1 adds new file to old dir, MERGE_SIDE2 renames old dir
|
||||||
In this case, with the optimization, note that after the first commit there
|
In this case, with the optimization, note that after the first commit there
|
||||||
were no renames on MERGE_SIDE1, and any renames on MERGE_SIDE2 are tossed.
|
were no renames on MERGE_SIDE1, and any renames on MERGE_SIDE2 are tossed.
|
||||||
But the second merge didn't need any renames so this is fine.
|
But the second merge didn't need any renames so this is fine.
|
||||||
|
....
|
||||||
|
|
||||||
Case 4: MERGE_SIDE1 renames file into old dir, MERGE_SIDE2 renames old dir
|
Case 4: MERGE_SIDE1 renames file into old dir, MERGE_SIDE2 renames old dir
|
||||||
|
|
||||||
|
....
|
||||||
This case looks like this:
|
This case looks like this:
|
||||||
|
|
||||||
MERGE_BASE: E, Has olddir/
|
MERGE_BASE: E, Has olddir/
|
||||||
|
|
@ -658,6 +693,7 @@ Case 4: MERGE_SIDE1 renames file into old dir, MERGE_SIDE2 renames old dir
|
||||||
|
|
||||||
Given the cached rename noted above, the second merge can proceed as
|
Given the cached rename noted above, the second merge can proceed as
|
||||||
expected without needing to perform rename detection from A -> A'.
|
expected without needing to perform rename detection from A -> A'.
|
||||||
|
....
|
||||||
|
|
||||||
Finally, I'll just note here that interactions with the
|
Finally, I'll just note here that interactions with the
|
||||||
skip-irrelevant-renames optimization means we sometimes don't detect
|
skip-irrelevant-renames optimization means we sometimes don't detect
|
||||||
|
|
|
||||||
File diff suppressed because it is too large