
2 changed files with 483 additions and 0 deletions
@@ -0,0 +1,11 @@
|
||||
all: |
||||
|
||||
clean: |
||||
rm -f Subpro.html |
||||
|
||||
|
||||
all: Subpro.html |
||||
|
||||
%.html: %.txt |
||||
asciidoc -bxhtml11 $*.txt |
||||
|
@@ -0,0 +1,472 @@
|
||||
Notes on Subproject Support |
||||
=========================== |
||||
Junio C Hamano |
||||
|
||||
Scenario |
||||
-------- |
||||
|
||||
The examples in the following discussion show how this proposal |
||||
would support the following scenario: |
||||
|
||||
. A project to build an embedded Linux appliance "gadget" is |
||||
maintained with git. |
||||
|
||||
. The project uses linux-2.6 kernel as its subcomponent. It |
||||
starts from a particular version of the mainline kernel, but |
||||
adds its own code and build infrastructure to fit the |
||||
appliance's needs. |
||||
|
||||
. The working tree of the project is laid out this way: |
||||
+ |
||||
------------ |
||||
Makefile - Builds the whole thing. |
||||
linux-2.6/ - The kernel, perhaps modified for the project. |
||||
appliance/ - Applications that run on the appliance, and |
||||
other bits. |
||||
------------ |
||||
|
||||
. The project is willing to maintain its own changes out of tree |
||||
of the Linux kernel project, but would want to be able to feed |
||||
the changes upstream, and incorporate upstream changes to its |
||||
own tree, taking advantage of the fact that both itself and |
||||
the Linux kernel project are version controlled with git. |
||||
|
||||
. To make the story a bit more interesting, later in the history |
||||
of development, `linux-2.6/` and `appliance/` directories will |
||||
be renamed to `kernel/` and `gadget/`. |
||||
|
||||
The idea here is to: |
||||
|
||||
. Keep `linux-2.6/` part as an independent project. The work by |
||||
the project on the kernel part can be naturally exchanged with |
||||
the other kernel developers this way. Specifically, a tree |
||||
object contained in commit objects belonging to this project |
||||
does *not* have `linux-2.6/` directory at the top. |
||||
|
||||
. Keep the `appliance/` part as another independent project. |
||||
Applications are supposed to be more or less independent from |
||||
the kernel version, but some other bits might be tied to a |
||||
specific kernel version. Again, a tree object contained in |
||||
commit objects belonging to this project does *not* have |
||||
`appliance/` directory at the top. |
||||
|
||||
. Have another project that combines the whole thing together, |
||||
so that the project can keep track of which versions of the |
||||
parts are built together. |
||||
|
||||
We will call the project that binds things together the |
||||
'toplevel project'. Other projects that hold `linux-2.6/` part |
||||
and `appliance/` part are called 'subprojects'. |
||||
|
||||
|
||||
Setting up |
||||
---------- |
||||
|
||||
Let's say we have been working on the appliance software, |
||||
independently version controlled with git. Also the kernel part |
||||
has been version controlled separately, like this: |
||||
------------ |
||||
$ ls -dF current/*/.git current/* |
||||
current/Makefile current/appliance/.git/ current/linux-2.6/.git/ |
||||
current/appliance/ current/linux-2.6/ |
||||
------------ |
||||
|
||||
Now we would want to get a combined project. First we would |
||||
clone from these repositories (which is not strictly needed -- |
||||
we could use `$GIT_ALTERNATE_OBJECT_DIRECTORIES` instead): |
||||
|
||||
------------ |
||||
$ mkdir combined && cd combined |
||||
$ cp ../current/Makefile . |
||||
$ git init-db |
||||
$ mkdir -p .git/refs/subs/{kernel,gadget}/{heads,tags} |
||||
$ kernel_commit=$(git clone-pack ../current/linux-2.6/ master | awk 'NR == 1 { print $1 }') |
||||
$ gadget_commit=$(git clone-pack ../current/appliance/ master | awk 'NR == 1 { print $1 }') |
||||
------------ |
||||
|
||||
We will introduce a new command to set up a combined project: |
||||
|
||||
------------ |
||||
$ git bind-projects \ |
||||
$kernel_commit linux-2.6/ \ |
||||
$gadget_commit appliance/ |
||||
------------ |
||||
|
||||
This would probably do an equivalent of: |
||||
|
||||
------------ |
||||
$ rm -f "$GIT_DIR/index" |
||||
$ git read-tree --prefix=linux-2.6/ $kernel_commit |
||||
$ git read-tree --prefix=appliance/ $gadget_commit |
||||
$ git update-index --bind $kernel_commit linux-2.6/ |
||||
$ git update-index --bind $gadget_commit appliance/ |
||||
------------ |
||||
[NOTE] |
||||
============ |
||||
Earlier outlines sent to the git mailing list talked |
||||
about `$GIT_DIR/bind` to record which subprojects are bound to |
||||
which subtree in the current working tree and index. This |
||||
proposal instead records that information in the index file |
||||
with `update-index --bind` command. |
||||
|
||||
Also note that in this round of the proposal, there are no separate |
||||
branches that keep track of heads of subprojects. |
||||
============ |
||||
|
||||
Let's not forget to add the `Makefile`, and check the whole |
||||
thing out from the index file. |
||||
------------ |
||||
$ git add Makefile |
||||
$ git checkout-index -f -u -q -a |
||||
------------ |
||||
|
||||
Now our directory should be identical with the `current` |
||||
directory. After making sure of that, we should be able to |
||||
commit the whole thing: |
||||
|
||||
------------ |
||||
$ diff -x .git -r ../current ../combined |
||||
$ git commit -m 'Initial toplevel project commit' |
||||
------------ |
||||
|
||||
Which should create a new commit object that records what is in |
||||
the index file as its tree, with `bind` lines to record which |
||||
subproject commit objects are bound at what subdirectory, and |
||||
updates the `$GIT_DIR/refs/heads/master`. Such a commit object |
||||
might look like this: |
||||
------------ |
||||
tree 04803b09c300c8325258ccf2744115acc4c57067 |
||||
bind 5b2bcc7b2d546c636f79490655b3347acc91d17f linux-2.6/ |
||||
bind 0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/ |
||||
author Junio C Hamano <junio@kernel.org> 1137965565 -0800 |
||||
committer Junio C Hamano <junio@kernel.org> 1137965565 -0800 |
||||
|
||||
Initial toplevel project commit |
||||
------------ |
||||
|
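The `bind` header lines can be pulled out of such a commit object with ordinary text tools. The following is a minimal sketch, assuming the proposed commit format shown above (no released git writes `bind` headers); `list_binds` and `sample_commit` are hypothetical helper names, and the sample text is copied from the example commit.

```shell
# list_binds: read a commit object (as printed by `git cat-file commit`)
# on stdin and print "<commit> <directory>" for each proposed bind line.
# Only the header is scanned: sed quits at the first blank line, so text
# in the commit message is never mistaken for a header.
list_binds () {
	sed -n -e '/^$/q' -e 's/^bind \([0-9a-f]*\) \(.*\)$/\1 \2/p'
}

# sample_commit: the example toplevel commit from this document.
sample_commit () {
	cat <<'EOF'
tree 04803b09c300c8325258ccf2744115acc4c57067
bind 5b2bcc7b2d546c636f79490655b3347acc91d17f linux-2.6/
bind 0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/
author Junio C Hamano <junio@kernel.org> 1137965565 -0800
committer Junio C Hamano <junio@kernel.org> 1137965565 -0800

Initial toplevel project commit
EOF
}

sample_commit | list_binds
```

In a real repository the input would presumably come from `git cat-file commit HEAD` rather than from the canned sample.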
||||
Notice that `Makefile` at the top is part of the toplevel |
||||
project in this example, but it is not necessary. We could |
||||
instead have the appliance subproject include this file. In |
||||
such a setup, the appliance subproject would have had `Makefile` |
||||
and `appliance/` directory at the toplevel. The `bind` line for |
||||
that project would have said "the rest is bound at `/`" and |
||||
`write-tree \--exclude=linux-2.6/` would have been used to write |
||||
the tree for that subproject out of the combined index. |
||||
|
||||
|
||||
Making further commits |
||||
---------------------- |
||||
|
||||
The easiest case is when you updated the Makefile without |
||||
changing anything in the subprojects. In such a case, we just |
||||
need to create a new commit object that records the new tree |
||||
with the current `HEAD` as its parent, and with the same set of |
||||
`bind` lines. |
||||
|
||||
When we have changes to the subproject part, we would make a |
||||
separate commit to the subproject part and then record the whole |
||||
thing by making a commit to the toplevel project. The user |
||||
interaction might go this way: |
||||
------------ |
||||
$ git commit |
||||
error: you have changes to the subproject bound at linux-2.6/. |
||||
$ git commit --subproject linux-2.6/ |
||||
$ git commit |
||||
------------ |
||||
|
||||
With the new `\--subproject` option, the directory structure |
||||
rooted at `linux-2.6/` part is written out as a tree, and a new |
||||
commit object that records that tree object with the commit |
||||
bound to that portion of the tree (`5b2bcc7b` in the above |
||||
example) as its parent is created. Then the final `git commit` |
||||
would record the whole tree with updated `bind` line for the |
||||
`linux-2.6/` part. |
||||
|
||||
|
||||
Checking out |
||||
------------ |
||||
|
||||
After cloning such a toplevel project, `git clone` without `-n` |
||||
option would check out the working tree. This is done by |
||||
reading the tree object recorded in the commit object (which |
||||
records the whole thing), and adding the information from the |
||||
"bind" line to the index file. |
||||
|
||||
------------ |
||||
$ cd .. |
||||
$ git clone -n combined cloned ;# clone the one we created earlier |
||||
$ cd cloned |
||||
$ git checkout |
||||
------------ |
||||
|
||||
This round of proposal does not maintain separate branch heads |
||||
for subprojects. The bound commits and their subdirectories |
||||
are recorded in the index file from the commit object, so there |
||||
is no need to do anything other than updating the index and the |
||||
working tree. |
||||
|
||||
|
||||
Switching branches |
||||
------------------ |
||||
|
||||
Along with the traditional two-way merge by `read-tree -m -u`, |
||||
we would need to look at: |
||||
|
||||
. `bind` lines in the current `HEAD` commit. |
||||
|
||||
. `bind` lines in the commit we are switching to. |
||||
|
||||
. subproject binding information in the index file. |
||||
|
||||
to make sure we do sensible things. |
||||
|
||||
Just like until very recently we did not allow switching |
||||
branches when two-way merge would lose local changes, we can |
||||
start by refusing to switch branches when the subprojects bound |
||||
in the index do not match what is recorded in the `HEAD` commit. |
||||
|
||||
Because in this round of the proposal we do not use the |
||||
`$GIT_DIR/bind` file nor separate branches to keep track of |
||||
heads of the subprojects, there is nothing else other than the |
||||
working tree and the index file that needs to be updated when |
||||
switching branches. |
||||
|
||||
|
||||
Merging |
||||
------- |
||||
|
||||
Merging two branches of the toplevel project can use the |
||||
traditional merging mechanism mostly unchanged. The merge base |
||||
computation can be done using the `parent` ancestry information |
||||
taken from the two toplevel project branch heads being merged, |
||||
and merging of the whole tree can be done with a three-way merge |
||||
of the whole tree using the merge base and two head commits. |
||||
For reasons described later, we would not merge the subproject |
||||
parts of the trees during this step, though. |
||||
|
||||
When the two branch heads use different versions of subproject, |
||||
things get a bit tricky. First, let's forget for a moment about |
||||
the case where they bind the same project at different locations. |
||||
We would refuse if they do not have the same number of `bind` |
||||
lines that bind something at the same subdirectories. |
||||
|
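The refusal rule can be sketched as a comparison of the two sets of bind directories. This is only an illustration under assumptions: it reads the proposed `bind <commit> <dir>` header format from two files holding commit object text, and `bind_points` and `same_bind_points` are hypothetical names, not existing git commands.

```shell
# bind_points: print the sorted bind directories found in the header of
# a commit object given on stdin (proposed "bind <commit> <dir>" lines).
bind_points () {
	sed -n -e '/^$/q' -e 's/^bind [0-9a-f]* //p' | sort
}

# same_bind_points OURS_FILE THEIRS_FILE
# Succeeds only when both commits bind subprojects at exactly the same
# set of directories; otherwise reports both sets on stderr and fails.
same_bind_points () {
	ours=$(bind_points <"$1")
	theirs=$(bind_points <"$2")
	if [ "$ours" = "$theirs" ]
	then
		return 0
	fi
	{
		echo "error: the merged heads have subprojects bound at different places."
		echo "ours:"
		printf '%s\n' "$ours" | sed -e 's/^/    /'
		echo "theirs:"
		printf '%s\n' "$theirs" | sed -e 's/^/    /'
	} >&2
	return 1
}
```

The two input files could be produced with `git cat-file commit <head>`. Sorting makes the comparison independent of the order in which the `bind` lines were written, at the cost of listing the directories alphabetically in the diagnostic.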
||||
------------ |
||||
$ git merge 'Merge in a side branch' HEAD side |
||||
error: the merged heads have subprojects bound at different places. |
||||
ours: |
||||
linux-2.6/ |
||||
appliance/ |
||||
theirs: |
||||
kernel/ |
||||
gadget/ |
||||
manual/ |
||||
------------ |
||||
|
||||
Such renaming can be handled by first moving the bind points in |
||||
our branch, and redoing the merge (this is a rare operation |
||||
anyway). It might go like this: |
||||
|
||||
------------ |
||||
$ git reset |
||||
$ git update-index --unbind linux-2.6/ |
||||
$ git update-index --unbind appliance/ |
||||
$ git update-index --bind $kernel_commit kernel/ |
||||
$ git update-index --bind $gadget_commit gadget/ |
||||
$ git commit -m 'Prepare for merge with side branch' |
||||
$ git merge 'Merge in a side branch' HEAD side |
||||
error: the merged heads have subprojects bound at different places. |
||||
ours: |
||||
kernel/ |
||||
gadget/ |
||||
theirs: |
||||
kernel/ |
||||
gadget/ |
||||
manual/ |
||||
------------ |
||||
|
||||
Their branch added another subproject, so this did not work (or |
||||
it could be the other way around -- we might have been the one |
||||
with `manual/` subproject while they didn't). This suggests |
||||
that we may want an option to `git merge` to allow taking a |
||||
union of subprojects. Again, this is a rare operation, and |
||||
always taking a union would have created a toplevel project that |
||||
had both `kernel/` and `linux-2.6/` bound to the same Linux |
||||
kernel project from possibly different vintage, so it would be |
||||
prudent to require the set of bound subprojects to exactly match |
||||
and give the user an option to take a union. |
||||
|
||||
------------ |
||||
$ git merge --union-subprojects 'Merge in a side branch' HEAD side |
||||
error: the subproject at 'kernel/' needs to be merged first. |
||||
------------ |
||||
|
||||
Here, the version of the Linux kernel project in the `side` |
||||
branch was different from what our branch had on our `bind` |
||||
line. On what kind of difference should we give this error? |
||||
Initially, I think we could require that one be a fast-forward of |
||||
the other (ours might be ahead of theirs, or the other way |
||||
around), and take the descendant. |
||||
|
||||
Or we could do an independent merge of subprojects heads, using |
||||
the `parent` ancestry of the bound subproject heads to find |
||||
their merge-base and doing a three-way merge. This would leave |
||||
the merge result in the subproject part of the working tree and |
||||
the index. |
||||
|
||||
[NOTE] |
||||
This is the reason we did not do the whole-tree three-way merge |
||||
earlier. The subproject commit bound to the merge base commit |
||||
used for the toplevel project may not be the merge base between |
||||
the subproject commits bound to the two toplevel project |
||||
commits. |
||||
|
||||
So let's deal with the case to merge only a subproject part into |
||||
our tree first. |
||||
|
||||
|
||||
Merging subprojects |
||||
------------------- |
||||
|
||||
An operation of more practical importance is to be able to merge |
||||
in changes done outside to the projects bound to our toplevel |
||||
project. |
||||
|
||||
------------ |
||||
$ git pull --subproject=kernel/ git://git.kernel.org/.../linux-2.6/ |
||||
------------ |
||||
|
||||
might do: |
||||
|
||||
. fetch the current `HEAD` commit from Linus. |
||||
. find the subproject commit bound at the `kernel/` subtree. |
||||
. perform the usual three-way merge of these two commits, in |
||||
`kernel/` part of the working tree. |
||||
|
||||
After that, `git commit \--subproject` option would be needed to |
||||
make a commit. |
||||
|
||||
[NOTE] |
||||
This suggests that we would need to have something similar to |
||||
`MERGE_HEAD` for merging the subproject part. In the case of |
||||
merging two toplevel project commits, we probably can read the |
||||
`bind` lines from the `MERGE_HEAD` commit and either our `HEAD` |
||||
commit or our index file. Further, we probably would require |
||||
that the latter two must match, just as we currently require the |
||||
index file matches our `HEAD` commit before `git merge`. |
||||
|
||||
Just like the current `pull = fetch + merge` semantics, the |
||||
subproject aware version `git pull \--subproject=frotz/` would be |
||||
a `git fetch \--subproject=frotz/` followed by a `git merge |
||||
\--subproject=frotz/`. So the above would be: |
||||
|
||||
. Fetch the head. |
||||
+ |
||||
------------ |
||||
$ git fetch --subproject=kernel/ git://git.kernel.org/.../linux-2.6/ |
||||
------------ |
||||
+ |
||||
which would fetch the commit chain from the remote repository, and |
||||
write something like this to `FETCH_HEAD`: |
||||
+ |
||||
------------ |
||||
3ee68c4...\tfor-merge-into kernel/\tbranch 'master' of git://.../linux-2.6 |
||||
------------ |
||||
|
||||
. Run `git merge`. |
||||
+ |
||||
------------ |
||||
$ git merge --subproject=kernel/ \ |
||||
'Merge git://.../linux-2.6 into kernel/' HEAD 3ee68c4... |
||||
------------ |
||||
|
||||
. In case it does not cleanly automerge, `git merge` would write |
||||
the necessary information for a later `git commit` to use in |
||||
`MERGE_HEAD`. It may look like this: |
||||
+ |
||||
------------ |
||||
3ee68c4af3fd7228c1be63254b9f884614f9ebb2 kernel/ |
||||
------------ |
||||
+ |
||||
Similarly, `MERGE_MSG` file will hold the merge message. |
||||
|
||||
With this, a later invocation of `git commit` to record the |
||||
result of hand resolving would be able to notice that: |
||||
|
||||
. We should be first resolving `kernel/` subproject, not the |
||||
whole thing. |
||||
. The remote `HEAD` is `3ee68c4\...` commit. |
||||
. The merge message is `Merge git://\.../linux-2.6 into kernel/`. |
||||
|
||||
and would make a merge commit, and register that resulting |
||||
commit in the index file using `update-index \--bind` instead of |
||||
updating *any* branch head. |
||||
|
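How a later `git commit` might tell a subproject merge from a toplevel merge can be sketched by reading that one-line file. This assumes the proposed `MERGE_HEAD` layout of `<commit> <directory>` shown above, and treats a line with no directory as an ordinary toplevel merge; `parse_merge_head` is a hypothetical helper name.

```shell
# parse_merge_head: read one MERGE_HEAD line from stdin.
# Prints "subproject <dir> <commit>" when the proposed directory field
# is present, and "toplevel <commit>" when only a commit name is given.
# Fails on empty input.
parse_merge_head () {
	commit= dir=
	# `read` returns non-zero on a final line without a newline, but
	# still fills in the variables, so its status is ignored here.
	read -r commit dir || :
	[ -n "$commit" ] || return 1
	if [ -n "$dir" ]
	then
		echo "subproject $dir $commit"
	else
		echo "toplevel $commit"
	fi
}

echo '3ee68c4af3fd7228c1be63254b9f884614f9ebb2 kernel/' | parse_merge_head
```

A wrapper could branch on the first word of the output to decide whether to finish with `update-index --bind` or to update a branch head as usual.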
||||
|
||||
Management of Subprojects |
||||
------------------------- |
||||
|
||||
While the above as a mechanism would support version controlling |
||||
of subprojects as a part of *one* larger toplevel project, it |
||||
probably is worth pointing out that having a separate repository |
||||
to manage the subproject independently would be a good idea. |
||||
The same subproject can be incorporated into more than one |
||||
toplevel project, and after all, a subproject should be |
||||
something that can stand on its own. In our example scenario, |
||||
the `kernel/` project is used as a subproject for the "gadget" |
||||
product, but at the same time, the organization that runs the |
||||
"gadget" project may use Linux on their development machines, |
||||
and have their own kernel hackers, not necessarily related to |
||||
the use of the kernel in the "gadget" product. |
||||
|
||||
What this suggests is that not only do we need to be able to pull |
||||
the kernel development history *into* the subproject of the |
||||
"gadget" project, but we also need to be able to push the |
||||
development history of the kernel part alone *out* *of* the |
||||
"gadget" project to another repository that deals only with the |
||||
kernel part. |
||||
|
||||
It might go this way. First the setup: |
||||
|
||||
------------ |
||||
$ git clone git://git.kernel.org/.../linux-2.6 Linux |
||||
$ ls -dF * |
||||
cloned/ combined/ current/ Linux/ |
||||
------------ |
||||
|
||||
That is, in addition to the `combined/` which we have been using |
||||
to develop the "gadget" product in, we now have a repository for |
||||
the kernel, cloned from Linus. In the previous section, we have |
||||
outlined how we update the kernel subproject part of `combined/` |
||||
repository from the `kernel.org` repository. The same procedure |
||||
would work for pulling from `Linux/` repository here. |
||||
|
||||
We are now going the other way: propagating the kernel work done |
||||
in the "gadget" project repository `combined/` back to `Linux/`. |
||||
We might do this at the lowest level: |
||||
|
||||
------------ |
||||
$ cd combined |
||||
$ git cat-file commit HEAD | |
||||
sed -ne 's|^bind \([0-9a-f]*\) kernel/$|\1|p' >.git/refs/heads/linux26 |
||||
$ git push ../Linux linux26:master |
||||
------------ |
||||
|
||||
Or, more realistically, since the `Linux` project might already |
||||
have its own commits on its `master`: |
||||
|
||||
------------ |
||||
$ cd Linux |
||||
$ git pull ../combined linux26 |
||||
------------ |
||||
|
||||
Either way we would need an easy way to maintain the `linux26` |
||||
branch in the above example, and that will have to be part of |
||||
the wrapper scripts like `git commit` (more likely, that would |
||||
be a job for `git commit \--subproject`) for usability's |
||||
sake; in other words, the `cat-file commit` piped to `sed` above |
||||
is not something the end user would do, but something that is |
||||
done by the wrapper scripts. |
||||
|
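The lookup such a wrapper script would perform can be sketched as a small filter. This is an illustration only: it assumes the proposed `bind <commit> <dir>` header, `bound_commit` is a hypothetical name, and directories containing whitespace are not handled because the line is split into awk fields.

```shell
# bound_commit DIR: read a commit object on stdin and print the commit
# bound at DIR.  Exits non-zero when no such bind line is in the header.
bound_commit () {
	awk -v dir="$1" '
		/^$/ { exit }                    # blank line ends the header
		$1 == "bind" && $3 == dir {      # bind <commit> <dir>
			print $2
			found = 1
			exit
		}
		END { exit !found }
	'
}

# Example with a canned commit header; in a repository the input would
# come from `git cat-file commit HEAD` instead.
printf '%s\n' \
	'tree 04803b09c300c8325258ccf2744115acc4c57067' \
	'bind 5b2bcc7b2d546c636f79490655b3347acc91d17f kernel/' \
	'bind 0bdd79af62e8621359af08f0afca0ce977348ac7 gadget/' \
	'' \
	'Example toplevel commit' |
bound_commit kernel/
```

A wrapper could then hand the printed commit name to `git update-ref refs/heads/linux26` instead of redirecting into the ref file by hand, which also avoids writing an empty ref when nothing is bound at the directory.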
||||
Hopefully the people who work in `Linux/` repository would run |
||||
`format-patch` and feed their changes back to the kernel |
||||
community. |