Notes on Subproject Support
===========================
Junio C Hamano

Scenario
--------

The examples in the following discussion show how this proposal
plans to support the following scenario:

. A project to build an embedded Linux appliance "gadget" is
  maintained with git.

. The project uses the linux-2.6 kernel as its subcomponent.  It
  starts from a particular version of the mainline kernel, but
  adds its own code and build infrastructure to fit the
  appliance's needs.

. The working tree of the project is laid out this way:
+
------------
 Makefile      - Builds the whole thing.
 linux-2.6/    - The kernel, perhaps modified for the project.
 appliance/    - Applications that run on the appliance, and
                 other bits.
------------

. The project is willing to maintain its own changes outside the
  Linux kernel project's tree, but wants to be able to feed
  the changes upstream, and incorporate upstream changes into its
  own tree, taking advantage of the fact that both itself and
  the Linux kernel project are version controlled with git.

. To make the story a bit more interesting, later in the history
  of development, the `linux-2.6/` and `appliance/` directories
  will be renamed to `kernel/` and `gadget/`.

The idea here is to:

. Keep the `linux-2.6/` part as an independent project.  The work
  by the project on the kernel part can be naturally exchanged
  with the other kernel developers this way.  Specifically, a tree
  object contained in commit objects belonging to this project
  does *not* have a `linux-2.6/` directory at the top.

. Keep the `appliance/` part as another independent project.
  Applications are supposed to be more or less independent from
  the kernel version, but some other bits might be tied to a
  specific kernel version.  Again, a tree object contained in
  commit objects belonging to this project does *not* have
  an `appliance/` directory at the top.

. Have another project that combines the whole thing together,
  so that the project can keep track of which versions of the
  parts are built together.

We will call the project that binds things together the
'toplevel project'.  The other projects, which hold the
`linux-2.6/` and `appliance/` parts, are called 'subprojects'.


Setting up
----------

Let's say we have been working on the appliance software,
independently version controlled with git.  The kernel part
has also been version controlled separately, like this:
------------
$ ls -dF current/*/.git current/*
current/Makefile    current/appliance/.git/  current/linux-2.6/.git/
current/appliance/  current/linux-2.6/
------------

Now we want to create a combined project.  First we would
clone from these repositories (which is not strictly needed --
we could use `$GIT_ALTERNATE_OBJECT_DIRECTORIES` instead):

------------
$ mkdir combined && cd combined
$ cp ../current/Makefile .
$ git init-db
$ mkdir -p .git/refs/subs/{kernel,gadget}/{heads,tags}
$ git clone-pack ../current/linux-2.6/ master | read kernel_commit junk
$ git clone-pack ../current/appliance/ master | read gadget_commit junk
------------

We will introduce a new command to set up a combined project:

------------
$ git bind-projects \
	$kernel_commit linux-2.6/ \
	$gadget_commit appliance/
------------

This would probably do an equivalent of:

------------
$ rm -f "$GIT_DIR/index"
$ git read-tree --prefix=linux-2.6/ $kernel_commit
$ git read-tree --prefix=appliance/ $gadget_commit
$ git update-index --bind $kernel_commit linux-2.6/
$ git update-index --bind $gadget_commit appliance/
------------

[NOTE]
============
Earlier outlines sent to the git mailing list talked
about `$GIT_DIR/bind` to record which subprojects are bound to
which subtree in the current working tree and index.  This
proposal instead records that information in the index file
with the `update-index --bind` command.

Also note that in this round of the proposal, there are no
separate branches that keep track of heads of subprojects.
============

Let's not forget to add the `Makefile`, and check the whole
thing out from the index file.
------------
$ git add Makefile
$ git checkout-index -f -u -q -a
------------

Now our directory should be identical to the `current`
directory.  After making sure of that, we should be able to
commit the whole thing:

------------
$ diff -x .git -r ../current ../combined
$ git commit -m 'Initial toplevel project commit'
------------

This should create a new commit object that records what is in
the index file as its tree, with `bind` lines to record which
subproject commit objects are bound at what subdirectory, and
update `$GIT_DIR/refs/heads/master`.  Such a commit object
might look like this:
------------
tree 04803b09c300c8325258ccf2744115acc4c57067
bind 5b2bcc7b2d546c636f79490655b3347acc91d17f linux-2.6/
bind 0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/
author Junio C Hamano <junio@kernel.org> 1137965565 -0800
committer Junio C Hamano <junio@kernel.org> 1137965565 -0800

Initial toplevel project commit
------------
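
Tools that need the binding information could read it straight
from the commit header.  The following is only a sketch: the
helper name `list_binds` is made up for illustration, and it
assumes `bind` headers take exactly the form shown above, which
no released git writes.

```shell
# Print "<commit> <path>" for every `bind` header of a commit object on stdin.
# Assumption: `bind` headers look like the example in this note; they are
# part of the proposal, not of any released git.
list_binds () {
	# Headers end at the first empty line; stop there so that a log
	# message line starting with "bind " is never taken for a header.
	sed -n -e '/^$/q' -e 's/^bind \([0-9a-f]\{40\}\) \(.*\)$/\1 \2/p'
}
```

With a git that implemented this proposal, one would presumably
feed it with `git cat-file commit HEAD | list_binds`.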


Notice that `Makefile` at the top is part of the toplevel
project in this example, but that is not necessary.  We could
instead have the appliance subproject include this file.  In
such a setup, the appliance subproject would have `Makefile`
and the `appliance/` directory at its toplevel.  The `bind` line
for that project would have said "the rest is bound at `/`" and
`write-tree \--exclude=linux-2.6/` would have been used to write
the tree for that subproject out of the combined index.


Making further commits
----------------------

The easiest case is when you updated the Makefile without
changing anything in the subprojects.  In such a case, we just
need to create a new commit object that records the new tree
with the current `HEAD` as its parent, and with the same set of
`bind` lines.

When we have changes to the subproject part, we would make a
separate commit to the subproject part and then record the whole
thing by making a commit to the toplevel project.  The user
interaction might go this way:
------------
$ git commit
error: you have changes to the subproject bound at linux-2.6/.
$ git commit --subproject linux-2.6/
$ git commit
------------

With the new `\--subproject` option, the directory structure
rooted at the `linux-2.6/` part is written out as a tree, and a
new commit object is created that records that tree object, with
the commit bound to that portion of the tree (`5b2bcc7b` in the
above example) as its parent.  Then the final `git commit`
would record the whole tree with an updated `bind` line for the
`linux-2.6/` part.
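
The object-creation half of `\--subproject` can be approximated
with plumbing that exists today, `git write-tree --prefix` and
`git commit-tree`.  This is a rough sketch, not the proposed
implementation: `commit_subproject` is a hypothetical helper, and
recording the new commit in the index is left out because
`update-index --bind` is only proposed in this note.

```shell
# Hypothetical helper: commit the subtree at <prefix> as a subproject commit.
# usage: commit_subproject <prefix> <bound-commit> <message>
# Prints the new commit's object name; it does not update any branch head.
commit_subproject () {
	prefix=$1 parent=$2 message=$3
	# Write only the part of the index under <prefix> as a tree ...
	tree=$(git write-tree --prefix="$prefix") || return 1
	# ... and record it with the previously bound commit as its parent.
	git commit-tree "$tree" -p "$parent" -m "$message"
}
```

For the example above this might be called as
`commit_subproject linux-2.6/ 5b2bcc7b 'Fix the driver'`, where the
message is of course made up.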


Checking out
------------

After cloning such a toplevel project, `git clone` without the
`-n` option would check out the working tree.  This is done by
reading the tree object recorded in the commit object (which
records the whole thing), and adding the information from the
`bind` lines to the index file.

------------
$ cd ..
$ git clone -n combined cloned ;# clone the one we created earlier
$ cd cloned
$ git checkout
------------

This round of the proposal does not maintain separate branch
heads for subprojects.  The bound commits and their
subdirectories are recorded in the index file from the commit
object, so there is no need to do anything other than updating
the index and the working tree.


Switching branches
------------------

Along with the traditional two-way merge by `read-tree -m -u`,
we would need to look at:

. `bind` lines in the current `HEAD` commit.

. `bind` lines in the commit we are switching to.

. subproject binding information in the index file.

to make sure we do sensible things.

Just as, until very recently, we did not allow switching
branches when a two-way merge would lose local changes, we can
start by refusing to switch branches when the subprojects bound
in the index do not match what is recorded in the `HEAD` commit.

Because in this round of the proposal we use neither the
`$GIT_DIR/bind` file nor separate branches to keep track of
heads of the subprojects, nothing other than the working tree
and the index file needs to be updated when switching branches.


Merging
-------

Merging two branches of the toplevel project can use the
traditional merging mechanism mostly unchanged.  The merge base
computation can be done using the `parent` ancestry information
taken from the two toplevel project branch heads being merged,
and merging of the whole tree can be done with a three-way merge
of the whole tree using the merge base and two head commits.
For reasons described later, we would not merge the subproject
parts of the trees during this step, though.

When the two branch heads use different versions of a
subproject, things get a bit tricky.  First, let's forget for a
moment about the case where they bind the same project at
different locations.  We would refuse the merge unless both heads
have the same number of `bind` lines, binding something at the
same subdirectories.

------------
$ git merge 'Merge in a side branch' HEAD side
error: the merged heads have subprojects bound at different places.
 ours:
	linux-2.6/
	appliance/
 theirs:
	kernel/
	gadget/
	manual/
------------

Such renaming can be handled by first moving the bind points in
our branch, and redoing the merge (this is a rare operation
anyway).  It might go like this:

------------
$ git reset
$ git update-index --unbind linux-2.6/
$ git update-index --unbind appliance/
$ git update-index --bind $kernel_commit kernel/
$ git update-index --bind $gadget_commit gadget/
$ git commit -m 'Prepare for merge with side branch'
$ git merge 'Merge in a side branch' HEAD side
error: the merged heads have subprojects bound at different places.
 ours:
	kernel/
	gadget/
 theirs:
	kernel/
	gadget/
	manual/
------------

Their branch added another subproject, so this did not work (or
it could be the other way around -- we might have been the one
with the `manual/` subproject while they didn't have it).  This
suggests that we may want an option to `git merge` to allow
taking a union of subprojects.  Again, this is a rare operation,
and always taking a union would have created a toplevel project
that had both `kernel/` and `linux-2.6/` bound to the same Linux
kernel project from possibly different vintages, so it would be
prudent to require the set of bound subprojects to match exactly
by default, and give the user an option to take a union.
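
The exact-match requirement boils down to comparing two sets of
bind points.  As a minimal sketch, assume each side's bindings
have already been dumped into a file of `<commit> <path>` lines
(that file format and both function names are inventions for
this illustration only):

```shell
# Print the bind points (paths) from a file of "<commit> <path>" lines, sorted.
bind_points () {
	sed -e 's/^[0-9a-f]* //' "$1" | LC_ALL=C sort
}

# Succeed only when both files name exactly the same set of bind points;
# the commits bound there are allowed to differ.
same_bind_points () {
	test "$(bind_points "$1")" = "$(bind_points "$2")"
}
```

A merge driver could then run `same_bind_points ours theirs` and
print the error shown earlier when it fails.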


------------
$ git merge --union-subprojects 'Merge in a side branch' HEAD side
error: the subproject at 'kernel/' needs to be merged first.
------------

Here, the version of the Linux kernel project in the `side`
branch was different from what our branch had on our `bind`
line.  On what kind of difference should we give this error?
Initially, I think we could require that one be a fast-forward
of the other (ours might be ahead of theirs, or the other way
around), and take the descendant.
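
The fast-forward rule might be sketched as follows.  The helper
names are hypothetical; `git merge-base --is-ancestor` is an
existing command in current git, used here as one possible way
to test ancestry between the two bound commits.

```shell
# True when commit $1 is an ancestor of (or the same as) commit $2.
is_ancestor () {
	git merge-base --is-ancestor "$1" "$2"
}

# Print whichever of the two bound commits is the descendant of the other.
# Fails, printing nothing, when neither is a fast-forward of the other.
pick_descendant () {
	if is_ancestor "$1" "$2"
	then
		echo "$2"
	elif is_ancestor "$2" "$1"
	then
		echo "$1"
	else
		return 1
	fi
}
```

When `pick_descendant` fails, the two subproject heads have
diverged and the error shown above would be appropriate.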


Or we could do an independent merge of the subproject heads,
using the `parent` ancestry of the bound subproject heads to
find their merge base and doing a three-way merge.  This would
leave the merge result in the subproject part of the working
tree and the index.

[NOTE]
This is the reason we did not do the whole-tree three-way merge
earlier.  The subproject commit bound to the merge base commit
used for the toplevel project may not be the merge base between
the subproject commits bound to the two toplevel project
commits.

So let's first deal with the case of merging only a subproject
part into our tree.


Merging subprojects
-------------------

An operation of more practical importance is to be able to merge
changes made elsewhere into the projects bound to our toplevel
project.

------------
$ git pull --subproject=kernel/ git://git.kernel.org/.../linux-2.6/
------------

might do:

. fetch the current `HEAD` commit from Linus.
. find the subproject commit bound at the `kernel/` subtree.
. perform the usual three-way merge of these two commits, in
  the `kernel/` part of the working tree.

After that, `git commit` with the `\--subproject` option would be
needed to make a commit.

[NOTE]
This suggests that we would need to have something similar to
`MERGE_HEAD` for merging the subproject part.  In the case of
merging two toplevel project commits, we probably can read the
`bind` lines from the `MERGE_HEAD` commit and either our `HEAD`
commit or our index file.  Further, we probably would require
that the latter two must match, just as we currently require
that the index file match our `HEAD` commit before `git merge`.

Just like the current `pull = fetch + merge` semantics, the
subproject-aware version `git pull \--subproject=frotz/` would be
a `git fetch \--subproject=frotz/` followed by a
`git merge \--subproject=frotz/`.  So the above would be:

. Fetch the head.
+
------------
$ git fetch --subproject=kernel/ git://git.kernel.org/.../linux-2.6/
------------
+
which would fetch the commit chain from the remote repository, and
write something like this to `FETCH_HEAD`:
+
------------
3ee68c4...\tfor-merge-into kernel/\tbranch 'master' of git://.../linux-2.6
------------

. Run `git merge`.
+
------------
$ git merge --subproject=kernel/ \
	'Merge git://.../linux-2.6 into kernel/' HEAD 3ee68c4...
------------

. In case it does not cleanly automerge, `git merge` would write
  the necessary information for a later `git commit` to use in
  `MERGE_HEAD`.  It may look like this:
+
------------
3ee68c4af3fd7228c1be63254b9f884614f9ebb2 kernel/
------------
+
Similarly, the `MERGE_MSG` file will hold the merge message.

With this, a later invocation of `git commit` to record the
result of hand resolving would be able to notice that:

. We should be first resolving the `kernel/` subproject, not the
  whole thing.
. The remote `HEAD` is the `3ee68c4\...` commit.
. The merge message is `Merge git://\.../linux-2.6 into kernel/`.

and would make a merge commit, and register that resulting
commit in the index file using `update-index \--bind` instead of
updating *any* branch head.
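
How `git commit` could tell a subproject merge from a toplevel
one is easy to picture.  The sketch below assumes the two-field
`MERGE_HEAD` line shown above; the function and variable names
are hypothetical and exist only for this illustration.

```shell
# Split a proposed subproject MERGE_HEAD line, "<commit> <bind point>",
# read from the file named by $1.  Sets $merge_commit and $merge_prefix;
# $merge_prefix stays empty for a traditional MERGE_HEAD that holds
# only a commit object name.
read_subproject_merge_head () {
	merge_commit= merge_prefix=
	read -r merge_commit merge_prefix <"$1"
}
```

A non-empty `$merge_prefix` would tell the caller to resolve and
commit that subproject first, rather than the whole tree.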


Management of Subprojects
-------------------------

While the above mechanism would support version controlling
subprojects as a part of *one* larger toplevel project, it
probably is worth pointing out that having a separate repository
to manage the subproject independently would be a good idea.
The same subproject can be incorporated into more than one
toplevel project, and after all, a subproject should be
something that can stand on its own.  In our example scenario,
the `kernel/` project is used as a subproject for the "gadget"
product, but at the same time, the organization that runs the
"gadget" project may use Linux on their development machines,
and have their own kernel hackers, not necessarily related to
the use of the kernel in the "gadget" product.

What this suggests is that not only do we need to be able to
pull the kernel development history *into* the subproject of the
"gadget" project, but we also need to be able to push the
development history of the kernel part alone *out* *of* the
"gadget" project to another repository that deals only with the
kernel part.

It might go this way.  First the setup:

------------
$ git clone git://git.kernel.org/.../linux-2.6 Linux
$ ls -dF *
cloned/  combined/  current/  Linux/
------------

That is, in addition to the `combined/` repository in which we
have been developing the "gadget" product, we now have a
repository for the kernel, cloned from Linus.  In the previous
section, we outlined how we update the kernel subproject part of
the `combined/` repository from the `kernel.org` repository.  The
same procedure would work for pulling from the `Linux/`
repository here.

We are now going the other way: propagating the kernel work done
in the "gadget" project repository `combined/` back to `Linux/`.
We might do this at the lowest level:

------------
$ cd combined
$ git cat-file commit HEAD |
  sed -ne 's|^bind \([0-9a-f]*\) kernel/$|\1|p' >.git/refs/heads/linux26
$ git push ../Linux linux26:master
------------

Or, more realistically, since the `Linux` project might already
have its own commits on its `master`:

------------
$ cd Linux
$ git pull ../combined linux26
------------

Either way we would need an easy way to maintain the `linux26`
branch in the above example, and that will have to be part of
the wrapper scripts like `git commit` (more likely, that would
be a job for `git commit \--subproject`) for usability's
sake; in other words, the `cat-file commit` piped to `sed` above
is not something the end user would do, but something that is
done by the wrapper scripts.

Hopefully the people who work in the `Linux/` repository would
run `format-patch` and feed their changes back to the kernel
community.