Notes on Subproject Support
===========================
Junio C Hamano

Scenario
--------

The examples in the following discussion show how this proposal
is meant to help in a situation like this:

. A project to build an embedded Linux appliance "gadget" is
  maintained with git.

. The project uses the linux-2.6 kernel as a subcomponent. It
  starts from a particular version of the mainline kernel, but
  adds its own code and build infrastructure to fit the
  appliance's needs.

. The working tree of the project is laid out this way:
+
------------
 Makefile       - Builds the whole thing.
 linux-2.6/     - The kernel, perhaps modified for the project.
 appliance/     - Applications that run on the appliance, and
                  other bits.
------------

. The project is willing to maintain its own changes outside
  the Linux kernel project's tree, but wants to be able to feed
  those changes upstream and to incorporate upstream changes into
  its own tree, taking advantage of the fact that both it and
  the Linux kernel project are version controlled with git.

. To make the story a bit more interesting, later in the history
  of development, the `linux-2.6/` and `appliance/` directories
  will be renamed to `kernel/` and `gadget/`.

The idea here is to:

. Keep the `linux-2.6/` part as an independent project. The work
  done by the project on the kernel part can then be naturally
  exchanged with other kernel developers. Specifically, a tree
  object contained in commit objects belonging to this project
  does *not* have a `linux-2.6/` directory at the top.

. Keep the `appliance/` part as another independent project.
  Applications are supposed to be more or less independent of
  the kernel version, but some other bits might be tied to a
  specific kernel version. Again, a tree object contained in
  commit objects belonging to this project does *not* have
  an `appliance/` directory at the top.

. Have another project that combines the whole thing together,
  so that the project can keep track of which versions of the
  parts are built together.

We will call the project that binds things together the
'toplevel project'. The other projects, which hold the
`linux-2.6/` part and the `appliance/` part, are called
'subprojects'.


Setting up
----------

Let's say we have been working on the appliance software,
independently version controlled with git. Also the kernel part
has been version controlled separately, like this:
------------
$ ls -dF current/*/.git current/*
current/Makefile    current/appliance/.git/  current/linux-2.6/.git/
current/appliance/  current/linux-2.6/
------------

Now we would want to get a combined project. First we would
clone from these repositories (which is not strictly needed --
we could use `$GIT_ALTERNATE_OBJECT_DIRECTORIES` instead):

------------
$ mkdir combined && cd combined
$ cp ../current/Makefile .
$ git init-db
$ mkdir -p .git/refs/subs/{kernel,gadget}/{heads,tags}
$ kernel_commit=$(git clone-pack ../current/linux-2.6/ master | awk 'NR == 1 { print $1 }')
$ gadget_commit=$(git clone-pack ../current/appliance/ master | awk 'NR == 1 { print $1 }')
------------

We will introduce a new command to set up a combined project:

------------
$ git bind-projects \
    $kernel_commit linux-2.6/ \
    $gadget_commit appliance/
------------

This would probably do the equivalent of:

------------
$ rm -f "$GIT_DIR/index"
$ git read-tree --prefix=linux-2.6/ $kernel_commit
$ git read-tree --prefix=appliance/ $gadget_commit
$ git update-index --bind linux-2.6/ $kernel_commit
$ git update-index --bind appliance/ $gadget_commit
------------
[NOTE]
============
Earlier outlines sent to the git mailing list talked
about `$GIT_DIR/bind` to record which subprojects are bound to
which subtree in the current working tree and index. This
proposal instead records that information in the index file
with the `update-index \--bind` command.

Also note that in this round of the proposal, there are no
separate branches that keep track of the heads of subprojects.
============

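The two `read-tree \--prefix` steps can be tried with git as it
exists today; only the `update-index \--bind` steps are specific
to this proposal. The following is a minimal sketch, assuming a
scratch directory and two throwaway repositories. The file names
(`README`, `main.c`) and the `make_repo` helper are made up for
illustration, and `git fetch` stands in for `clone-pack` as the
way to obtain the subproject commits.

```shell
set -e
# Throwaway identity and scratch directory, so no real repository is touched.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
scratch=$(mktemp -d)

# make_repo DIR FILE: create a repository in DIR with one commit holding FILE.
make_repo () {
    git init -q "$1"
    echo "content of $2" >"$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "initial $2"
}
make_repo "$scratch/linux-2.6" README
make_repo "$scratch/appliance" main.c

git init -q "$scratch/combined"
cd "$scratch/combined"

# Bring each subproject history into the combined object store.
git fetch -q ../linux-2.6 HEAD
kernel_commit=$(git rev-parse FETCH_HEAD)
git fetch -q ../appliance HEAD
gadget_commit=$(git rev-parse FETCH_HEAD)

# Graft each subproject tree into the index under its own directory.
git read-tree --prefix=linux-2.6/ "$kernel_commit"
git read-tree --prefix=appliance/ "$gadget_commit"

# The index now lists both subprojects below their bind points.
combined_paths=$(git ls-files)
echo "$combined_paths"
```

What this sketch cannot show is the binding itself: after these
commands the index knows the paths, but not which commit each
subtree came from, which is the gap `update-index \--bind` is
meant to fill.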
Let's not forget to add the `Makefile`, and check the whole
thing out from the index file.
------------
$ git add Makefile
$ git checkout-index -f -u -q -a
------------

Now our directory should be identical to the `current`
directory. After making sure of that, we should be able to
commit the whole thing:

------------
$ diff -x .git -r ../current ../combined
$ git commit -m 'Initial toplevel project commit'
------------

This should create a new commit object that records what is in
the index file as its tree, with `bind` lines to record which
subproject commit objects are bound at which subdirectory, and
update `$GIT_DIR/refs/heads/master`. Such a commit object
might look like this:
------------
tree 04803b09c300c8325258ccf2744115acc4c57067
bind 5b2bcc7b2d546c636f79490655b3347acc91d17f linux-2.6/
bind 0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/
author Junio C Hamano <junio@kernel.org> 1137965565 -0800
committer Junio C Hamano <junio@kernel.org> 1137965565 -0800

Initial toplevel project commit
------------

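A tool that wants to know what is bound where would only need to
read the header part of such a commit object. The following is a
small sketch of that, assuming the `bind` header format shown
above (which is this proposal's, not something released git
writes); the `list_binds` helper name is hypothetical.

```shell
# The commit object text from the example above.
commit_text='tree 04803b09c300c8325258ccf2744115acc4c57067
bind 5b2bcc7b2d546c636f79490655b3347acc91d17f linux-2.6/
bind 0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/
author Junio C Hamano <junio@kernel.org> 1137965565 -0800
committer Junio C Hamano <junio@kernel.org> 1137965565 -0800

Initial toplevel project commit'

# list_binds: read commit text on stdin and print "<directory> <commit>" for
# each bind header. It quits at the first blank line, so the log message is
# never scanned.
list_binds () {
    sed -n -e '/^$/q' -e 's/^bind \([0-9a-f]\{40\}\) \(.*\)$/\2 \1/p'
}

binds=$(printf '%s\n' "$commit_text" | list_binds)
printf '%s\n' "$binds"
```

In a repository that actually had such commits, the input would
come from `git cat-file commit HEAD` instead of a shell variable.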
Notice that `Makefile` at the top is part of the toplevel
project in this example, but this is not required. We could
instead have the appliance subproject include this file. In
such a setup, the appliance subproject would have had `Makefile`
and the `appliance/` directory at its top level. The `bind` line
for that project would have said "the rest is bound at `/`" and
`write-tree \--exclude=linux-2.6/` would have been used to write
the tree for that subproject out of the combined index.


Making further commits
----------------------

The easiest case is when you updated the Makefile without
changing anything in the subprojects. In such a case, we just
need to create a new commit object that records the new tree
with the current `HEAD` as its parent, and with the same set of
`bind` lines.

When we have changes to the subproject part, we would make a
separate commit to the subproject part and then record the whole
thing by making a commit to the toplevel project. The user
interaction might go this way:
------------
$ git commit
error: you have changes to the subproject bound at linux-2.6/.
$ git commit --subproject linux-2.6/
$ git commit
------------

With the new `\--subproject` option, the directory structure
rooted at the `linux-2.6/` part is written out as a tree, and a
new commit object is created that records that tree, with the
commit currently bound to that portion of the tree (`5b2bcc7b`
in the above example) as its parent. Then the final
`git commit` would record the whole tree with an updated `bind`
line for the `linux-2.6/` part.


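The tree-writing half of the proposed `\--subproject` option can
be approximated with plumbing that git already has. The following
is a rough sketch, assuming `write-tree \--prefix` and
`commit-tree` are acceptable stand-ins for what
`git commit \--subproject` would do internally. The file name
`README`, its contents and the variable names are hypothetical,
and the last step (recording the new commit on a `bind` line) is
left out because no released git supports it.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
cd "$(mktemp -d)"
git init -q .

# Stand-in for the bound kernel commit: its tree has README at the top level.
blob_v1=$(echo v1 | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644 "$blob_v1" README
kernel_commit=$(echo 'kernel v1' | git commit-tree "$(git write-tree)")

# Toplevel view of the index: the kernel tree lives under linux-2.6/.
git read-tree --empty
git read-tree --prefix=linux-2.6/ "$kernel_commit"

# Stage a change inside the subproject part.
blob_v2=$(echo v2 | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644 "$blob_v2" linux-2.6/README

# Write out only the linux-2.6/ subtree, then commit it on top of the
# commit that was bound there.
sub_tree=$(git write-tree --prefix=linux-2.6/)
new_kernel_commit=$(echo 'kernel v2' |
    git commit-tree "$sub_tree" -p "$kernel_commit")
echo "$new_kernel_commit"
```

The resulting commit has the old bound commit as its parent and a
tree without `linux-2.6/` at the top, so it could be exchanged
with kernel developers like any other kernel commit.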
Checking out
------------

After cloning such a toplevel project, `git clone` without the
`-n` option would check out the working tree. This is done by
reading the tree object recorded in the commit object (which
records the whole thing), and adding the information from the
`bind` lines to the index file.

------------
$ cd ..
$ git clone -n combined cloned ;# clone the one we created earlier
$ cd cloned
$ git checkout
------------

This round of the proposal does not maintain separate branch
heads for subprojects. The bound commits and their
subdirectories are recorded in the index file from the commit
object, so there is no need to do anything other than updating
the index and the working tree.


Switching branches
------------------

Along with the traditional two-way merge by `read-tree -m -u`,
we would need to look at:

. `bind` lines in the current `HEAD` commit.

. `bind` lines in the commit we are switching to.

. subproject binding information in the index file.

to make sure we do sensible things.

Just as, until very recently, we did not allow switching
branches when the two-way merge would lose local changes, we can
start by refusing to switch branches when the subprojects bound
in the index do not match what is recorded in the `HEAD` commit.

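The refusal rule in the previous paragraph amounts to comparing
two sets of bindings. The following is a minimal sketch, assuming
both the `HEAD` commit and the index can be dumped as
`<commit> <directory>` lines; how the index would expose its
bindings is not specified by this proposal, so the listings, the
helper names and the error wording are all hypothetical.

```shell
# same_binds LIST_A LIST_B: succeed when both lists hold the same
# "<commit> <directory>" lines, regardless of their order.
same_binds () {
    [ "$(printf '%s\n' "$1" | sort)" = "$(printf '%s\n' "$2" | sort)" ]
}

# may_switch HEAD_BINDS INDEX_BINDS: refuse when the index has been rebound.
may_switch () {
    if same_binds "$1" "$2"; then
        echo "ok to switch"
    else
        echo "error: subprojects bound in the index differ from HEAD" >&2
        return 1
    fi
}

head_binds='5b2bcc7b2d546c636f79490655b3347acc91d17f linux-2.6/
0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/'

# Same bindings as HEAD, merely listed in another order.
index_binds='0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/
5b2bcc7b2d546c636f79490655b3347acc91d17f linux-2.6/'

# The kernel part has been rebound to a different commit.
rebound_binds='3ee68c4af3fd7228c1be63254b9f884614f9ebb2 linux-2.6/
0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/'

may_switch "$head_binds" "$index_binds"
may_switch "$head_binds" "$rebound_binds" || echo "switch refused"
```

Comparing whole lines means that a subproject rebound to another
commit and a subproject moved to another directory are both
treated as a mismatch.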
Because in this round of the proposal we use neither the
`$GIT_DIR/bind` file nor separate branches to keep track of
heads of the subprojects, nothing other than the working tree
and the index file needs to be updated when switching branches.


Merging
-------

Merging two branches of the toplevel project can use the
traditional merging mechanism mostly unchanged. The merge base
computation can be done using the `parent` ancestry information
taken from the two toplevel project branch heads being merged,
and merging of the whole tree can be done with a three-way merge
of the whole tree using the merge base and two head commits.
For reasons described later, we would not merge the subproject
parts of the trees during this step, though.

When the two branch heads use different versions of a subproject,
things get a bit tricky. First, let's forget for a moment about
the case where they bind the same project at different locations.
We would refuse to merge unless both heads have the same number
of `bind` lines, binding something at the same subdirectories.

------------
$ git merge 'Merge in a side branch' HEAD side
error: the merged heads have subprojects bound at different places.
  ours:
    linux-2.6/
    appliance/
  theirs:
    kernel/
    gadget/
    manual/
------------

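The check behind that error message could be as simple as a set
comparison of the bound directories. The following is a sketch
under that assumption, working on the `bind` headers of the two
heads. The object names on the `theirs` side are placeholders,
and the helper names and the `ours-only` / `theirs-only` report
format are invented for illustration.

```shell
# Bind headers of the two heads being merged.
ours='bind 5b2bcc7b2d546c636f79490655b3347acc91d17f linux-2.6/
bind 0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/'
theirs='bind 3333333333333333333333333333333333333333 kernel/
bind 4444444444444444444444444444444444444444 gadget/
bind 5555555555555555555555555555555555555555 manual/'

# bind_dirs: read bind headers on stdin, print the bound directories sorted.
bind_dirs () {
    sed -n 's/^bind [0-9a-f]* //p' | sort
}

# compare_bind_points OURS THEIRS: print every directory bound on one side
# only, and fail if there is any such directory.
compare_bind_points () {
    tmp=$(mktemp -d) || return 2
    printf '%s\n' "$1" | bind_dirs >"$tmp/ours"
    printf '%s\n' "$2" | bind_dirs >"$tmp/theirs"
    {
        comm -23 "$tmp/ours" "$tmp/theirs" | sed 's/^/ours-only /'
        comm -13 "$tmp/ours" "$tmp/theirs" | sed 's/^/theirs-only /'
    } >"$tmp/report"
    cat "$tmp/report"
    if [ -s "$tmp/report" ]; then status=1; else status=0; fi
    rm -rf "$tmp"
    return "$status"
}

compare_bind_points "$ours" "$theirs" || echo "refusing to merge"
```

Only directories are compared here; differing commits bound at
the same directory are a separate question, taken up further
below.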
Such renaming can be handled by first moving the bind points in
our branch, and redoing the merge (this is a rare operation
anyway). It might go like this:

------------
$ git reset
$ git update-index --unbind linux-2.6/
$ git update-index --unbind appliance/
$ git update-index --bind $kernel_commit kernel/
$ git update-index --bind $gadget_commit gadget/
$ git commit -m 'Prepare for merge with side branch'
$ git merge 'Merge in a side branch' HEAD side
error: the merged heads have subprojects bound at different places.
  ours:
    kernel/
    gadget/
  theirs:
    kernel/
    gadget/
    manual/
------------

Their branch added another subproject, so this did not work (or
it could be the other way around -- we might have been the one
with the `manual/` subproject while they didn't have it). This
suggests that we may want an option to `git merge` to allow
taking a union of subprojects. Again, this is a rare operation,
and always taking a union would have created a toplevel project
that had both `kernel/` and `linux-2.6/` bound to the same Linux
kernel project, possibly of different vintages, so it would be
prudent to require the set of bound subprojects to match exactly
and give the user an option to take a union.

------------
$ git merge --union-subprojects 'Merge in a side branch' HEAD side
error: the subproject at 'kernel/' needs to be merged first.
------------

Here, the version of the Linux kernel project in the `side`
branch was different from what our branch had on our `bind`
line. On what kind of difference should we give this error?
Initially, I think we could require that one be a fast-forward
of the other (ours might be ahead of theirs, or the other way
around), and take the descendant.

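The fast-forward rule can be expressed with an ancestry test that
git already provides. The following is a minimal sketch, assuming
`git merge-base \--is-ancestor` is used for the test; the
`pick_descendant` helper, its error wording and the scratch
history built from empty trees are all made up for illustration.

```shell
# pick_descendant A B: print whichever commit is a fast-forward of the
# other, and fail when the two histories have diverged.
pick_descendant () {
    if git merge-base --is-ancestor "$1" "$2"; then
        echo "$2"
    elif git merge-base --is-ancestor "$2" "$1"; then
        echo "$1"
    else
        echo "error: neither commit is a fast-forward of the other" >&2
        return 1
    fi
}

# Scratch history: "ours" and "theirs" both build on "base".
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
cd "$(mktemp -d)"
git init -q .
empty_tree=$(git mktree </dev/null)
base=$(echo base | git commit-tree "$empty_tree")
ours=$(echo ours | git commit-tree "$empty_tree" -p "$base")
theirs=$(echo theirs | git commit-tree "$empty_tree" -p "$base")

# One side is ahead of the other: take the descendant.
picked=$(pick_descendant "$base" "$ours")
echo "$picked"

# Both sides moved: this is the case that needs a real subproject merge.
pick_descendant "$ours" "$theirs" || echo "needs to be merged first"
```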
Or we could do an independent merge of subproject heads, using
the `parent` ancestry of the bound subproject heads to find
their merge base and doing a three-way merge. This would leave
the merge result in the subproject part of the working tree and
the index.

[NOTE]
This is the reason we did not do the whole-tree three-way merge
earlier. The subproject commit bound to the merge base commit
used for the toplevel project may not be the merge base between
the subproject commits bound to the two toplevel project
commits.

So let's first deal with the case of merging only a subproject
part into our tree.


Merging subprojects
-------------------

An operation of more practical importance is to be able to merge
in changes made elsewhere to the projects bound to our toplevel
project.

------------
$ git pull --subproject=kernel/ git://git.kernel.org/.../linux-2.6/
------------

might do:

. fetch the current `HEAD` commit from Linus.
. find the subproject commit bound at the `kernel/` subtree.
. perform the usual three-way merge of these two commits, in
  the `kernel/` part of the working tree.

After that, `git commit` with the `\--subproject` option would be
needed to make a commit.

[NOTE]
This suggests that we would need to have something similar to
`MERGE_HEAD` for merging the subproject part. In the case of
merging two toplevel project commits, we probably can read the
`bind` lines from the `MERGE_HEAD` commit and either our `HEAD`
commit or our index file. Further, we probably would require
that the latter two match, just as we currently require that the
index file match our `HEAD` commit before `git merge`.

Just like the current `pull = fetch + merge` semantics, the
subproject-aware version `git pull \--subproject=frotz/` would be
a `git fetch \--subproject=frotz/` followed by a `git merge
\--subproject=frotz/`. So the above would be:

. Fetch the head.
+
------------
$ git fetch --subproject=kernel/ git://git.kernel.org/.../linux-2.6/
------------
+
which would fetch the commit chain from the remote repository, and
write something like this to `FETCH_HEAD`:
+
------------
3ee68c4...\tfor-merge-into kernel/\tbranch 'master' of git://.../linux-2.6
------------

. Run `git merge`.
+
------------
$ git merge --subproject=kernel/ \
    'Merge git://.../linux-2.6 into kernel/' HEAD 3ee68c4...
------------

. In case it does not cleanly automerge, `git merge` would write
  the necessary information for a later `git commit` to use in
  `MERGE_HEAD`. It may look like this:
+
------------
3ee68c4af3fd7228c1be63254b9f884614f9ebb2 kernel/
------------
+
Similarly, the `MERGE_MSG` file would hold the merge message.

With this, a later invocation of `git commit` to record the
result of hand resolving would be able to notice that:

. We should be first resolving the `kernel/` subproject, not the
  whole thing.
. The remote `HEAD` is the `3ee68c4\...` commit.
. The merge message is `Merge git://\.../linux-2.6 into kernel/`.

and would make a merge commit, and register that resulting
commit in the index file using `update-index \--bind` instead of
updating *any* branch head.


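The hand-off from `FETCH_HEAD` to `MERGE_HEAD` described in the
steps above is mostly a matter of reshuffling fields. The
following is a small sketch, assuming the tab-separated
`for-merge-into <directory>` form proposed above (released git
does not write such a field). The helper name is hypothetical,
and the full object name is the one shown in the `MERGE_HEAD`
example.

```shell
tab=$(printf '\t')

# One line in the proposed FETCH_HEAD format.
fetch_head_line="3ee68c4af3fd7228c1be63254b9f884614f9ebb2${tab}for-merge-into kernel/${tab}branch 'master' of git://.../linux-2.6"

# merge_head_from_fetch_head: read one proposed FETCH_HEAD line on stdin
# and print "<commit> <directory>", the form shown for MERGE_HEAD. Lines
# without a for-merge-into field are rejected.
merge_head_from_fetch_head () {
    IFS="$tab" read -r commit note description || return 1
    case "$note" in
    'for-merge-into '?*)
        echo "$commit ${note#for-merge-into }"
        ;;
    *)
        return 1
        ;;
    esac
}

merge_head=$(printf '%s\n' "$fetch_head_line" | merge_head_from_fetch_head)
echo "$merge_head"
```

A later `git commit` could then tell a subproject merge from a
toplevel one by checking whether the `MERGE_HEAD` line carries a
directory after the object name.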
Management of Subprojects
-------------------------

While the above mechanism would support version controlling
subprojects as part of *one* larger toplevel project, it
is probably worth pointing out that having a separate repository
to manage the subproject independently would be a good idea.
The same subproject can be incorporated into more than one
toplevel project, and after all, a subproject should be
something that can stand on its own. In our example scenario,
the `kernel/` project is used as a subproject for the "gadget"
product, but at the same time, the organization that runs the
"gadget" project may use Linux on its development machines,
and have its own kernel hackers, not necessarily related to
the use of the kernel in the "gadget" product.

What this suggests is that not only do we need to be able to
pull the kernel development history *into* the subproject of the
"gadget" project, but we also need to be able to push the
development history of the kernel part alone *out* *of* the
"gadget" project to another repository that deals only with the
kernel part.

It might go this way. First the setup:

------------
$ git clone git://git.kernel.org/.../linux-2.6 Linux
$ ls -dF *
cloned/  combined/  current/  Linux/
------------

That is, in addition to the `combined/` repository, which we
have been using to develop the "gadget" product, we now have a
repository for the kernel, cloned from Linus. In the previous
section, we outlined how we update the kernel subproject part of
the `combined/` repository from the `kernel.org` repository. The
same procedure would work for pulling from the `Linux/`
repository here.

We are now going the other way: propagating the kernel work done
in the "gadget" project repository `combined/` back to `Linux/`.
We might do this at the lowest level:

------------
$ cd combined
$ git cat-file commit HEAD |
    sed -ne 's|^bind \([0-9a-f]*\) kernel/$|\1|p' >.git/refs/heads/linux26
$ git push ../Linux linux26:master
------------

Or, more realistically, since the `Linux` project might already
have its own commits on its `master`:

------------
$ cd Linux
$ git pull ../combined linux26
------------

Either way we would need an easy way to maintain the `linux26`
branch in the above example, and that will have to be part of
the wrapper scripts like `git commit` (more likely, that would
be a job for `git commit \--subproject`) for usability's
sake; in other words, the `cat-file commit` piped to `sed` above
is not something the end user would do, but something that is
done by the wrapper scripts.

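A wrapper script doing that extraction would probably want to be
stricter than the `sed` one-liner, which silently produces an
empty ref file when nothing is bound at `kernel/`. The following
is a sketch of such a lookup, assuming the proposed `bind` header
format and directory names without whitespace; the `bound_commit`
helper and the sample commit text are hypothetical.

```shell
# Hypothetical toplevel commit after the rename to kernel/ and gadget/.
# The log message contains a decoy line that must not be picked up.
commit_text='tree 04803b09c300c8325258ccf2744115acc4c57067
bind 5b2bcc7b2d546c636f79490655b3347acc91d17f kernel/
bind 0bdd79af62e8621359af08f0afca0ce977348ac7 gadget/
author Junio C Hamano <junio@kernel.org> 1137965565 -0800
committer Junio C Hamano <junio@kernel.org> 1137965565 -0800

Rename the bind points

bind 0000000000000000000000000000000000000000 kernel/'

# bound_commit DIRECTORY: read commit text on stdin and print the commit
# bound at DIRECTORY. Fails unless exactly one bind header matches.
bound_commit () {
    awk -v dir="$1" '
        /^$/ { exit }    # headers end at the first blank line
        $1 == "bind" && $3 == dir { commit = $2; matches++ }
        END { if (matches == 1) print commit; else exit 1 }
    '
}

kernel_head=$(printf '%s\n' "$commit_text" | bound_commit kernel/)
echo "$kernel_head"
```

In a real repository the input would come from
`git cat-file commit HEAD`, and the result could be stored with
`git update-ref refs/heads/linux26` rather than by writing the
ref file directly.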
Hopefully the people who work in the `Linux/` repository would
run `format-patch` and feed their changes back to the kernel
community.