parent
18ea0bf72c
commit
725ca8a8e7
|
|
@ -0,0 +1,11 @@
|
|||
all:
|
||||
|
||||
clean:
|
||||
	rm -f Subpro.html
|
||||
|
||||
|
||||
all: Subpro.html
|
||||
|
||||
%.html: %.txt
|
||||
	asciidoc -bxhtml11 $*.txt
|
||||
|
||||
|
|
@ -0,0 +1,472 @@
|
|||
Notes on Subproject Support
|
||||
===========================
|
||||
Junio C Hamano
|
||||
|
||||
Scenario
|
||||
--------
|
||||
|
||||
The examples in the following discussion show how this proposal
|
||||
is meant to help in the following scenario:
|
||||
|
||||
. A project to build an embedded Linux appliance "gadget" is
|
||||
maintained with git.
|
||||
|
||||
. The project uses linux-2.6 kernel as its subcomponent. It
|
||||
starts from a particular version of the mainline kernel, but
|
||||
adds its own code and build infrastructure to fit the
|
||||
appliance's needs.
|
||||
|
||||
. The working tree of the project is laid out this way:
|
||||
+
|
||||
------------
|
||||
Makefile - Builds the whole thing.
|
||||
linux-2.6/ - The kernel, perhaps modified for the project.
|
||||
appliance/ - Applications that run on the appliance, and
|
||||
other bits.
|
||||
------------
|
||||
|
||||
. The project is willing to maintain its own changes out of tree
|
||||
of the Linux kernel project, but would want to be able to feed
|
||||
the changes upstream, and incorporate upstream changes to its
|
||||
own tree, taking advantage of the fact that both itself and
|
||||
the Linux kernel project are version controlled with git.
|
||||
|
||||
. To make the story a bit more interesting, later in the history
|
||||
of development, `linux-2.6/` and `appliance/` directories will
|
||||
be renamed to `kernel/` and `gadget/`.
|
||||
|
||||
The idea here is to:
|
||||
|
||||
. Keep `linux-2.6/` part as an independent project. The work by
|
||||
the project on the kernel part can be naturally exchanged with
|
||||
the other kernel developers this way. Specifically, a tree
|
||||
object contained in commit objects belonging to this project
|
||||
does *not* have `linux-2.6/` directory at the top.
|
||||
|
||||
. Keep the `appliance/` part as another independent project.
|
||||
Applications are supposed to be more or less independent from
|
||||
the kernel version, but some other bits might be tied to a
|
||||
specific kernel version. Again, a tree object contained in
|
||||
commit objects belonging to this project does *not* have
|
||||
`appliance/` directory at the top.
|
||||
|
||||
. Have another project that combines the whole thing together,
|
||||
so that the project can keep track of which versions of the
|
||||
parts are built together.
|
||||
|
||||
We will call the project that binds things together the
|
||||
'toplevel project'. Other projects that hold `linux-2.6/` part
|
||||
and `appliance/` part are called 'subprojects'.
|
||||
|
||||
|
||||
Setting up
|
||||
----------
|
||||
|
||||
Let's say we have been working on the appliance software,
|
||||
independently version controlled with git. Also the kernel part
|
||||
has been version controlled separately, like this:
|
||||
------------
|
||||
$ ls -dF current/*/.git current/*
|
||||
current/Makefile current/appliance/.git/ current/linux-2.6/.git/
|
||||
current/appliance/ current/linux-2.6/
|
||||
------------
|
||||
|
||||
Now we would want to get a combined project. First we would
|
||||
clone from these repositories (which is not strictly needed --
|
||||
we could use `$GIT_ALTERNATE_OBJECT_DIRECTORIES` instead):
|
||||
|
||||
------------
|
||||
$ mkdir combined && cd combined
|
||||
$ cp ../current/Makefile .
|
||||
$ git init-db
|
||||
$ mkdir -p .git/refs/subs/{kernel,gadget}/{heads,tags}
|
||||
$ kernel_commit=$(git clone-pack ../current/linux-2.6/ master | awk 'NR==1 {print $1}')
|
||||
$ gadget_commit=$(git clone-pack ../current/appliance/ master | awk 'NR==1 {print $1}')
|
||||
------------
|
||||
|
||||
We will introduce a new command to set up a combined project:
|
||||
|
||||
------------
|
||||
$ git bind-projects \
|
||||
$kernel_commit linux-2.6/ \
|
||||
$gadget_commit appliance/
|
||||
------------
|
||||
|
||||
This would probably do an equivalent of:
|
||||
|
||||
------------
|
||||
$ rm -f "$GIT_DIR/index"
|
||||
$ git read-tree --prefix=linux-2.6/ $kernel_commit
|
||||
$ git read-tree --prefix=appliance/ $gadget_commit
|
||||
$ git update-index --bind linux-2.6/ $kernel_commit
|
||||
$ git update-index --bind appliance/ $gadget_commit
|
||||
------------
|
||||
[NOTE]
|
||||
============
|
||||
Earlier outlines sent to the git mailing list talked
|
||||
about `$GIT_DIR/bind` to record which subprojects are bound to
|
||||
which subtree in the current working tree and index. This
|
||||
proposal instead records that information in the index file
|
||||
with `update-index --bind` command.
|
||||
|
||||
Also note that in this round of the proposal, there are no separate
|
||||
branches that keep track of heads of subprojects.
|
||||
============
|
||||
|
||||
Let's not forget to add the `Makefile`, and check the whole
|
||||
thing out from the index file.
|
||||
------------
|
||||
$ git add Makefile
|
||||
$ git checkout-index -f -u -q -a
|
||||
------------
|
||||
|
||||
Now our directory should be identical with the `current`
|
||||
directory. After making sure of that, we should be able to
|
||||
commit the whole thing:
|
||||
|
||||
------------
|
||||
$ diff -x .git -r ../current ../combined
|
||||
$ git commit -m 'Initial toplevel project commit'
|
||||
------------
|
||||
|
||||
Which should create a new commit object that records what is in
|
||||
the index file as its tree, with `bind` lines to record which
|
||||
subproject commit objects are bound at what subdirectory, and
|
||||
updates the `$GIT_DIR/refs/heads/master`. Such a commit object
|
||||
might look like this:
|
||||
------------
|
||||
tree 04803b09c300c8325258ccf2744115acc4c57067
|
||||
bind 5b2bcc7b2d546c636f79490655b3347acc91d17f linux-2.6/
|
||||
bind 0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/
|
||||
author Junio C Hamano <junio@kernel.org> 1137965565 -0800
|
||||
committer Junio C Hamano <junio@kernel.org> 1137965565 -0800
|
||||
|
||||
Initial toplevel project commit
|
||||
------------
|
||||
|
||||
Notice that `Makefile` at the top is part of the toplevel
|
||||
project in this example, but it is not necessary. We could
|
||||
instead have the appliance subproject include this file. In
|
||||
such a setup, the appliance subproject would have had `Makefile`
|
||||
and `appliance/` directory at the toplevel. The `bind` line for
|
||||
that project would have said "the rest is bound at `/`" and
|
||||
`write-tree \--exclude=linux-2.6/` would have been used to write
|
||||
the tree for that subproject out of the combined index.
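
To make the layout of such a commit object concrete, here is a minimal sketch that lists the bindings recorded in a commit header. It assumes the `bind` header format proposed in this note (which is not something git writes today), assumes bound paths contain no whitespace, and uses a hypothetical helper name, `list_binds`:

```shell
# Print "<commit> <path>" for each bind line in the header of a commit
# object read from standard input.  The "bind" header is the format
# proposed in this note; list_binds is a hypothetical helper name.
list_binds () {
	awk '
		/^$/ { exit }                    # the header ends at the first blank line
		$1 == "bind" { print $2, $3 }    # assumes paths without whitespace
	'
}

# Feed it the example commit object shown above.
list_binds <<\EOF
tree 04803b09c300c8325258ccf2744115acc4c57067
bind 5b2bcc7b2d546c636f79490655b3347acc91d17f linux-2.6/
bind 0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/
author Junio C Hamano <junio@kernel.org> 1137965565 -0800
committer Junio C Hamano <junio@kernel.org> 1137965565 -0800

Initial toplevel project commit
EOF
```

In a repository that recorded such headers, the input would come from `git cat-file commit HEAD` instead of the here-document.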
|
||||
|
||||
|
||||
Making further commits
|
||||
----------------------
|
||||
|
||||
The easiest case is when you updated the Makefile without
|
||||
changing anything in the subprojects. In such a case, we just
|
||||
need to create a new commit object that records the new tree
|
||||
with the current `HEAD` as its parent, and with the same set of
|
||||
`bind` lines.
|
||||
|
||||
When we have changes to the subproject part, we would make a
|
||||
separate commit to the subproject part and then record the whole
|
||||
thing by making a commit to the toplevel project. The user
|
||||
interaction might go this way:
|
||||
------------
|
||||
$ git commit
|
||||
error: you have changes to the subproject bound at linux-2.6/.
|
||||
$ git commit --subproject linux-2.6/
|
||||
$ git commit
|
||||
------------
|
||||
|
||||
With the new `\--subproject` option, the directory structure
|
||||
rooted at `linux-2.6/` part is written out as a tree, and a new
|
||||
commit object that records that tree object with the commit
|
||||
bound to that portion of the tree (`5b2bcc7b` in the above
|
||||
example) as its parent is created. Then the final `git commit`
|
||||
would record the whole tree with updated `bind` line for the
|
||||
`linux-2.6/` part.
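
The last step, recording an updated `bind` line, can be pictured as a small text transformation on the commit header. The following is only a sketch under the assumptions of this note: the `bind` header is the proposed format, `rebind` is a hypothetical helper name, and a real implementation would also write a new `tree` line and a `parent` line rather than edit an old header in place:

```shell
# rebind PATH NEW_COMMIT
# Copy a commit object from stdin to stdout, replacing the commit named
# on the bind line for PATH.  Only header lines (before the first blank
# line) are considered; the commit message is passed through untouched.
# Hypothetical helper; the "bind" header is the proposal in this note.
rebind () {
	awk -v path="$1" -v new="$2" '
		in_body { print; next }
		/^$/ { in_body = 1; print; next }
		$1 == "bind" && $3 == path { print "bind", new, path; next }
		{ print }
	'
}
```

For example, piping the commit object shown earlier through `rebind linux-2.6/ <new-commit>` would leave the `appliance/` binding alone and change only the `linux-2.6/` one.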
|
||||
|
||||
|
||||
Checking out
|
||||
------------
|
||||
|
||||
After cloning such a toplevel project, `git clone` without `-n`
|
||||
option would check out the working tree. This is done by
|
||||
reading the tree object recorded in the commit object (which
|
||||
records the whole thing), and adding the information from the
|
||||
"bind" line to the index file.
|
||||
|
||||
------------
|
||||
$ cd ..
|
||||
$ git clone -n combined cloned ;# clone the one we created earlier
|
||||
$ cd cloned
|
||||
$ git checkout
|
||||
------------
|
||||
|
||||
This round of the proposal does not maintain separate branch heads
|
||||
for subprojects. The bound commits and their subdirectories
|
||||
are recorded in the index file from the commit object, so there
|
||||
is no need to do anything other than updating the index and the
|
||||
working tree.
|
||||
|
||||
|
||||
Switching branches
|
||||
------------------
|
||||
|
||||
Along with the traditional two-way merge by `read-tree -m -u`,
|
||||
we would need to look at:
|
||||
|
||||
. `bind` lines in the current `HEAD` commit.
|
||||
|
||||
. `bind` lines in the commit we are switching to.
|
||||
|
||||
. subproject binding information in the index file.
|
||||
|
||||
to make sure we do sensible things.
|
||||
|
||||
Just as, until very recently, we did not allow switching
|
||||
branches when two-way merge would lose local changes, we can
|
||||
start by refusing to switch branches when the subprojects bound
|
||||
in the index do not match what is recorded in the `HEAD` commit.
|
||||
|
||||
Because in this round of the proposal we do not use the
|
||||
`$GIT_DIR/bind` file nor separate branches to keep track of
|
||||
heads of the subprojects, there is nothing else other than the
|
||||
working tree and the index file that needs to be updated when
|
||||
switching branches.
|
||||
|
||||
|
||||
Merging
|
||||
-------
|
||||
|
||||
Merging two branches of the toplevel project can use the
|
||||
traditional merging mechanism mostly unchanged. The merge base
|
||||
computation can be done using the `parent` ancestry information
|
||||
taken from the two toplevel project branch heads being merged,
|
||||
and merging of the whole tree can be done with a three-way merge
|
||||
of the whole tree using the merge base and two head commits.
|
||||
For reasons described later, we would not merge the subproject
|
||||
parts of the trees during this step, though.
|
||||
|
||||
When the two branch heads use different versions of a subproject,
|
||||
things get a bit tricky. First, let's forget for a moment about
|
||||
the case where they bind the same project at different locations.
|
||||
We would refuse if they do not have the same number of `bind`
|
||||
lines that bind something at the same subdirectories.
|
||||
|
||||
------------
|
||||
$ git merge 'Merge in a side branch' HEAD side
|
||||
error: the merged heads have subprojects bound at different places.
|
||||
ours:
|
||||
linux-2.6/
|
||||
appliance/
|
||||
theirs:
|
||||
kernel/
|
||||
gadget/
|
||||
manual/
|
||||
------------
|
||||
|
||||
Such renaming can be handled by first moving the bind points in
|
||||
our branch, and redoing the merge (this is a rare operation
|
||||
anyway). It might go like this:
|
||||
|
||||
------------
|
||||
$ git reset
|
||||
$ git update-index --unbind linux-2.6/
|
||||
$ git update-index --unbind appliance/
|
||||
$ git update-index --bind kernel/ $kernel_commit
|
||||
$ git update-index --bind gadget/ $gadget_commit
|
||||
$ git commit -m 'Prepare for merge with side branch'
|
||||
$ git merge 'Merge in a side branch' HEAD side
|
||||
error: the merged heads have subprojects bound at different places.
|
||||
ours:
|
||||
kernel/
|
||||
gadget/
|
||||
theirs:
|
||||
kernel/
|
||||
gadget/
|
||||
manual/
|
||||
------------
|
||||
|
||||
Their branch added another subproject, so this did not work (or
|
||||
it could be the other way around -- we might have been the one
|
||||
with `manual/` subproject while they didn't). This suggests
|
||||
that we may want an option to `git merge` to allow taking a
|
||||
union of subprojects. Again, this is a rare operation, and
|
||||
always taking a union would have created a toplevel project that
|
||||
had both `kernel/` and `linux-2.6/` bound to the same Linux
|
||||
kernel project from possibly different vintage, so it would be
|
||||
prudent to require the set of bound subprojects to exactly match
|
||||
and give the user an option to take a union.
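
The policy described here (refuse unless both heads bind the same set of paths, with an explicit opt-in for taking the union) could be sketched as follows. This is an illustration only: it works on commit objects saved as text files, assumes the proposed `bind` header and paths without whitespace, and the names `bind_paths` and `check_binds` are hypothetical:

```shell
# bind_paths: print the sorted bound paths found in the header of the
# commit object on stdin.  Hypothetical helper for this sketch.
bind_paths () {
	awk '/^$/ { exit } $1 == "bind" { print $3 }' | sort
}

# check_binds OURS THEIRS [union]
# OURS and THEIRS are files holding commit objects as text.  Succeeds and
# prints the resulting set of bound paths when both sides bind the same
# paths, or when "union" is given; otherwise reports an error and fails.
check_binds () {
	ours=$(bind_paths <"$1")
	theirs=$(bind_paths <"$2")
	if [ "$ours" = "$theirs" ]; then
		[ -z "$ours" ] || printf '%s\n' "$ours"
		return 0
	fi
	if [ "${3-}" = union ]; then
		{ bind_paths <"$1"; bind_paths <"$2"; } | sort -u
		return 0
	fi
	echo "error: the merged heads have subprojects bound at different places." >&2
	return 1
}
```

Note that this only compares where things are bound. Whether the commits bound at a common path can be reconciled is a separate question.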
|
||||
|
||||
------------
|
||||
$ git merge --union-subprojects 'Merge in a side branch' HEAD side
|
||||
error: the subproject at 'kernel/' needs to be merged first.
|
||||
------------
|
||||
|
||||
Here, the version of the Linux kernel project in the `side`
|
||||
branch was different from what our branch had on our `bind`
|
||||
line. On what kind of difference should we give this error?
|
||||
Initially, I think we could require that one be a fast-forward of
|
||||
the other (ours might be ahead of theirs, or the other way
|
||||
around), and take the descendant.
|
||||
|
||||
Or we could do an independent merge of the subproject heads, using
|
||||
the `parent` ancestry of the bound subproject heads to find
|
||||
their merge-base and doing a three-way merge. This would leave
|
||||
the merge result in the subproject part of the working tree and
|
||||
the index.
|
||||
|
||||
[NOTE]
|
||||
This is the reason we did not do the whole-tree three way merge
|
||||
earlier. The subproject commit bound to the merge base commit
|
||||
used for the toplevel project may not be the merge base between
|
||||
the subproject commits bound to the two toplevel project
|
||||
commits.
|
||||
|
||||
So let's deal with the case of merging only a subproject part into
|
||||
our tree first.
|
||||
|
||||
|
||||
Merging subprojects
|
||||
-------------------
|
||||
|
||||
An operation of more practical importance is to be able to merge
|
||||
in changes done outside to the projects bound to our toplevel
|
||||
project.
|
||||
|
||||
------------
|
||||
$ git pull --subproject=kernel/ git://git.kernel.org/.../linux-2.6/
|
||||
------------
|
||||
|
||||
might do:
|
||||
|
||||
. fetch the current `HEAD` commit from Linus.
|
||||
. find the subproject commit bound at kernel/ subtree.
|
||||
. perform the usual three-way merge of these two commits, in
|
||||
`kernel/` part of the working tree.
|
||||
|
||||
After that, `git commit \--subproject` option would be needed to
|
||||
make a commit.
|
||||
|
||||
[NOTE]
|
||||
This suggests that we would need to have something similar to
|
||||
`MERGE_HEAD` for merging the subproject part. In the case of
|
||||
merging two toplevel project commits, we probably can read the
|
||||
`bind` lines from the `MERGE_HEAD` commit and either our `HEAD`
|
||||
commit or our index file. Further, we probably would require
|
||||
that the latter two must match, just as we currently require the
|
||||
index file matches our `HEAD` commit before `git merge`.
|
||||
|
||||
Just like the current `pull = fetch + merge` semantics, the
|
||||
subproject aware version `git pull \--subproject=frotz/` would be
|
||||
a `git fetch \--subproject=frotz/` followed by a `git merge
|
||||
\--subproject=frotz/`. So the above would be:
|
||||
|
||||
. Fetch the head.
|
||||
+
|
||||
------------
|
||||
$ git fetch --subproject=kernel/ git://git.kernel.org/.../linux-2.6/
|
||||
------------
|
||||
+
|
||||
which would fetch the commit chain from the remote repository, and
|
||||
write something like this to `FETCH_HEAD`:
|
||||
+
|
||||
------------
|
||||
3ee68c4...\tfor-merge-into kernel/\tbranch 'master' of git://.../linux-2.6
|
||||
------------
|
||||
|
||||
. Run `git merge`.
|
||||
+
|
||||
------------
|
||||
$ git merge --subproject=kernel/ \
|
||||
'Merge git://.../linux-2.6 into kernel/' HEAD 3ee68c4...
|
||||
------------
|
||||
|
||||
. In case it does not cleanly automerge, `git merge` would write
|
||||
the necessary information for a later `git commit` to use in
|
||||
`MERGE_HEAD`. It may look like this:
|
||||
+
|
||||
------------
|
||||
3ee68c4af3fd7228c1be63254b9f884614f9ebb2 kernel/
|
||||
------------
|
||||
+
|
||||
Similarly, `MERGE_MSG` file will hold the merge message.
|
||||
|
||||
With this, a later invocation of `git commit` to record the
|
||||
result of hand resolving would be able to notice that:
|
||||
|
||||
. We should be first resolving `kernel/` subproject, not the
|
||||
whole thing.
|
||||
. The remote `HEAD` is `3ee68c4\...` commit.
|
||||
. The merge message is `Merge git://\.../linux-2.6 into kernel/`.
|
||||
|
||||
and would make a merge commit, and register that resulting
|
||||
commit in the index file using `update-index \--bind` instead of
|
||||
updating *any* branch head.
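
How `git commit` might tell the two kinds of merge apart can be sketched by looking at the first line of `MERGE_HEAD`. The sketch assumes the extended format suggested above (a commit name optionally followed by the bound path) and uses a hypothetical helper name, `merge_kind`:

```shell
# merge_kind: read the first line of a MERGE_HEAD-style file on stdin
# and report which kind of merge is in progress.  A second field is
# taken to be the subproject path, as in the format suggested above.
# Hypothetical helper; fails if the input is empty.
merge_kind () {
	read -r commit path || [ -n "$commit" ] || return 1
	if [ -n "$path" ]; then
		echo "subproject $path $commit"
	else
		echo "toplevel $commit"
	fi
}
```

A wrapper would run something like `merge_kind <"$GIT_DIR/MERGE_HEAD"` and, for the `subproject` case, commit only that part and rebind it in the index instead of touching a branch head.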
|
||||
|
||||
|
||||
Management of Subprojects
|
||||
-------------------------
|
||||
|
||||
While the above as a mechanism would support version controlling
|
||||
of subprojects as a part of *one* larger toplevel project, it
|
||||
probably is worth pointing out that having a separate repository
|
||||
to manage the subproject independently would be a good idea.
|
||||
The same subproject can be incorporated into more than one
|
||||
toplevel project, and after all, a subproject should be
|
||||
something that can stand on its own. In our example scenario,
|
||||
the `kernel/` project is used as a subproject for the "gadget"
|
||||
product, but at the same time, the organization that runs the
|
||||
"gadget" project may use Linux on their development machines,
|
||||
and have their own kernel hackers, not necessarily related to
|
||||
the use of the kernel in the "gadget" product.
|
||||
|
||||
What this suggests is that not only do we need to be able to pull
|
||||
the kernel development history *into* the subproject of the
|
||||
"gadget" project, but also we need to be able to push the
|
||||
development history of the kernel part alone *out* *of* the
|
||||
"gadget" project to another repository that deals only with the
|
||||
kernel part.
|
||||
|
||||
It might go this way. First the setup:
|
||||
|
||||
------------
|
||||
$ git clone git://git.kernel.org/.../linux-2.6 Linux
|
||||
$ ls -dF *
|
||||
cloned/ combined/ current/ Linux/
|
||||
------------
|
||||
|
||||
That is, in addition to the `combined/` which we have been using
|
||||
to develop the "gadget" product in, we now have a repository for
|
||||
the kernel, cloned from Linus. In the previous section, we have
|
||||
outlined how we update the kernel subproject part of `combined/`
|
||||
repository from the `kernel.org` repository. The same procedure
|
||||
would work for pulling from `Linux/` repository here.
|
||||
|
||||
We are now going the other way: propagating the kernel work done
|
||||
in the "gadget" project repository `combined/` back to `Linux/`.
|
||||
We might do this at the lowest level:
|
||||
|
||||
------------
|
||||
$ cd combined
|
||||
$ git cat-file commit HEAD |
|
||||
sed -ne 's|^bind \([0-9a-f]*\) kernel/$|\1|p' >.git/refs/heads/linux26
|
||||
$ git push ../Linux linux26:master
|
||||
------------
|
||||
|
||||
Or, more realistically, since the `Linux` project might already
|
||||
have its own commits on its `master`:
|
||||
|
||||
------------
|
||||
$ cd Linux
|
||||
$ git pull ../combined linux26
|
||||
------------
|
||||
|
||||
Either way we would need an easy way to maintain the `linux26`
|
||||
branch in the above example, and that will have to be part of
|
||||
the wrapper scripts like `git commit` (more likely, that would
|
||||
be a job for `git commit \--subproject`) for usability's
|
||||
sake; in other words, the `cat-file commit` piped to `sed` above
|
||||
is not something the end user would do, but something that is
|
||||
done by the wrapper scripts.
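
As a rough idea of what such a wrapper might contain, the lookup could be written as a small function. This is a sketch only, assuming the proposed `bind` header and a hypothetical helper name, `bound_commit`. It compares the path as a plain string, so a `.` in a path such as `linux-2.6/` is not treated as a regular-expression wildcard, and it fails when nothing is bound at the given path:

```shell
# bound_commit PATH
# Print the commit bound at PATH according to the bind lines in the
# header of the commit object on stdin.  Exits non-zero if no bind line
# names PATH.  Hypothetical helper; "bind" is the proposed header.
bound_commit () {
	awk -v path="$1" '
		/^$/ { exit }                      # stop at the end of the header
		$1 == "bind" && $3 == path { print $2; found = 1; exit }
		END { exit !found }
	'
}
```

With this, the pipeline above would become `git cat-file commit HEAD | bound_commit kernel/ >.git/refs/heads/linux26`, and the wrapper could stop early when the function fails instead of writing an empty ref.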
|
||||
|
||||
Hopefully the people who work in `Linux/` repository would run
|
||||
`format-patch` and feed their changes back to the kernel
|
||||
community.
|
||||