From 71b785f8e26191ff928988d245d4b22e89118ba1 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Sun, 4 Mar 2007 22:56:00 -0800 Subject: [PATCH] Remove old subproject design notes. --- Makefile | 11 -- Subpro.txt | 482 ----------------------------------------------------- 2 files changed, 493 deletions(-) delete mode 100644 Makefile delete mode 100644 Subpro.txt diff --git a/Makefile b/Makefile deleted file mode 100644 index 32c8bd8573..0000000000 --- a/Makefile +++ /dev/null @@ -1,11 +0,0 @@ -all: - -clean: - rm -f Subpro.html - - -all: Subpro.html - -%.html: %.txt - asciidoc -bxhtml11 $*.txt - diff --git a/Subpro.txt b/Subpro.txt deleted file mode 100644 index 713b72eea2..0000000000 --- a/Subpro.txt +++ /dev/null @@ -1,482 +0,0 @@ -Notes on Subproject Support -=========================== -Junio C Hamano - -Scenario --------- - -The examples in the following discussion show how this proposal -plans to help this: - -. A project to build an embedded Linux appliance "gadget" is - maintained with git. - -. The project uses linux-2.6 kernel as its subcomponent. It - starts from a particular version of the mainline kernel, but - adds its own code and build infrastructure to fit the - appliance's needs. - -. The working tree of the project is laid out this way: -+ ------------- - Makefile - Builds the whole thing. - linux-2.6/ - The kernel, perhaps modified for the project. - appliance/ - Applications that run on the appliance, and - other bits. ------------- - -. The project is willing to maintain its own changes out of tree - of the Linux kernel project, but would want to be able to feed - the changes upstream, and incorporate upstream changes to its - own tree, taking advantage of the fact that both itself and - the Linux kernel project are version controlled with git. - -. To make the story a bit more interesting, later in the history - of development, `linux-2.6/` and `appliance/` directories will - be renamed to `kernel/` and `gadget/`. 
- -The idea here is to: - -. Keep `linux-2.6/` part as an independent project. The work by - the project on the kernel part can be naturally exchanged with - the other kernel developers this way. Specifically, a tree - object contained in commit objects belonging to this sub-project - does *not* have `linux-2.6/` directory at the top. - -. Keep the `appliance/` part as another independent project. - Applications are supposed to be more or less independent from - the kernel version, but some other bits might be tied to a - specific kernel version. Again, a tree object contained in - commit objects belonging to this sub-project does *not* have - `appliance/` directory at the top. - -. Have another project that combines the whole thing together, - so that the project can keep track of which versions of the - parts are built together. The Makefile is illustrated above, - but there might be other files and directories. - -We will call the project that binds things together the -'toplevel project'. Other projects that hold `linux-2.6/` part -and `appliance/` part are called 'subprojects'. - - -Setting up ----------- - -Let's say we have been working on the appliance software, -independently version controlled with git. Also the kernel part -has been version controlled separately, like this: ------------- -$ ls -dF current/*/.git current/* -current/Makefile current/appliance/.git/ current/linux-2.6/.git/ -current/appliance/ current/linux-2.6/ ------------- - -Now we would want to get a combined project. First we would -clone from these repositories (which is not strictly needed -- -we could use `$GIT_ALTERNATE_OBJECT_DIRECTORIES` instead): - ------------- -$ mkdir combined && cd combined -$ cp ../current/Makefile . 
-$ git init-db -$ mkdir -p .git/refs/subs/{kernel,gadget}/{heads,tags} -$ git clone-pack ../current/linux-2.6/ master | read kernel_commit junk -$ git clone-pack ../current/appliance/ master | read gadget_commit junk ------------- - -We will introduce a new command to set up a combined project: - ------------- -$ git bind-projects \ - $kernel_commit linux-2.6/ \ - $gadget_commit appliance/ ------------- - -This would probably do an equivalent of: - ------------- -$ rm -f "$GIT_DIR/index" -$ git read-tree --prefix=linux-2.6/ $kernel_commit -$ git read-tree --prefix=appliance/ $gadget_commit -$ git update-index --bind linux-2.6/ $kernel_commit -$ git update-index --bind appliance/ $gadget_commit ------------- -[NOTE] -============ -Earlier outlines sent to the git mailing list talked -about `$GIT_DIR/bind` to record what subproject are bound to -which subtree in the current working tree and index. This -proposal instead records that information in the index file -with `update-index --bind` command. - -Also note that in this round of proposal, there is no separate -branches that keep track of heads of subprojects. - -`update-index --bind` is not implemented on the core side yet; -it would involve backward incompatible changes to the index -format. -============ - -Let's not forget to add the `Makefile`, and check the whole -thing out from the index file. ------------- -$ git add Makefile -$ git checkout-index -f -u -q -a ------------- - -Now our directory should be identical with the `current` -directory. After making sure of that, we should be able to -commit the whole thing: - ------------- -$ diff -x .git -r ../current ../combined -$ git commit -m 'Initial toplevel project commit' ------------- - -Which should create a new commit object that records what is in -the index file as its tree, with `bind` lines to record which -subproject commit objects are bound at what subdirectory, and -updates the `$GIT_DIR/refs/heads/master`. 
Such a commit object -might look like this: ------------- -tree 04803b09c300c8325258ccf2744115acc4c57067 -bind 5b2bcc7b2d546c636f79490655b3347acc91d17f linux-2.6/ -bind 0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/ -author Junio C Hamano 1137965565 -0800 -committer Junio C Hamano 1137965565 -0800 - -Initial toplevel project commit ------------- - -Notice that `Makefile` at the top is part of the toplevel -project in this example, but it is not necessary. We could -instead have the appliance subproject include this file. In -such a setup, the appliance subproject would have had `Makefile` -and `appliance/` directory at the toplevel. The `bind` line for -that project would have said "the rest is bound at `/`" and -`write-tree \--exclude=linux-2.6/` would have been used to write -the tree for that subproject out of the combined index. - - -Making further commits ----------------------- - -The easiest case is when you updated the Makefile without -changing anything in the subprojects. In such a case, we just -need to create a new commmit object that records the new tree -with the current `HEAD` as its parent, and with the same set of -`bind` lines. - -When we have changes to the subproject part, we would make a -separate commit to the subproject part and then record the whole -thing by making a commit to the toplevel project. The user -interaction might go this way: ------------- -$ git commit -error: you have changes to the subproject bound at linux-2.6/. -$ git commit --subproject linux-2.6/ -$ git commit ------------- - -With the new `\--subproject` option, the directory structure -rooted at `linux-2.6/` part is written out as a tree, and a new -commit object that records that tree object with the commit -bound to that portion of the tree (`5b2bcc7b` in the above -example) as its parent is created. Then the final `git commit` -would record the whole tree with updated `bind` line for the -`linux-2.6/` part. 
- - -Checking out ------------- - -After cloning such a toplevel project, `git clone` without `-n` -option would check out the working tree. This is done by -reading the tree object recorded in the commit object (which -records the whole thing), and adding the information from the -"bind" line to the index file. - ------------- -$ cd .. -$ git clone -n combined cloned ;# clone the one we created earlier -$ cd cloned -$ git checkout ------------- - -This round of proposal does not maintain separate branch heads -for subprojects. The bound commits and their subdirectories -are recorded in the index file from the commit object, so there -is no need to do anything other than updating the index and the -working tree. - - -Switching branches ------------------- - -Along with the traditional two-way merge by `read-tree -m -u`, -we would need to look at: - -. `bind` lines in the current `HEAD` commit. - -. `bind` lines in the commit we are switching to. - -. subproject binding information in the index file. - -to make sure we do sensible things. - -Just like until very recently we did not allow switching -branches when two-way merge would lose local changes, we can -start by refusing to switch branches when the subprojects bound -in the index do not match what is recorded in the `HEAD` commit. - -Because in this round of the proposal we do not use the -`$GIT_DIR/bind` file nor separate branches to keep track of -heads of the subprojects, there is nothing else other than the -working tree and the index file that needs to be updated when -switching branches. - - -Merging -------- - -Merging two branches of the toplevel projects can use the -traditional merging mechanism mostly unchanged. The merge base -computation can be done using the `parent` ancestry information -taken from the two toplevel project branch heads being merged, -and merging of the whole tree can be done with a three-way merge -of the whole tree using the merge base and two head commits. 
-For reasons described later, we would not merge the subproject -parts of the trees during this step, though. - -When the two branch heads use different versions of subproject, -things get a bit tricky. First, let's forget for a moment about -the case where they bind the same project at different location. -We would refuse if they do not have the same number of `bind` -lines that bind something at the same subdirectories. - ------------- -$ git merge 'Merge in a side branch' HEAD side -error: the merged heads have subprojects bound at different places. - ours: - linux-2.6/ - appliance/ - theirs: - kernel/ - gadget/ - manual/ ------------- - -Such renaming can be handled by first moving the bind points in -our branch, and redoing the merge (this is a rare operation -anyway). It might go like this: - ------------- -$ git reset -$ git update-index --unbind linux-2.6/ -$ git update-index --unbind appliance/ -$ git update-index --bind $kernel_commit kernel/ -$ git update-index --bind $gadget_commit gadget/ -$ git commit -m 'Prepare for merge with side branch' -$ git merge 'Merge in a side branch' HEAD side -error: the merged heads have subprojects bound at different places. - ours: - kernel/ - gadget/ - theirs: - kernel/ - gadget/ - manual/ ------------- -[NOTE] -============ -Again, `update-index --unbind` is not implemented yet -on the core side. -============ - -Their branch added another subproject, so this did not work (or -it could be the other way around -- we might have been the one -with `manual/` subproject while they didn't). This suggests -that we may want an option to `git merge` to allow taking a -union of subprojects. Again, this is a rare operation, and -always taking a union would have created a toplevel project that -had both `kernel/` and `linux-2.6/` bound to the same Linux -kernel project from possibly different vintage, so it would be -prudent to require the set of bound subprojects to exactly match -and give the user an option to take a union. 
- ------------- -$ git merge --union-subprojects 'Merge in a side branch HEAD side -error: the subproject at 'kernel/' needs to be merged first. ------------- - -Here, the version of the Linux kernel project in the `side` -branch was different from what our branch had on our `bind` -line. On what kind of difference should we give this error? -Initially, I think we could require one is the fast forward of -the other (ours might be ahead of theirs, or the other way -around), and take the descendant. - -Or we could do an independent merge of subprojects heads, using -the `parent` ancestry of the bound subproject heads to find -their merge-base and doing a three-way merge. This would leave -the merge result in the subproject part of the working tree and -the index. - -[NOTE] -This is the reason we did not do the whole-tree three way merge -earlier. The subproject commit bound to the merge base commit -used for the toplevel project may not be the merge base between -the subproject commits bound to the two toplevel project -commits. - -So let's deal with the case to merge only a subproject part into -our tree first. - - -Merging subprojects -------------------- - -An operation of more practical importance is to be able to merge -in changes done outside to the projects bound to our toplevel -project. - ------------- -$ git pull --subproject=kernel/ git://git.kernel.org/.../linux-2.6/ ------------- - -might do: - -. fetch the current `HEAD` commit from Linus. -. find the subproject commit bound at kernel/ subtree. -. perform the usual three-way merge of these two commits, in - `kernel/` part of the working tree. - -After that, `git commit \--subproject` option would be needed to -make a commit. - -[NOTE] -This suggests that we would need to have something similar to -`MERGE_HEAD` for merging the subproject part. 
In the case of -merging two toplevel project commits, we probably can read the -`bind` lines from the `MERGE_HEAD` commit and either our `HEAD` -commit or our index file. Further, we probably would require -that the latter two must match, just as we currently require the -index file matches our `HEAD` commit before `git merge`. - -Just like the current `pull = fetch + merge` semantics, the -subproject aware version `git pull \--subproject=frotz/` would be -a `git fetch \--subproject=frotz/` followed by a `git merge -\--subproject=frotz/`. So the above would be: - -. Fetch the head. -+ ------------- -$ git fetch --subproject=kernel/ git://git.kernel.org/.../linux-2.6/ ------------- -+ -which would fetch the commit chain from the remote repository, and -write something like this to `FETCH_HEAD`: -+ ------------- -3ee68c4...\tfor-merge-into kernel/\tbranch 'master' of git://.../linux-2.6 ------------- - -. Run `git merge`. -+ ------------- -$ git merge --subproject=kernel/ \ - 'Merge git://.../linux-2.6 into kernel/' HEAD 3ee68c4... ------------- - -. In case it does not cleanly automerge, `git merge` would write -the necessary information for a later `git commit` to use in -`MERGE_HEAD`. It may look like this: -+ ------------- -3ee68c4af3fd7228c1be63254b9f884614f9ebb2 kernel/ ------------- -+ -Similarly, `MERGE_MSG` file will hold the merge message. - -With this, a later invocation of `git commit` to record the -result of hand resolving would be able to notice that: - -. We should be first resolving `kernel/` subproject, not the - whole thing. -. The remote `HEAD` is `3ee68c4\...` commit. -. The merge message is `Merge git://\.../linux-2.6 into kernel/`. - -and would make a merge commit, and register that resulting -commit in the index file using `update-index \--bind` instead of -updating *any* branch head. 
- - -Management of Subprojects -------------------------- - -While the above as a mechanism would support version controlling -of subprojects as a part of *one* larger toplevel project, it -probably is worth pointing out that having a separate repository -to manage the subproject independently would be a good idea. -The same subproject can be incorporated into more than one -toplevel projects, and after all, a subproject should be -something that can stand on its own. In our example scenario, -the `kernel/` project is used as a subproject for the "gadget" -product, but at the same time, the organizaton that runs the -"gadget" project may use Linux on their development machines, -and have their own kernel hackers, not necessarily related to -the use of the kernel in the "gadget" product. - -What this suggests is that not just we need to be able to pull -the kernel development history *into* the subproject of the -"gadget" project, but also we need to be able to push the -development history of the kernel part alone *out* *of* the -"gadget" project to another repository that deals only with the -kernel part. - -It might go this way. First the setup: - ------------- -$ git clone git://git.kernel.org/.../linux-2.6 Linux -$ ls -dF * -cloned/ combined/ current/ Linux/ ------------- - -That is, in addition to the `combined/` which we have been using -to develop the "gadget" product in, we now have a repository for -the kernel, cloned from Linus. In the previous section, we have -outlined how we update the kernel subproject part of `combined/` -repository from the `kernel.org` repository. The same procedure -would work for pulling from `Linux/` repository here. - -We are now going the other way; propagate the kernel work done -in the "gadget" project repository `combined/` back to `Linux/`. 
-We might do this at the lowest level: - ------------- -$ cd combined -$ git cat-file commit HEAD | - sed -ne 's|^bind \([0-9a-f]*\) kernel/$|\1|p' >.git/refs/heads/linux26 -$ git push ../Linux linux26:master ------------- - -Or, more realistically, since the `Linux` project might already -have their own commits on its `master`: - ------------- -$ cd Linux -$ git pull ../combined linux26 ------------- - -Either way we would need an easy way to maintain the `linux26` -branch in the above example, and that will have to be part of -the wrapper scripts like `git commit` (more likely, that would -be a job for `git commit \--subproject`) for the usability's -sake; in other words, the `cat-file commit` piped to `sed` above -is not something the end user would do, but something that is -done by the wrapper scripts. - -Hopefully the people who work in `Linux/` repository would run -`format-patch` and feed their changes back to the kernel -community.
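The removed notes end by extracting the commit bound at a subdirectory with `git cat-file commit HEAD` piped to `sed`. As a minimal sketch of that one step, the snippet below runs the same kind of `sed` expression over the example commit object quoted in the notes. Everything in it is an assumption made for illustration: the `bind` header was only proposed in these notes and is not written by git, and `sample_commit` and `bound_commit` are hypothetical helper names. Unlike the pipeline in the notes, the sketch stops at the first blank line so that a commit message line starting with `bind` is not mistaken for a header.

```shell
#!/bin/sh
# Hypothetical commit object text, copied from the example in the notes.
# The `bind` header was only a proposal; git itself does not emit it.
sample_commit() {
    cat <<'EOF'
tree 04803b09c300c8325258ccf2744115acc4c57067
bind 5b2bcc7b2d546c636f79490655b3347acc91d17f linux-2.6/
bind 0bdd79af62e8621359af08f0afca0ce977348ac7 appliance/
author Junio C Hamano 1137965565 -0800
committer Junio C Hamano 1137965565 -0800

Initial toplevel project commit
EOF
}

# bound_commit PREFIX: read a commit object on stdin and print the
# commit name bound at PREFIX (hypothetical helper, not a git command).
# '/^$/q' quits at the blank line that ends the header section.
bound_commit() {
    sed -n -e '/^$/q' -e "s|^bind \([0-9a-f]*\) $1\$|\1|p"
}

kernel_commit=$(sample_commit | bound_commit linux-2.6/)
echo "$kernel_commit"
```

In the workflow the notes describe, the input would come from `git cat-file commit HEAD` rather than from `sample_commit`, and the result would be written to a ref such as the `linux26` branch used in the example.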