A short git tutorial
====================
|
|
|
|
May 2005
|
|
|
|
|
|
|
|
|
|
|
|
Introduction
------------
|
|
|
|
|
|
|
|
This is trying to be a short tutorial on setting up and using a git
|
|
|
|
archive, mainly because being hands-on and using explicit examples is
|
|
|
|
often the best way of explaining what is going on.
|
|
|
|
|
|
|
|
In normal life, most people wouldn't use the "core" git programs
|
|
|
|
directly, but rather script around them to make them more palatable.
|
|
|
|
Understanding the core git stuff may help some people get those scripts
|
|
|
|
done, though, and it may also be instructive in helping people
|
|
|
|
understand what it is that the higher-level helper scripts are actually
|
|
|
|
doing.
|
|
|
|
|
|
|
|
The core git is often called "plumbing", with the prettier user
|
|
|
|
interfaces on top of it called "porcelain". You may not want to use the
|
|
|
|
plumbing directly very often, but it can be good to know what the
|
|
|
|
plumbing does for when the porcelain isn't flushing...
|
|
|
|
|
|
|
|
|
|
|
|
Creating a git archive
----------------------
|
|
|
|
|
|
|
|
Creating a new git archive couldn't be easier: all git archives start
|
|
|
|
out empty, and the only thing you need to do is find yourself a
|
|
|
|
subdirectory that you want to use as a working tree - either an empty
|
|
|
|
one for a totally new project, or an existing working tree that you want
|
|
|
|
to import into git.
|
|
|
|
|
|
|
|
For our first example, we're going to start a totally new archive from
|
|
|
|
scratch, with no pre-existing files, and we'll call it "git-tutorial".
|
|
|
|
To start up, create a subdirectory for it, change into that
|
|
|
|
subdirectory, and initialize the git infrastructure with "git-init-db":
|
|
|
|
|
|
|
|
mkdir git-tutorial
cd git-tutorial
git-init-db
|
|
|
|
|
|
|
|
to which git will reply
|
|
|
|
|
|
|
|
defaulting to local storage area
|
|
|
|
|
|
|
|
which is just git's way of saying that you haven't been doing anything
|
|
|
|
strange, and that it will have created a local .git directory setup for
|
|
|
|
your new project. You will now have a ".git" directory, and you can
|
|
|
|
inspect that with "ls". For your new empty project, ls should show you
|
|
|
|
three entries:
|
|
|
|
|
|
|
|
- a symlink called HEAD, pointing to "refs/heads/master"
|
|
|
|
|
|
|
|
Don't worry about the fact that the file that the HEAD link points to
|
|
|
|
doesn't even exist yet - you haven't created the commit that will
|
|
|
|
start your HEAD development branch yet.
|
|
|
|
|
|
|
|
- a subdirectory called "objects", which will contain all the git SHA1
|
|
|
|
objects of your project. You should never have any real reason to
|
|
|
|
look at the objects directly, but you might want to know that these
|
|
|
|
objects are what contains all the real _data_ in your repository.
|
|
|
|
|
|
|
|
- a subdirectory called "refs", which contains references to objects.
|
|
|
|
|
|
|
|
In particular, the "refs" subdirectory will contain two other
|
|
|
|
subdirectories, named "heads" and "tags" respectively. They do
|
|
|
|
exactly what their names imply: they contain references to any number
|
|
|
|
of different "heads" of development (aka "branches"), and to any
|
|
|
|
"tags" that you have created to name specific versions of your
|
|
|
|
repository.
|
|
|
|
|
|
|
|
One note: the special "master" head is the default branch, which is
|
|
|
|
why the .git/HEAD file was created as a symlink to it even if it
|
|
|
|
doesn't yet exist. Basically, the HEAD link is supposed to always
|
|
|
|
point to the branch you are working on right now, and you always
|
|
|
|
start out expecting to work on the "master" branch.
|
|
|
|
|
|
|
|
However, this is only a convention, and you can name your branches
|
|
|
|
anything you want, and don't have to ever even _have_ a "master"
|
|
|
|
branch. A number of the git tools will assume that .git/HEAD is
|
|
|
|
valid, though.
|
|
|
|
|
|
|
|
[ Implementation note: an "object" is identified by its 160-bit SHA1
|
|
|
|
hash, aka "name", and a reference to an object is always the 40-byte
|
|
|
|
hex representation of that SHA1 name. The files in the "refs"
|
|
|
|
subdirectory are expected to contain these hex references (usually
|
|
|
|
with a final '\n' at the end), and you should thus expect to see a
|
|
|
|
number of 41-byte files containing these references in these refs
subdirectories when you actually start populating your tree ]
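You can convince yourself of the sizes in the note above without involving git at all: write a hex SHA1 plus a newline into a scratch file and count the bytes. This is only a sketch. The scratch directory and the file name "refs-demo" are made up for illustration, and it assumes the "sha1sum" utility from GNU coreutils is installed.

```shell
# Scratch directory; nothing here touches a real .git directory.
demo=$(mktemp -d)

# Any SHA1 will do for the size check, so hash a throwaway string.
name=$(printf 'example' | sha1sum | cut -d' ' -f1)

# A ref file is the 40 hex characters followed by a newline.
echo "$name" > "$demo/refs-demo"

hexlen=${#name}
filelen=$(( $(wc -c < "$demo/refs-demo") ))
echo "hex name: $hexlen characters, ref file: $filelen bytes"
```

The 160 bits of a SHA1 are 20 raw bytes, which is where the 40 hex digits come from, and the trailing newline accounts for the 41st byte.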
|
|
|
|
|
|
|
|
You have now created your first git archive. Of course, since it's
|
|
|
|
empty, that's not very useful, so let's start populating it with data.
|
|
|
|
|
|
|
|
|
|
|
|
Populating a git archive
------------------------
|
|
|
|
|
|
|
|
We'll keep this simple and stupid, so we'll start off with populating a
|
|
|
|
few trivial files just to get a feel for it.
|
|
|
|
|
|
|
|
Start off with just creating any random files that you want to maintain
|
|
|
|
in your git archive. We'll start off with a few bad examples, just to
|
|
|
|
get a feel for how this works:
|
|
|
|
|
|
|
|
echo "Hello World" > a
echo "Silly example" > b
|
|
|
|
|
|
|
|
you have now created two files in your working directory, but to
|
|
|
|
actually check in your hard work, you will have to go through two steps:
|
|
|
|
|
|
|
|
- fill in the "cache" aka "index" file with the information about your
|
|
|
|
working directory state
|
|
|
|
|
|
|
|
- commit that index file as an object.
|
|
|
|
|
|
|
|
The first step is trivial: when you want to tell git about any changes
|
|
|
|
to your working directory, you use the "git-update-cache" program. That
|
|
|
|
program normally just takes a list of filenames you want to update, but
|
|
|
|
to avoid trivial mistakes, it refuses to add new entries to the cache
|
|
|
|
(or remove existing ones) unless you explicitly tell it that you're
|
|
|
|
adding a new entry with the "--add" flag (or removing an entry with the
"--remove" flag).
|
|
|
|
|
|
|
|
So to populate the index with the two files you just created, you can do
|
|
|
|
|
|
|
|
git-update-cache --add a b
|
|
|
|
|
|
|
|
and you have now told git to track those two files.
|
|
|
|
|
|
|
|
In fact, as you did that, if you now look into your object directory,
|
|
|
|
you'll notice that git will have added two new objects to the object
|
|
|
|
store. If you did exactly the steps above, you should now be able to do
|
|
|
|
|
|
|
|
ls .git/objects/??/*
|
|
|
|
|
|
|
|
and see two files:
|
|
|
|
|
|
|
|
.git/objects/55/7db03de997c86a4a028e1ebd3a1ceb225be238
.git/objects/f2/4c74a2e500f5ee1332c86b94199f52b1d1d962
|
|
|
|
|
|
|
|
which correspond to the objects with SHA1 names 557db... and f24c7...
|
|
|
|
respectively.
|
|
|
|
|
|
|
|
If you want to, you can use "git-cat-file" to look at those objects, but
|
|
|
|
you'll have to use the object name, not the filename of the object:
|
|
|
|
|
|
|
|
git-cat-file -t 557db03de997c86a4a028e1ebd3a1ceb225be238
|
|
|
|
|
|
|
|
where the "-t" tells git-cat-file to tell you what the "type" of the
|
|
|
|
object is. Git will tell you that you have a "blob" object (ie just a
|
|
|
|
regular file), and you can see the contents with
|
|
|
|
|
|
|
|
git-cat-file "blob" 557db03de997c86a4a028e1ebd3a1ceb225be238
|
|
|
|
|
|
|
|
which will print out "Hello World". The object 557db... is nothing
|
|
|
|
more than the contents of your file "a".
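If you are curious where a name like 557db... comes from, here is a sketch that recomputes it without git. The text above does not spell the rule out, so treat it as background: in git as it exists today, a blob's name is the SHA1 of the word "blob", a space, the size in bytes, a NUL byte, and then the file contents. The helper name "blob_name" and the scratch directory are made up for illustration, and "sha1sum" from GNU coreutils is assumed to be installed.

```shell
# Scratch directory with the same two files the tutorial creates.
work=$(mktemp -d)
echo "Hello World" > "$work/a"
echo "Silly example" > "$work/b"

# blob_name FILE: print the git object name for FILE's contents,
# i.e. the SHA1 of "blob <size>\0" followed by the contents themselves.
blob_name() {
	size=$(( $(wc -c < "$1") ))
	{ printf 'blob %d\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

name_a=$(blob_name "$work/a")
name_b=$(blob_name "$work/b")

# The object file lives under a directory named after the first two
# hex digits, with the remaining 38 digits as the file name.
rest=${name_a#??}
path_a=".git/objects/${name_a%"$rest"}/$rest"

echo "a -> $name_a"
echo "b -> $name_b"
echo "a is stored as $path_a"
```

Because the name depends only on the contents, any file anywhere that contains exactly "Hello World" and a newline gets the same object name, which is another way of seeing why the object is not the same thing as the file "a".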
|
|
|
|
|
|
|
|
[ Digression: don't confuse that object with the file "a" itself. The
|
|
|
|
object is literally just those specific _contents_ of the file, and
|
|
|
|
however much you later change the contents in file "a", the object we
|
|
|
|
just looked at will never change. Objects are immutable. ]
|
|
|
|
|
|
|
|
Anyway, as we mentioned previously, you normally never actually take a
|
|
|
|
look at the objects themselves, and typing long 40-character hex SHA1
|
|
|
|
names is not something you'd normally want to do. The above digression
|
|
|
|
was just to show that "git-update-cache" did something magical, and
|
|
|
|
actually saved away the contents of your files into the git content
|
|
|
|
store.
|
|
|
|
|
|
|
|
Updating the cache did something else too: it created a ".git/index"
|
|
|
|
file. This is the index that describes your current working tree, and
|
|
|
|
something you should be very aware of. Again, you normally never worry
|
|
|
|
about the index file itself, but you should be aware of the fact that
|
|
|
|
you have not actually really "checked in" your files into git so far,
|
|
|
|
you've only _told_ git about them.
|
|
|
|
|
|
|
|
However, since git knows about them, you can now start using some of the
|
|
|
|
most basic git commands to manipulate the files or look at their status.
|
|
|
|
|
|
|
|
In particular, let's not even check in the two files into git yet, we'll
|
|
|
|
start off by adding another line to "a" first:
|
|
|
|
|
|
|
|
echo "It's a new day for git" >> a
|
|
|
|
|
|
|
|
and you can now, since you told git about the previous state of "a", ask
|
|
|
|
git what has changed in the tree compared to your old index, using the
|
|
|
|
"git-diff-files" command:
|
|
|
|
|
|
|
|
git-diff-files
|
|
|
|
|
|
|
|
oops. That wasn't very readable. It just spit out its own internal
|
|
|
|
version of a "diff", but that internal version really just tells you
|
|
|
|
that it has noticed that "a" has been modified, and that the old object
|
|
|
|
contents it had have been replaced with something else.
|
|
|
|
|
|
|
|
To make it readable, we can tell git-diff-files to output the
|
|
|
|
differences as a patch, using the "-p" flag:
|
|
|
|
|
|
|
|
git-diff-files -p
|
|
|
|
|
|
|
|
which will spit out
|
|
|
|
|
|
|
|
diff --git a/a b/a
--- a/a
+++ b/a
@@ -1 +1,2 @@
 Hello World
+It's a new day for git
|
|
|
|
|
|
|
|
ie the diff of the change we caused by adding another line to "a".
|
|
|
|
|
|
|
|
In other words, git-diff-files always shows us the difference between
|
|
|
|
what is recorded in the index, and what is currently in the working
|
|
|
|
tree. That's very useful.
|
|
|
|
|
|
|
|
A common shorthand for "git-diff-files -p" is to just write
|
|
|
|
|
|
|
|
git diff
|
|
|
|
|
|
|
|
which will do the same thing.
|
|
|
|
|
|
|
|
|
|
|
|
Committing git state
--------------------
|
|
|
|
|
|
|
|
Now, we want to go to the next stage in git, which is to take the files
|
|
|
|
that git knows about in the index, and commit them as a real tree. We do
|
|
|
|
that in two phases: creating a "tree" object, and committing that "tree"
|
|
|
|
object as a "commit" object together with an explanation of what the
|
|
|
|
tree was all about, along with information of how we came to that state.
|
|
|
|
|
|
|
|
Creating a tree object is trivial, and is done with "git-write-tree".
|
|
|
|
There are no options or other input: git-write-tree will take the
|
|
|
|
current index state, and write an object that describes that whole
|
|
|
|
index. In other words, we're now tying together all the different
|
|
|
|
filenames with their contents (and their permissions), and we're
|
|
|
|
creating the equivalent of a git "directory" object:
|
|
|
|
|
|
|
|
git-write-tree
|
|
|
|
|
|
|
|
and this will just output the name of the resulting tree, in this case
|
|
|
|
(if you have done exactly as I've described) it should be
|
|
|
|
|
|
|
|
3ede4ed7e895432c0a247f09d71a76db53bd0fa4
|
|
|
|
|
|
|
|
which is another incomprehensible object name. Again, if you want to,
|
|
|
|
you can use "git-cat-file -t 3ede4.." to see that this time the object
|
|
|
|
is not a "blob" object, but a "tree" object (you can also use
|
|
|
|
git-cat-file to actually output the raw object contents, but you'll see
|
|
|
|
mainly a binary mess, so that's less interesting).
|
|
|
|
|
|
|
|
However - normally you'd never use "git-write-tree" on its own, because
|
|
|
|
normally you always commit a tree into a commit object using the
|
|
|
|
"git-commit-tree" command. In fact, it's easier to not actually use
|
|
|
|
git-write-tree on its own at all, but to just pass its result in as an
|
|
|
|
argument to "git-commit-tree".
|
|
|
|
|
|
|
|
"git-commit-tree" normally takes several arguments - it wants to know
|
|
|
|
what the _parent_ of a commit was, but since this is the first commit
|
|
|
|
ever in this new archive, and it has no parents, we only need to pass in
|
|
|
|
the tree ID. However, git-commit-tree also wants to get a commit message
|
|
|
|
on its standard input, and it will write out the resulting ID for the
|
|
|
|
commit to its standard output.
|
|
|
|
|
|
|
|
And this is where we start using the .git/HEAD file. The HEAD file is
|
|
|
|
supposed to contain the reference to the top-of-tree, and since that's
|
|
|
|
exactly what git-commit-tree spits out, we can do this all with a simple
|
|
|
|
shell pipeline:
|
|
|
|
|
|
|
|
echo "Initial commit" | git-commit-tree $(git-write-tree) > .git/HEAD
|
|
|
|
|
|
|
|
which will say:
|
|
|
|
|
|
|
|
Committing initial tree 3ede4ed7e895432c0a247f09d71a76db53bd0fa4
|
|
|
|
|
|
|
|
just to warn you about the fact that it created a totally new commit
|
|
|
|
that is not related to anything else. Normally you do this only _once_
|
|
|
|
for a project ever, and all later commits will be parented on top of an
|
|
|
|
earlier commit, and you'll never see this "Committing initial tree"
|
|
|
|
message ever again.
|
|
|
|
|
|
|
|
Again, normally you'd never actually do this by hand. There is a
|
|
|
|
helpful script called "git commit" that will do all of this for you. So
|
|
|
|
you could have just written
|
|
|
|
|
|
|
|
git commit
|
|
|
|
|
|
|
|
instead, and it would have done the above magic scripting for you.
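For readers following along with a present-day git, here is a sketch of the same two-step commit using today's spellings of the plumbing commands. The mapping is an assumption layered on top of this tutorial: "git init" for git-init-db, "git update-index" for git-update-cache, and "git update-ref" in place of redirecting into .git/HEAD. The scratch repository and the author identity are made up for illustration.

```shell
# Assumption: a present-day git is installed. The identity below is a
# placeholder so that commit-tree does not depend on any user config.
export GIT_AUTHOR_NAME="Tutorial Reader" GIT_AUTHOR_EMAIL="reader@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
git -C "$repo" init -q
echo "Hello World" > "$repo/a"
echo "Silly example" > "$repo/b"

# Step 1: tell the index about the two new files.
git -C "$repo" update-index --add a b

# Step 2: write the index out as a tree, then wrap it in a commit.
# The commit message comes from standard input, the commit name goes
# to standard output.
tree=$(git -C "$repo" write-tree)
commit=$(echo "Initial commit" | git -C "$repo" commit-tree "$tree")

# Point the current branch at the new commit.
git -C "$repo" update-ref HEAD "$commit"

tree_type=$(git -C "$repo" cat-file -t "$tree")
commit_type=$(git -C "$repo" cat-file -t "$commit")
first_line=$(git -C "$repo" cat-file commit "$commit" | head -n 1)
echo "$tree is a $tree_type"
echo "$commit is a $commit_type"
echo "$first_line"
```

The last line printed is the first line of the commit object itself, which names the tree it wraps. The tree name you get from a newer git need not match the one quoted above, and the commit name will differ on every run because it includes a timestamp.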
|
|
|
|
|
|
|
|
|
|
|
|
Making a change
---------------
|
|
|
|
|
|
|
|
Remember how we did the "git-update-cache" on file "a" and then we
|
|
|
|
changed "a" afterward, and could compare the new state of "a" with the
|
|
|
|
state we saved in the index file?
|
|
|
|
|
|
|
|
Further, remember how I said that "git-write-tree" writes the contents
|
|
|
|
of the _index_ file to the tree, and thus what we just committed was in
|
|
|
|
fact the _original_ contents of the file "a", not the new ones. We did
|
|
|
|
that on purpose, to show the difference between the index state, and the
|
|
|
|
state in the working directory, and how they don't have to match, even
|
|
|
|
when we commit things.
|
|
|
|
|
|
|
|
As before, if we do "git-diff-files -p" in our git-tutorial project,
|
|
|
|
we'll still see the same difference we saw last time: the index file
|
|
|
|
hasn't changed by the act of committing anything. However, now that we
|
|
|
|
have committed something, we can also learn to use a new command:
|
|
|
|
"git-diff-cache".
|
|
|
|
|
|
|
|
Unlike "git-diff-files", which showed the difference between the index
|
|
|
|
file and the working directory, "git-diff-cache" shows the differences
|
|
|
|
between a committed _tree_ and either the index file or the working
|
|
|
|
directory. In other words, git-diff-cache wants a tree to be diffed
|
|
|
|
against, and before we did the commit, we couldn't do that, because we
|
|
|
|
didn't have anything to diff against.
|
|
|
|
|
|
|
|
But now we can do
|
|
|
|
|
|
|
|
git-diff-cache -p HEAD
|
|
|
|
|
|
|
|
(where "-p" has the same meaning as it did in git-diff-files), and it
|
|
|
|
will show us the same difference, but for a totally different reason.
|
|
|
|
Now we're comparing the working directory not against the index file,
|
|
|
|
but against the tree we just wrote. It just so happens that those two
|
|
|
|
are obviously the same, so we get the same result.
|
|
|
|
|
|
|
|
Again, because this is a common operation, you can also just shorthand
|
|
|
|
it with
|
|
|
|
|
|
|
|
git diff HEAD
|
|
|
|
|
|
|
|
which ends up doing the above for you.
|
|
|
|
|
|
|
|
In other words, "git-diff-cache" normally compares a tree against the
|
|
|
|
working directory, but when given the "--cached" flag, it is told to
|
|
|
|
instead compare against just the index cache contents, and ignore the
|
|
|
|
current working directory state entirely. Since we just wrote the index
|
|
|
|
file to HEAD, doing "git-diff-cache --cached -p HEAD" should thus return
|
|
|
|
an empty set of differences, and that's exactly what it does.
|
|
|
|
|
|
|
|
[ Digression: "git-diff-cache" really always uses the index for its
|
|
|
|
comparisons, and saying that it compares a tree against the working
|
|
|
|
directory is thus not strictly accurate. In particular, the list of
|
|
|
|
files to compare (the "meta-data") _always_ comes from the index file,
|
|
|
|
regardless of whether the --cached flag is used or not. The --cached
|
|
|
|
flag really only determines whether the file _contents_ to be compared
|
|
|
|
come from the working directory or not.
|
|
|
|
|
|
|
|
This is not hard to understand, as soon as you realize that git simply
|
|
|
|
never knows (or cares) about files that it is not told about
|
|
|
|
explicitly. Git will never go _looking_ for files to compare, it
|
|
|
|
expects you to tell it what the files are, and that's what the index
|
|
|
|
is there for. ]
|
|
|
|
|
|
|
|
However, our next step is to commit the _change_ we did, and again, to
|
|
|
|
understand what's going on, keep in mind the difference between "working
|
|
|
|
directory contents", "index file" and "committed tree". We have changes
|
|
|
|
in the working directory that we want to commit, and we always have to
|
|
|
|
work through the index file, so the first thing we need to do is to
|
|
|
|
update the index cache:
|
|
|
|
|
|
|
|
git-update-cache a
|
|
|
|
|
|
|
|
(note how we didn't need the "--add" flag this time, since git knew
|
|
|
|
about the file already).
|
|
|
|
|
|
|
|
Note what happens to the different git-diff-xxx versions here. After
|
|
|
|
we've updated "a" in the index, "git-diff-files -p" now shows no
|
|
|
|
differences, but "git-diff-cache -p HEAD" still _does_ show that the
|
|
|
|
current state is different from the state we committed. In fact, now
|
|
|
|
"git-diff-cache" shows the same difference whether we use the "--cached"
|
|
|
|
flag or not, since now the index is coherent with the working directory.
|
|
|
|
|
|
|
|
Now, since we've updated "a" in the index, we can commit the new
|
|
|
|
version. We could do it by writing the tree by hand again, and
|
|
|
|
committing the tree (this time we'd have to use the "-p HEAD" flag to
|
|
|
|
tell commit that the HEAD was the _parent_ of the new commit, and that
|
|
|
|
this wasn't an initial commit any more), but you've done that once
|
|
|
|
already, so let's just use the helpful script this time:
|
|
|
|
|
|
|
|
git commit
|
|
|
|
|
|
|
|
which starts an editor for you to write the commit message and tells you
|
|
|
|
a bit about what you're doing.
|
|
|
|
|
|
|
|
Write whatever message you want, and all the lines that start with '#'
|
|
|
|
will be pruned out, and the rest will be used as the commit message for
|
|
|
|
the change. If you decide you don't want to commit anything after all at
|
|
|
|
this point (you can continue to edit things and update the cache), you
|
|
|
|
can just leave an empty message. Otherwise git-commit-script will commit
|
|
|
|
the change for you.
|
|
|
|
|
|
|
|
You've now made your first real git commit. And if you're interested in
|
|
|
|
looking at what git-commit-script really does, feel free to investigate:
|
|
|
|
it's a few very simple shell scripts to generate the helpful (?) commit
|
|
|
|
message headers, and a few one-liners that actually do the commit itself.
|
|
|
|
|
|
|
|
|
|
|
|
Checking it out
---------------
|
|
|
|
|
|
|
|
While creating changes is useful, it's even more useful if you can tell
|
|
|
|
later what changed. The most useful command for this is another of the
|
|
|
|
"diff" family, namely "git-diff-tree".
|
|
|
|
|
|
|
|
git-diff-tree can be given two arbitrary trees, and it will tell you the
|
|
|
|
differences between them. Perhaps even more commonly, though, you can
|
|
|
|
give it just a single commit object, and it will figure out the parent
|
|
|
|
of that commit itself, and show the difference directly. Thus, to get
|
|
|
|
the same diff that we've already seen several times, we can now do
|
|
|
|
|
|
|
|
git-diff-tree -p HEAD
|
|
|
|
|
|
|
|
(again, "-p" means to show the difference as a human-readable patch),
|
|
|
|
and it will show what the last commit (in HEAD) actually changed.
|
|
|
|
|
|
|
|
More interestingly, you can also give git-diff-tree the "-v" flag, which
|
|
|
|
tells it to also show the commit message and author and date of the
|
|
|
|
commit, and you can tell it to show a whole series of diffs.
|
|
|
|
Alternatively, you can tell it to be "silent", and not show the diffs at
|
|
|
|
all, but just show the actual commit message.
|
|
|
|
|
|
|
|
In fact, together with the "git-rev-list" program (which generates a
|
|
|
|
list of revisions), git-diff-tree ends up being a veritable fount of
|
|
|
|
changes. A trivial (but very useful) script called "git-whatchanged" is
|
|
|
|
included with git which does exactly this, and shows a log of recent
|
|
|
|
activity.
|
|
|
|
|
|
|
|
To see the whole history of our pitiful little git-tutorial project, you
|
|
|
|
can do
|
|
|
|
|
|
|
|
git log
|
|
|
|
|
|
|
|
which shows just the log messages, or if we want to see the log together
|
|
|
|
with the associated patches use the more complex (and much more
|
|
|
|
powerful)
|
|
|
|
|
|
|
|
git-whatchanged -p --root
|
|
|
|
|
|
|
|
and you will see exactly what has changed in the repository over its
|
|
|
|
short history.
|
|
|
|
|
|
|
|
[ Side note: the "--root" flag is a flag to git-diff-tree to tell it to
|
|
|
|
show the initial aka "root" commit too. Normally you'd probably not
|
|
|
|
want to see the initial import diff, but since the tutorial project
|
|
|
|
was started from scratch and is so small, we use it to make the result
|
|
|
|
a bit more interesting ]
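To get a feel for how little there is to a "whatchanged"-style log, here is a sketch that feeds the output of rev-list straight into diff-tree. It is not the actual git-whatchanged script, just the same idea spelled with a present-day git, and it rebuilds the two-commit tutorial history in a scratch repository first. The repository location, the identity and the second commit message are made up for illustration.

```shell
# Placeholder identity so commit-tree does not depend on user config.
export GIT_AUTHOR_NAME="Tutorial Reader" GIT_AUTHOR_EMAIL="reader@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
git -C "$repo" init -q
echo "Hello World" > "$repo/a"
echo "Silly example" > "$repo/b"
git -C "$repo" update-index --add a b
c1=$(echo "Initial commit" | git -C "$repo" commit-tree "$(git -C "$repo" write-tree)")

# Second commit, parented on the first with "-p".
echo "It's a new day for git" >> "$repo/a"
git -C "$repo" update-index a
c2=$(echo "Add a line to a" | git -C "$repo" commit-tree "$(git -C "$repo" write-tree)" -p "$c1")
git -C "$repo" update-ref HEAD "$c2"

# rev-list prints one commit name per line, newest first; diff-tree
# reads them on standard input and shows each commit as a patch.
revs=$(git -C "$repo" rev-list HEAD)
log=$(git -C "$repo" rev-list HEAD | git -C "$repo" diff-tree --stdin --root -p)
count=$(( $(echo "$revs" | wc -l) ))

echo "$count commits"
echo "$log"
```

Without "--root" the initial commit would be listed by rev-list but diff-tree would show no patch for it, since it has no parent to diff against.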
|
|
|
|
|
|
|
|
With that, you should now have some inkling of what git does, and
|
|
|
|
can explore on your own.
|
|
|
|
|
|
|
|
|
|
|
|
Copying archives
----------------
|
|
|
|
|
|
|
|
Git archives are normally totally self-sufficient, and it's worth noting
|
|
|
|
that unlike CVS, for example, there is no separate notion of
|
|
|
|
"repository" and "working tree". A git repository normally _is_ the
|
|
|
|
working tree, with the local git information hidden in the ".git"
|
|
|
|
subdirectory. There is nothing else. What you see is what you got.
|
|
|
|
|
|
|
|
[ Side note: you can tell git to split the git internal information from
|
|
|
|
the directory that it tracks, but we'll ignore that for now: it's not
|
|
|
|
how normal projects work, and it's really only meant for special uses.
|
|
|
|
So the mental model of "the git information is always tied directly to
|
|
|
|
the working directory that it describes" may not be technically 100%
|
|
|
|
accurate, but it's a good model for all normal use ]
|
|
|
|
|
|
|
|
This has two implications:
|
|
|
|
|
|
|
|
- if you grow bored with the tutorial archive you created (or you've
|
|
|
|
made a mistake and want to start all over), you can simply do
|
|
|
|
|
|
|
|
rm -rf git-tutorial
|
|
|
|
|
|
|
|
and it will be gone. There's no external repository, and there's no
|
|
|
|
history outside of the project you created.
|
|
|
|
|
|
|
|
- if you want to move or duplicate a git archive, you can do so. You
don't need "git clone" for this: if you want to create a copy of your
|
|
|
|
archive (with all the full history that went along with it), you can
|
|
|
|
do so with a regular "cp -a git-tutorial new-git-tutorial".
|
|
|
|
|
|
|
|
Note that when you've moved or copied a git archive, your git index
|
|
|
|
file (which caches various information, notably some of the "stat"
|
|
|
|
information for the files involved) will likely need to be refreshed.
|
|
|
|
So after you do a "cp -a" to create a new copy, you'll want to do
|
|
|
|
|
|
|
|
git-update-cache --refresh
|
|
|
|
|
|
|
|
to make sure that the index file is up-to-date in the new one.
|
|
|
|
|
|
|
|
Note that the second point is true even across machines. You can
|
|
|
|
duplicate a remote git archive with _any_ regular copy mechanism, be it
|
|
|
|
"scp", "rsync" or "wget".
|
|
|
|
|
|
|
|
When copying a remote repository, you'll want to at a minimum update the
|
|
|
|
index cache when you do this, and especially with other people's
|
|
|
|
repositories you often want to make sure that the index cache is in some
|
|
|
|
known state (you don't know _what_ they've done and not yet checked in),
|
|
|
|
so usually you'll precede the "git-update-cache" with a
|
|
|
|
|
|
|
|
git-read-tree --reset HEAD
git-update-cache --refresh
|
|
|
|
|
|
|
|
which will force a total index re-build from the tree pointed to by HEAD
|
|
|
|
(it resets the index contents to HEAD, and then the git-update-cache
|
|
|
|
makes sure to match up all index entries with the checked-out files).
|
|
|
|
|
|
|
|
The above can also be written as simply
|
|
|
|
|
|
|
|
git reset
|
|
|
|
|
|
|
|
and in fact a lot of the common git command combinations can be scripted
|
|
|
|
with the "git xyz" interfaces, and you can learn things by just looking
|
|
|
|
at what the git-*-script scripts do ("git reset" is the above two lines
|
|
|
|
implemented in "git-reset-script", but some things like "git status" and
|
|
|
|
"git commit" are slightly more complex scripts around the basic git
|
|
|
|
commands).
|
|
|
|
|
|
|
|
NOTE! Many (most?) public remote repositories will not contain any of
|
|
|
|
the checked out files or even an index file, and will _only_ contain the
|
|
|
|
actual core git files. Such a repository usually doesn't even have the
|
|
|
|
".git" subdirectory, but has all the git files directly in the
|
|
|
|
repository.
|
|
|
|
|
|
|
|
To create your own local live copy of such a "raw" git repository, you'd
|
|
|
|
first create your own subdirectory for the project, and then copy the
|
|
|
|
raw repository contents into the ".git" directory. For example, to
|
|
|
|
create your own copy of the git repository, you'd do the following
|
|
|
|
|
|
|
|
mkdir my-git
cd my-git
rsync -rL rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/git.git/ .git
|
|
|
|
|
|
|
|
followed by
|
|
|
|
|
|
|
|
git-read-tree HEAD
|
|
|
|
|
|
|
|
to populate the index. However, now you have populated the index, and
|
|
|
|
you have all the git internal files, but you will notice that you don't
|
|
|
|
actually have any of the _working_directory_ files to work on. To get
|
|
|
|
those, you'd check them out with
|
|
|
|
|
|
|
|
git-checkout-cache -u -a
|
|
|
|
|
|
|
|
where the "-u" flag means that you want the checkout to keep the index
|
|
|
|
up-to-date (so that you don't have to refresh it afterward), and the
"-a" flag means "check out all files" (if you have a stale copy or an
older version of a checked out tree you may also need to add the "-f"
flag first, to tell git-checkout-cache to _force_ overwriting of any old
|
|
|
|
files).
|
|
|
|
|
|
|
|
Again, this can all be simplified with
|
|
|
|
|
|
|
|
git clone rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/git.git/ my-git
cd my-git
git checkout
|
|
|
|
|
|
|
|
which will end up doing all of the above for you.
|
|
|
|
|
|
|
|
You have now successfully copied somebody else's (mine) remote
|
|
|
|
repository, and checked it out.
|
|
|
|
|
|
|
|
|
|
|
|
Creating a new branch
---------------------
|
|
|
|
|
|
|
|
Branches in git are really nothing more than pointers into the git
|
|
|
|
object space from within the ".git/refs/" subdirectory, and as we
|
|
|
|
already discussed, the HEAD branch is nothing but a symlink to one of
|
|
|
|
these object pointers.
|
|
|
|
|
|
|
|
You can at any time create a new branch by just picking an arbitrary
|
|
|
|
point in the project history, and just writing the SHA1 name of that
|
|
|
|
object into a file under .git/refs/heads/. You can use any filename you
|
|
|
|
want (and indeed, subdirectories), but the convention is that the
|
|
|
|
"normal" branch is called "master". That's just a convention, though,
|
|
|
|
and nothing enforces it.
|
|
|
|
|
|
|
|
To show that as an example, let's go back to the git-tutorial archive we
|
|
|
|
used earlier, and create a branch in it. You literally do that by just
|
|
|
|
creating a new SHA1 reference file, and switch to it by just making the
|
|
|
|
HEAD pointer point to it:
|
|
|
|
|
|
|
|
cat .git/HEAD > .git/refs/heads/mybranch
ln -sf refs/heads/mybranch .git/HEAD
|
|
|
|
|
|
|
|
and you're done.
|
|
|
|
|
|
|
|
Now, if you make the decision to start your new branch at some other
|
|
|
|
point in the history than the current HEAD, you usually also want to
|
|
|
|
actually switch the contents of your working directory to that point
|
|
|
|
when you switch the head, and "git checkout" will do that for you:
|
|
|
|
instead of switching the branch by hand with "ln -sf", you can just do
|
|
|
|
|
|
|
|
git checkout mybranch
|
|
|
|
|
|
|
|
which will basically "jump" to the branch specified, update your working
|
|
|
|
directory to that state, and also make it become the new default HEAD.
|
|
|
|
|
|
|
|
You can always just jump back to your original "master" branch by doing
|
|
|
|
|
|
|
|
git checkout master
|
|
|
|
|
|
|
|
and if you forget which branch you happen to be on, a simple
|
|
|
|
|
|
|
|
ls -l .git/HEAD
|
|
|
|
|
|
|
|
will tell you where it's pointing.
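Since a branch is just a file and HEAD is just a symlink, the whole mechanism can be tried out on a mock-up without running any git command. The sketch below builds a stand-in ".git" directory by hand. The scratch location and the commit name written into "master" are placeholders, not real objects. It mirrors the layout described in this tutorial only: present-day git usually keeps HEAD as a small text file rather than a symlink.

```shell
# Scratch stand-in for a repository; the commit name below is a
# placeholder value, not a real object.
demo=$(mktemp -d)
mkdir -p "$demo/.git/refs/heads"
echo "0123456789abcdef0123456789abcdef01234567" > "$demo/.git/refs/heads/master"
ln -s refs/heads/master "$demo/.git/HEAD"

# Create the branch: copy whatever HEAD currently resolves to.
cat "$demo/.git/HEAD" > "$demo/.git/refs/heads/mybranch"

# Switch to it: repoint the HEAD symlink.
ln -sf refs/heads/mybranch "$demo/.git/HEAD"

current=$(readlink "$demo/.git/HEAD")
echo "HEAD -> $current"
echo "mybranch starts at $(cat "$demo/.git/refs/heads/mybranch")"
```

Note that the symlink target is relative to the ".git" directory itself, which is why "refs/heads/mybranch" is enough and no leading ".git/" is needed.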
|
|
|
|
|
|
|
|
|
|
|
|
Merging two branches
--------------------
|
|
|
|
|
|
|
|
One of the ideas of having a branch is that you do some (possibly
|
|
|
|
experimental) work in it, and eventually merge it back to the main
|
|
|
|
branch. So assuming you created the above "mybranch" that started out
|
|
|
|
being the same as the original "master" branch, let's make sure we're in
|
|
|
|
that branch, and do some work there.
|
|
|
|
|
|
|
|
git checkout mybranch
echo "Work, work, work" >> a
git commit a
|
|
|
|
|
|
|
|
Here, we just added another line to "a", and we used a shorthand for
|
|
|
|
both doing a "git-update-cache a" and "git commit" by just giving the
|
|
|
|
filename directly to "git commit".
|
|
|
|
|
|
|
|
Now, to make it a bit more interesting, let's assume that somebody else
|
|
|
|
does some work in the original branch, and simulate that by going back
|
|
|
|
to the master branch, and editing the same file differently there:
|
|
|
|
|
|
|
|
git checkout master
|
|
|
|
|
|
|
|
Here, take a moment to look at the contents of "a", and notice how they
|
|
|
|
don't contain the work we just did in "mybranch" - because that work
|
|
|
|
hasn't happened in the "master" branch at all. Then do
|
|
|
|
|
|
|
|
echo "Play, play, play" >> a
echo "Lots of fun" >> b
git commit a b
|
|
|
|
|
|
|
|
since the master branch is obviously in a much better mood.
|
|
|
|
|
|
|
|
Now, you've got two branches, and you decide that you want to merge the
|
|
|
|
work done. Before we do that, let's introduce a cool graphical tool that
|
|
|
|
helps you view what's going on:
|
|
|
|
|
|
|
|
gitk --all
|
|
|
|
|
|
|
|
will show you graphically both of your branches (that's what the "--all"
|
|
|
|
means: normally it will just show you your current HEAD) and their
|
|
|
|
histories. You can also see exactly how they came to be from a common
|
|
|
|
source.
|
|
|
|
|
|
|
|
Anyway, let's exit gitk (^Q or the File menu), and decide that we want
|
|
|
|
to merge the work we did on the "mybranch" branch into the "master"
|
|
|
|
branch (which is currently our HEAD too). To do that, there's a nice
|
|
|
|
script called "git resolve", which wants to know which branches you want
|
|
|
|
to resolve and what the merge is all about:
|
|
|
|
|
|
|
|
git resolve HEAD mybranch "Merge work in mybranch"
|
|
|
|
|
|
|
|
where the third argument is going to be used as the commit message if
|
|
|
|
the merge can be resolved automatically.
|
|
|
|
|
|
|
|
Now, in this case we've intentionally created a situation where the
|
|
|
|
merge will need to be fixed up by hand, though, so git will do as much
|
|
|
|
of it as it can automatically (which in this case is just merging the "b"
|
|
|
|
file, which had no differences in the "mybranch" branch), and say:
|
|
|
|
|
|
|
|
Simple merge failed, trying Automatic merge
|
|
|
|
Auto-merging a.
|
|
|
|
merge: warning: conflicts during merge
|
|
|
|
ERROR: Merge conflict in a.
|
|
|
|
fatal: merge program failed
|
|
|
|
Automatic merge failed, fix up by hand
|
|
|
|
|
|
|
|
which is way too verbose, but it basically tells you that it failed the
|
|
|
|
really trivial merge ("Simple merge") and did an "Automatic merge"
|
|
|
|
instead, but that too failed due to conflicts in "a".
|
|
|
|
|
|
|
|
Not to worry. It left the (trivial) conflict in "a" in the same form you
|
|
|
|
should already be well used to if you've ever used CVS, so let's just
|
|
|
|
open "a" in our editor (whatever that may be), and fix it up somehow.
|
|
|
|
I'd suggest just making it so that "a" contains all four lines:
|
|
|
|
|
|
|
|
Hello World
|
|
|
|
It's a new day for git
|
|
|
|
Play, play, play
|
|
|
|
Work, work, work
|
|
|
|
|
|
|
|
and once you're happy with your manual merge, just do a
|
|
|
|
|
|
|
|
git commit a
|
|
|
|
|
|
|
|
which will very loudly warn you that you're now committing a merge
|
|
|
|
(which is correct, so never mind), and you can write a small merge
|
|
|
|
message about your adventures in git-merge-land.
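
For readers who have not met CVS-style conflict markers, here is a minimal sketch that reproduces the same kind of conflict outside of git. It assumes the "diff3" program from GNU diffutils is installed, and the file names "base", "master" and "mybranch" are only illustrative stand-ins for the three versions of "a"; this is not what "git resolve" runs internally, just a way to see the marker format.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# The common ancestor: "a" as it looked before the branches diverged.
printf "Hello World\nIt's a new day for git\n" > base

# Each side appends a different line at the same place.
cp base master
echo "Play, play, play" >> master
cp base mybranch
echo "Work, work, work" >> mybranch

# Three-way merge: mine, common ancestor, theirs.
# diff3 exits non-zero when it finds conflicts, so do not let "set -e" stop us.
diff3 -m master base mybranch > merged || true

# The result has both competing lines, bracketed by conflict markers.
cat merged
```

Fixing the conflict by hand means deleting the marker lines and keeping whichever text you want, which in this tutorial is both appended lines.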
|
|
|
|
|
|
|
|
After you're done, start up "gitk --all" to see graphically what the
|
|
|
|
history looks like. Notice that "mybranch" still exists, and you can
|
|
|
|
switch to it, and continue to work with it if you want to. The
|
|
|
|
"mybranch" branch will not contain the merge, but next time you merge it
|
|
|
|
from the "master" branch, git will know how you merged it, so you'll not
|
|
|
|
have to do _that_ merge again.
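
As a rough sketch, the whole exercise can be replayed in a scratch directory with a present-day git. Several things here are assumptions rather than part of this tutorial: later git versions use "git merge" where this text uses "git resolve", "git add" where it uses "git-update-cache", and they refuse a "git commit a" in the middle of a merge, so the fixed-up file is added first and then committed. The contents of "b", the commit messages and the identity settings are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Make sure the first branch is called "master" whatever the local default is.
git symbolic-ref HEAD refs/heads/master
git config user.name "Tutorial Reader"       # placeholder identity
git config user.email "reader@example.com"   # placeholder identity

printf "Hello World\nIt's a new day for git\n" > a
echo "placeholder contents" > b              # real contents of "b" do not matter here
git add a b
git commit -q -m "Initial commit"
git branch mybranch

# Work on mybranch.
git checkout -q mybranch
echo "Work, work, work" >> a
git commit -q -m "Some work" a

# Somebody else works on master.
git checkout -q master
echo "Play, play, play" >> a
echo "Lots of fun" >> b
git commit -q -m "Some fun" a b

# The merge stops on the conflict in "a" and exits non-zero.
git merge -m "Merge work in mybranch" mybranch || true

# Fix up "a" by hand (here: keep all four lines), then conclude the merge.
printf "Hello World\nIt's a new day for git\nPlay, play, play\nWork, work, work\n" > a
git add a
git commit -q -m "Merge work in mybranch"

# Text-only view of both branches, as an alternative to "gitk --all".
git --no-pager log --graph --oneline --all
```

The final log should show "master" with a merge commit that has two parents, while "mybranch" still points at its own last commit.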
|
|
|
|
|
|
|
|
|
|
|
|
Merging external work
|
|
|
|
---------------------
|
|
|
|
|
|
|
|
It's usually much more common that you merge with somebody else than
|
|
|
|
that you merge with your own branches, so it's worth pointing out that git
|
|
|
|
makes that very easy too, and in fact, it's not that different from
|
|
|
|
doing a "git resolve". A remote merge ends up being nothing
|
|
|
|
more than "fetch the work from a remote repository into a temporary tag"
|
|
|
|
followed by a "git resolve".
|
|
|
|
|
|
|
|
It's such a common thing to do that it's called "git pull", and you can
|
|
|
|
simply do
|
|
|
|
|
|
|
|
git pull <remote-repository>
|
|
|
|
|
|
|
|
and optionally give a branch-name for the remote end as a second
|
|
|
|
argument.
|
|
|
|
|
|
|
|
[ Todo: fill in real examples ]
|
|
|
|
|
|
|
|
|
|
|
|
Tagging a version
|
|
|
|
-----------------
|
|
|
|
|
|
|
|
In git, there are two kinds of tags: a "light" one and a "signed tag".
|
|
|
|
|
|
|
|
A "light" tag is technically nothing more than a branch, except we put
|
|
|
|
it in the ".git/refs/tags/" subdirectory instead of calling it a "head".
|
|
|
|
So the simplest form of tag involves nothing more than
|
|
|
|
|
|
|
|
cat .git/HEAD > .git/refs/tags/my-first-tag
|
|
|
|
|
|
|
|
after which point you can use this symbolic name for that particular
|
|
|
|
state. You can, for example, do
|
|
|
|
|
|
|
|
git diff my-first-tag
|
|
|
|
|
|
|
|
to diff your current state against that tag (which at this point will
|
|
|
|
obviously be an empty diff, but if you continue to develop and commit
|
|
|
|
stuff, you can use your tag as an "anchor-point" to see what has changed
|
|
|
|
since you tagged it).
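
A minimal sketch of the same idea in a scratch repository, assuming a present-day git: there, ".git/HEAD" usually holds a symbolic reference to the current branch rather than the commit name itself, so the sketch asks "git rev-parse HEAD" for the commit name instead of using "cat .git/HEAD". It also assumes the default on-disk layout in which refs are plain files under ".git/refs/". The file contents, commit message and identity settings are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Tutorial Reader"       # placeholder identity
git config user.email "reader@example.com"   # placeholder identity
echo "Hello World" > a
git add a
git commit -q -m "Initial commit"

# A light tag is nothing more than a file holding a commit name.
mkdir -p .git/refs/tags
git rev-parse HEAD > .git/refs/tags/my-first-tag

# The tag file and the symbolic name resolve to the same commit.
cat .git/refs/tags/my-first-tag
git rev-parse my-first-tag

# Nothing has changed since tagging, so this prints nothing.
git --no-pager diff my-first-tag
```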
|
|
|
|
|
|
|
|
A "signed tag" is actually a real git object, and contains not only a
|
|
|
|
pointer to the state you want to tag, but also a small tag name and
|
|
|
|
message, along with a PGP signature that says that yes, you really did
|
|
|
|
that tag. You create these signed tags with
|
|
|
|
|
|
|
|
git tag <tagname>
|
|
|
|
|
|
|
|
which will sign the current HEAD (but you can also give it another
|
|
|
|
argument that specifies the thing to tag, i.e. you could have tagged the
|
|
|
|
current "mybranch" point by using "git tag <tagname> mybranch").
|
|
|
|
|
|
|
|
You normally only do signed tags for major releases or things
|
|
|
|
like that, while the light-weight tags are useful for any marking you
|
|
|
|
want to do - any time you decide that you want to remember a certain
|
|
|
|
point, just create a private tag for it, and you have a nice symbolic
|
|
|
|
name for the state at that point.
|
|
|
|
|
|
|
|
[ to be continued.. cvsimports, pushing and pulling ]
|