Instead of working out descendent heads and descendent & ancestor
branches in a two-pass algorithm, this reads and stores a simplified
version of the graph topology, and works out descendent/ancestor
tags and descendent heads on demand (with a bit of caching).
The advantages of this are, first, that we now don't have to use
--topo-order on the git rev-list process. Secondly, we don't have
to re-read the whole graph when tags or heads change or even when
the graph changes. Since we can cope with parents coming before
children, we can update the graph by running a git rev-list with
arguments that just give us the new commits, and merge the new
commits into the simplified graph.
The graph is simplified in the sense that commits with exactly one
parent and one child (which is >90% of them in most cases) are grouped
together into arcs joining nodes or 'branch/merge points', which are
the commits that don't have exactly 1 parent and 1 child. This reduces
the size of the graph substantially and decreases the time to traverse
it correspondingly.
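As a rough illustration of the grouping described above, here is a
minimal sketch in Python rather than gitk's Tcl. The function name
`simplify` and the parent-map input are hypothetical and are not gitk's
actual data structures; it assumes every commit appears as a key in the
map.

```python
from collections import defaultdict

def simplify(parents):
    """Group linear runs of commits into arcs between nodes.

    `parents` maps each commit id to a list of its parent ids.
    A node is any commit that does not have exactly one parent and
    exactly one child; every other commit lies inside an arc.
    Returns (nodes, arcs), where each arc is a tuple
    (child_node, [interior commits...], parent_node).
    """
    children = defaultdict(list)
    for commit, ps in parents.items():
        for p in ps:
            children[p].append(commit)

    def is_node(c):
        return len(parents[c]) != 1 or len(children[c]) != 1

    nodes = {c for c in parents if is_node(c)}
    arcs = []
    for n in sorted(nodes):
        for p in parents[n]:
            interior = []
            # Walk towards the ancestors until the next node is reached.
            while p not in nodes:
                interior.append(p)
                p = parents[p][0]
            arcs.append((n, interior, p))
    return nodes, arcs
```

Traversals then hop from node to node along arcs instead of visiting
every commit, which is where the saving comes from.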
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes "git log --follow" to hopefully not leak memory any more, and
also cleans it up a bit to look more like some of the other functions that
use "diff_queued_diff" (by *not* using it directly as a global in the
code, but by instead just taking a pointer to the diff queue and using
that).
As to "diff_queued_diff", I think it would be better off not as a global
at all, but as being just an entry in the "struct diff_options" structure,
but that's a separate issue, and there may be some subtle reason for why
it's currently a global.
Anyway, no real changes. Instead of having a magical first entry in the
diff-queue, we now end up just keeping the diff-queue clean, and keeping
our "preferred" file pairing in an internal "choice" variable. That makes
it easy to switch the choice around when we find a better one.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ok, I've really held off doing this too damn long, because I'm lazy, and I
was always hoping that somebody else would do it.
But no, people keep asking for it, but nobody actually did anything, so I
decided I might as well bite the bullet, and instead of telling people
they could add a "--follow" flag to "git log" to do what they want to do,
I decided that it looks like I just have to do it for them..
The code wasn't actually that complicated, in that the diffstat for this
patch literally says "70 insertions(+), 1 deletions(-)", but I will have
to admit that in order to get to this fairly simple patch, you did have to
know and understand the internal git diff generation machinery pretty
well, and had to really be able to follow how commit generation interacts
with generating patches and generating the log.
So I suspect that while I was right that it wasn't that hard, I might have
been expecting too much of random people - this patch does seem to be
firmly in the core "Linus or Junio" territory.
To make a long story short: I'm sorry for it taking so long until I just
did it.
I'm not going to guarantee that this works for everybody, but you really
can just look at the patch, and after the appropriate appreciative noises
("Ooh, aah") over how clever I am, you can then just notice that the code
itself isn't really that complicated.
All the real new code is in the new "try_to_follow_renames()" function. It
really isn't rocket science: we notice that the pathname we were looking
at went away, so we start a full tree diff and try to see if we can
instead make that pathname be a rename or a copy from some other previous
pathname. And if we can, we just continue, except we show *that*
particular diff, and ever after we use the _previous_ pathname.
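
The idea above can be sketched in a few lines of Python. This is not
the C code in try_to_follow_renames(); the `follow` function and the
dictionary keys are hypothetical, and the rename information that git
would compute with a full tree diff is assumed to be given already.

```python
def follow(history, path):
    """Yield (commit_id, path) for commits touching the followed path.

    `history` is a newest-first list of dicts with hypothetical keys:
      'id'      - commit identifier
      'changed' - set of paths modified in that commit
      'renames' - {new_path: old_path}, as rename detection would report
    Only one path is tracked at a time.
    """
    for commit in history:
        if path in commit['renames']:
            # The path shows up here as a rename or copy target: show
            # this commit, then keep following the previous pathname.
            yield commit['id'], path
            path = commit['renames'][path]
        elif path in commit['changed']:
            yield commit['id'], path
```

Because a single `path` variable is carried along, the sketch can only
ever track one name at any point in the walk.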
One thing to look out for: the "rename detection" is considered to be a
singular event in the _linear_ "git log" output! That's what people want
to do, but I just wanted to point out that this patch is *not* carrying
around a "commit,pathname" kind of pair and it's *not* going to be able to
notice the file coming from multiple *different* files in earlier history.
IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind
of "files have single identities" kind of semantics, and git log will just
pick the identity based on the normal move/copy heuristics _as_if_ the
history could be linearized.
Put another way: I think the model is broken, but given the broken model,
I think this patch does just about as well as you can do. If you have
merges with the same "file" having different filenames over the two
branches, git will just end up picking _one_ of the pathnames at the point
where the newer one goes away. It never looks at multiple pathnames in
parallel.
And if you understood all that, you probably didn't need it explained, and
if you didn't understand the above blathering, it doesn't really matter to
you. What matters to you is that you can now do
git log -p --follow builtin-rev-list.c
and it will find the point where the old "rev-list.c" got renamed to
"builtin-rev-list.c" and show it as such.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is based on Jeff King's example in
20070621130137.GB4487@coredump.intra.peff.net
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/filter:
filter-branch: subdirectory filter needs --full-history
filter-branch: Simplify parent computation.
Teach filter-branch about subdirectory filtering
filter-branch: also don't fail in map() if a commit cannot be mapped
filter-branch: Use rev-list arguments to specify revision ranges.
filter-branch: fix behaviour of '-k'
filter-branch: use $(($i+1)) instead of $((i+1))
chmod +x git-filter-branch.sh
filter-branch: prevent filters from reading from stdin
t7003: make test repeatable
Add git-filter-branch
Luiz Fernando N. Capitulino noticed the one in tree-walk.h where
we cast away constness while computing the length of a tree
entry.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When there are several candidates for a rename source, and one of them
has an identical basename to the rename target, take that one.
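
A minimal sketch of the tie-break, in Python rather than the C of the
actual patch. The helper name `pick_rename_source` is hypothetical, and
the candidates are assumed to be equally plausible sources already.

```python
import posixpath

def pick_rename_source(target, candidates):
    """Pick a rename source for `target` from several candidates.

    Prefers a candidate whose basename equals the target's basename;
    otherwise falls back to the first candidate. `candidates` is
    assumed to be a non-empty list of '/'-separated paths.
    """
    want = posixpath.basename(target)
    for cand in candidates:
        if posixpath.basename(cand) == want:
            return cand
    return candidates[0]
```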
Noticed by Govind Salinas, posted by Shawn O. Pearce, partial patch
by Linus Torvalds.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jakub Narebski pointed out that the git-gui blame viewer is not a
widely known feature, but is incredibly useful. Part of the issue
is advertising. Up until now we haven't even referenced git-gui from
within the core Git manual pages, mostly because I just wasn't sure
how I wanted to supply git-gui documentation to end-users, or how
that documentation should integrate with the core Git documentation.
Based upon Jakub's comment that many users may not even know that
the gui is available in a stock Git distribution I'm offering up
two basic manual pages: git-citool and git-gui. These should offer
enough of a starting point for users to identify that the gui exists,
and how to start it. Future releases of git-gui may contain their
own documentation system available from within a running git-gui.
But not today.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now uses git-tag instead of manually constructing the tag. This gives us a
correct timestamp, removes some crufty code, and makes it work the same as
git-cvsimport.
The generated tags are now lightweight tags instead of tag objects, which may
or may not be the behaviour we want.
Also, remove two unused variables from git-cvsimport.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simon has asked that the git.git project include the git-p4 project
as at least a contrib/fast-import within git.git. I think it makes
a lot of sense, as git-p4 nicely complements the only other in-tree
fast-import user: import-tars.perl.
git-p4 is offered under the MIT license by its authors.
Raimund Bauer just discovered that the default bash completion for
a local branch name in a git-push line is not the best choice when
the branch does not exist on the remote system.
In the past we have always completed the local name 'test' as
"test:test", indicating that the destination name is the same as
the local name. But this fails when "test" does not yet exist on
the remote system, as there is no "test" branch for it to match
the name against.
Fortunately git-push does the right thing when given just the
local branch, as it assumes you want to use the same name in the
destination repository. So we now offer "test" as the completion
in a git-push line, and let git-push assume that is also the remote
branch name.
We also still support the remote branch completion after the :,
but only if the user manually adds the colon before trying to get
a completion.
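
The completion rule can be modelled roughly as follows. This is a
Python sketch, not the bash completion code itself; the function
`complete_push_ref` and its arguments are hypothetical.

```python
def complete_push_ref(word, local_branches, remote_branches):
    """Return completion candidates for one refspec word of git-push.

    Without a colon, offer plain local branch names (not "name:name").
    Once the user has typed a colon, complete the remote side only.
    """
    if ':' in word:
        src, dst = word.split(':', 1)
        return [f"{src}:{b}" for b in remote_branches if b.startswith(dst)]
    return [b for b in local_branches if b.startswith(word)]
```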
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Junio asked that we don't force the user to have a valid X11 server
configured in $DISPLAY just to obtain the output of `git gui version`.
This makes sense: the user may be an automated tool that is running
without an X server available to it, such as a build script or other
sort of package management system. Or it might just be a user working
in a non-GUI environment and wondering "what version of git-gui do I
have installed?".
Tcl has a lot of warts, but one of its better ones is that a comment
can be continued to the next line by escaping the LF that would have
ended the comment using a backslash-LF sequence. In the past we have
used this trick to escape away the 'exec wish' that is actually a Bourne
shell script and keep Tcl from executing it.
I'm using that feature here to comment out the Bourne shell script and
hide it from the Tcl engine. Except now our Bourne shell script is a
few lines long and checks to see if it should print the version, or not.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This isn't used right now in git-p4, but I use it in an external script that loads git-p4 as a module.
Signed-off-by: Simon Hausmann <shausman@trolltech.com>
Alex Riesen wanted a quieter installation process for git and its
contained git-gui. His earlier patch to do this failed to work
properly when V=1, and didn't really give a great indication of
what the installation was doing.
These rules are a little bit on the messy side, as each of our
install actions is composed of at least two variables, but in the
V=1 case the text is identical to what we had before, while in the
non-V=1 case we use some more complex rules to show the interesting
details, and hide the less interesting bits.
We now can also set QUIET= (nothing) to see the rules that are used
when V= (nothing), so we can debug those too if we have to. This is
actually a side-effect of how we insert the @ into the rules we use
for the "lists of things", like our builtins or our library files.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The blame viewer is composed of two different areas, the file
area on top and the commit area on the bottom. If users are
trying to shift the focus it is probably because they want to
shift from one area to the other, so we just set up Tab and
Shift-Tab to jump from the one half to the other in a cycle.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Mark Levedahl <mlevedahl@gmail.com> noted that installation on Cygwin
to /usr/bin can cause problems with the automatic guessing of our
library location. The problem is that installation to /usr/bin
means we actually have:
/usr/bin = c:\cygwin\bin
/usr/share = c:\cygwin\usr\share
So git-gui guesses that its library should be found within the
c:\cygwin\share directory, as that is where it should be relative
to the script itself in c:\cygwin\bin.
In my first version of this patch I tried to use `cygpath` to resolve
/usr/bin and /usr/share to test that they were in the same relative
locations, but that didn't work out correctly as we were actually
testing /usr/share against itself, so it always was equal, and we
always used relative paths. So my original solution was quite wrong.
Mark suggested we just always disable relative behavior on Cygwin,
because of the complexity of the mount mapping problem, so that's
all I'm doing.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If the remote repository does not have a "current branch", git-clone
was confused and did not set up the resulting new repository
correctly. It did not reset HEAD from the default 'master', and did
not write the SHA1 to the master branch.
Signed-off-by: Nanako Shiraishi <nanako3@bluebottle.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Frank Lichtenheld, Fri, Jun 15, 2007 03:01:53 +0200:
> +test_expect_failure 'req_Root failure (export-all w/o whitelist)' \
> + 'cat request-anonymous | git-cvsserver --export-all pserver >log 2>&1
> + || false'
This does not work, at least for bash in current Ubuntu:
GNU bash, version 3.2.13(1)-release
You have to put "||" on the previous line:
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Collect "unknown" source branches separately and register them at the end.
Also added a minor speed up to splitFilesIntoBranches by breaking out of the loop through all branches when it's safe.
Signed-off-by: Simon Hausmann <simon@lst.de>
ALLOC_GROW now expects the 'nr' argument to be "how much you
want" and not "how much you have". This fixes all cases
where we weren't previously adding anything to the 'nr'.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
add_user_info() possibly adds way more than just the commit header line.
In fact, it sometimes needs so much more space that there is a buffer
overrun, leading to an ugly crash. For example, the date is printed in its
own line, and usually takes up more space than the equivalent Unix epoch.
So, for good measure, add 80 characters (a full line) to the allocated
space, in addition to the header line length.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ALLOC_GROW macro will never let us fill the array completely,
instead allocating an extra chunk if that would be the case. This is
because the 'nr' argument was originally treated as "how much we do have
now" instead of "how much do we want". The latter makes much more
sense because you can grow by more than one item.
This off-by-one never resulted in an error because it meant we were
overly conservative about when to allocate. Any callers which passed
"how much we have now" need to be updated, or they will fail to allocate
enough.
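
To make the "how much do we want" convention concrete, here is a small
Python model of the capacity calculation. It is only a sketch: the real
ALLOC_GROW is a C macro that also reallocates the array, and the growth
formula in `alloc_nr` is an assumption made for illustration.

```python
def alloc_nr(x):
    # Growth step; the exact formula is an assumption for illustration.
    return (x + 16) * 3 // 2

def alloc_grow(alloc, nr):
    """Return the capacity needed to hold `nr` items.

    `nr` means "how many items we want to hold", so a caller about to
    append one item to an array of `have` items passes `have + 1`.
    """
    if nr > alloc:
        # Grow by the usual step, or straight to `nr` if that is larger.
        alloc = max(alloc_nr(alloc), nr)
    return alloc
```

With this convention an array of capacity 4 can hold exactly 4 items
without growing, and a request for many items at once is honoured in a
single step.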
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Based on description of commit 477f2b4131
"git log --full-diff" adding this option.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation taken from paraphrased description of "--abbrev[=<n>]"
diff option, and from description of commit 5c51c985 introducing
this option.
Note that to change the number of digits one must use "--abbrev=<n>",
which also affects diff output.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Note that git log does not understand this option yet:
$ git log --timestamp
fatal: unrecognized argument: --timestamp
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document --stale-fix, used in "git reflog expire --stale-fix --all"
to remove invalid reflog entries and fix the situation after running
a non-reflog-aware git-prune from an older git in the presence of
reflogs (see RelNotes-1.5.0.txt).
Based on description of commit 1389d9ddaa
"reflog expire --fix-stale"
which introduced this option.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A perforce command with all the files in the repo is generated to get
all the file content.
Here is a patch to break it into multiple successive perforce commands
that each use at most 4K of parameters, and collect the output for later.
It works, but not for big depots, because the whole perforce depot
content is stored in memory in P4Sync.run(), and it looks like mine is
bigger than 2 gigabytes, so I had to kill the process.
[Simon: I added the bit about using SC_ARG_MAX, as suggested by Han-Wen]
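
The batching idea can be sketched as below. This is an illustration,
not the code in git-p4: the helper `batch_args`, its `reserve`
parameter and the 4096-byte fallback are assumptions.

```python
import os

def batch_args(args, limit=None, reserve=4096):
    """Split `args` into batches whose total byte length stays within `limit`.

    If `limit` is None, use os.sysconf('SC_ARG_MAX') minus `reserve`
    bytes (left for the environment and the command itself), falling
    back to 4096 where SC_ARG_MAX is unavailable.
    """
    if limit is None:
        try:
            limit = max(os.sysconf('SC_ARG_MAX') - reserve, 4096)
        except (AttributeError, ValueError, OSError):
            limit = 4096
    batches, current, size = [], [], 0
    for arg in args:
        cost = len(arg.encode()) + 1  # +1 for the terminating NUL
        if current and size + cost > limit:
            batches.append(current)
            current, size = [], 0
        current.append(arg)
        size += cost
    if current:
        batches.append(current)
    return batches
```

Each batch would then be passed to one perforce invocation. Note that
this bounds only the command line, not the memory used to hold the
collected output.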
Signed-off-by: Benjamin Sergeant <bsergean@gmail.com>
Signed-off-by: Simon Hausmann <simon@lst.de>
Randal L. Schwartz noticed compilation problems on SunOS, which made
me look at the code again. The thing is, h_errno is not used by
connect(2), it is only for functions from netdb.h, like gethostbyname.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/remote:
git-push: Update description of refspecs and add examples
remote.c: "git-push frotz" should update what matches at the source.
remote.c: fix "git push" weak match disambiguation
remote.c: minor clean-up of match_explicit()
remote.c: refactor creation of new dst ref
remote.c: refactor match_explicit_refs()
* fl/cvsserver:
cvsserver: Actually implement --export-all
cvsserver: Let --base-path and pserver get along just fine
cvsserver: Add some useful commandline options
* lh/submodule:
gitmodules(5): remove leading period from synopsis
Add gitmodules(5)
git-submodule: give submodules proper names
Rename sections from "module" to "submodule" in .gitmodules
git-submodule: remember to checkout after clone
t7400: barf if git-submodule removes or replaces a file
You don't need to use string eval to define new functions; assigning a
code reference to the target symbol table is enough.
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refspecs with no colons are left with no dst value, because they are
interpreted differently for fetch and push. For push, they mean to
reuse the src side. Fix this for patterns.
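
The intended behaviour can be modelled in a few lines of Python. This
is a sketch, not the parser in remote.c; `parse_refspec` and its
`for_push` flag are hypothetical names.

```python
def parse_refspec(spec, for_push):
    """Split a refspec into (force, src, dst).

    With no colon, dst stays None for fetch; for push it defaults to
    src, which also covers patterns such as "refs/heads/*".
    """
    force = spec.startswith('+')
    if force:
        spec = spec[1:]
    if ':' in spec:
        src, dst = spec.split(':', 1)
    else:
        src, dst = spec, (spec if for_push else None)
    return force, src, dst
```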
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>