Ok, I've really held off doing this too damn long, because I'm lazy, and I
was always hoping that somebody else would do it.
But no, people keep asking for it, but nobody actually did anything, so I
decided I might as well bite the bullet, and instead of telling people
they could add a "--follow" flag to "git log" to do what they want to do,
I decided that it looks like I just have to do it for them..
The code wasn't actually that complicated, in that the diffstat for this
patch literally says "70 insertions(+), 1 deletions(-)", but I will have
to admit that in order to get to this fairly simple patch, you did have to
know and understand the internal git diff generation machinery pretty
well, and had to really be able to follow how commit generation interacts
with generating patches and generating the log.
So I suspect that while I was right that it wasn't that hard, I might have
been expecting too much of random people - this patch does seem to be
firmly in the core "Linus or Junio" territory.
To make a long story short: I'm sorry for it taking so long until I just
did it.
I'm not going to guarantee that this works for everybody, but you really
can just look at the patch, and after the appropriate appreciative noises
("Ooh, aah") over how clever I am, you can then just notice that the code
itself isn't really that complicated.
All the real new code is in the new "try_to_follow_renames()" function. It
really isn't rocket science: we notice that the pathname we were looking
at went away, so we start a full tree diff and try to see if we can
instead make that pathname be a rename or a copy from some other previous
pathname. And if we can, we just continue, except we show *that*
particular diff, and ever after we use the _previous_ pathname.
One thing to look out for: the "rename detection" is considered to be a
singular event in the _linear_ "git log" output! That's what people want
to do, but I just wanted to point out that this patch is *not* carrying
around a "commit,pathname" kind of pair and it's *not* going to be able to
notice the file coming from multiple *different* files in earlier history.
IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind
of "files have single identities" kind of semantics, and git log will just
pick the identity based on the normal move/copy heuristics _as_if_ the
history could be linearized.
Put another way: I think the model is broken, but given the broken model,
I think this patch does just about as well as you can do. If you have
merges with the same "file" having different filenames over the two
branches, git will just end up picking _one_ of the pathnames at the point
where the newer one goes away. It never looks at multiple pathnames in
parallel.
And if you understood all that, you probably didn't need it explained, and
if you didn't understand the above blathering, it doesn't really mtter to
you. What matters to you is that you can now do
git log -p --follow builtin-rev-list.c
and it will find the point where the old "rev-list.c" got renamed to
"builtin-rev-list.c" and show it as such.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes and Marco discovered that "git log -z" spent cycles in diff even
though there is no need to actually compute diffs.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch introduces --extended-regexp and --regexp-ignore-case options to
tune what kind of patterns the pattern-limiting options (--grep, --author,
...) accept.
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes a crash in broken repositories where random commits
suddenly disappear.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds --date={local,relative,default} option to log family of commands,
to allow displaying timestamps in user's local timezone, relative time, or
the default format.
Existing --relative-date option is a synonym of --date=relative; we could
probably deprecate it in the long run.
Signed-off-by: Junio C Hamano <junkio@cox.net>
An earlier --subject-prefix patch forgot that format-patch is
not the only codepath that adds the "[PATCH]" prefix, and broke
everybody else in the log family.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is meant to be a saner replacement for "git-cherry".
When used with "A...B", this filters out commits whose patch
text has the same patch-id as a commit on the other side. It
would probably most useful to use with --left-right.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes a problem reported by Randal Schwartz:
>I finally tracked down all the (albeit inconsequential) errors I was getting
>on both OpenBSD and OSX. It's the warn() function in usage.c. There's
>warn(3) in BSD-style distros. It'd take a "great rename" to change it, but if
>someone with better C skills than I have could do that, my linker and I would
>appreciate it.
It was annoying to me, too, when I was doing some mergetool testing on
Mac OS X, so here's a fix.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Randal L. Schwartz" <merlyn@stonehenge.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes slightly more lines than it adds, but the real reason for
doing this is that future optimizations will require more setup of the
tree descriptor, and so we want to do it in one place.
Also renamed the "desc.buf" field to "desc.buffer" just to trigger
compiler errors for old-style manual initializations, making sure I
didn't miss anything.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If your development history does not have fast-forward merges,
i.e. the "first parent" of commits in your history are special
than other parents, this option gives a better overview of the
evolution of a particular branch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This uses diff-tree --quiet machinery to terminate the internal
diff-tree between a commit and its parents via revs.pruning (not
revs.diffopt) as soon as we find enough about the tree change.
With respect to the optionally given pathspec, we are interested
if the tree of commit is identical to the parent's, only adds
new paths to the parent's, or there are other differences. As
soon as we find out that there is one such other kind of
difference, we do not have to compare the rest of the tree.
Because we do not call standard diff_addremove/diff_change, we
instruct the diff-tree machinery to stop early by setting
has_changes when we say we found the trees to be different.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This explains how tree_difference variable is used, and updates two
places where the code knows symbolic constant REV_TREE_SAME is 0.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When the list is truly limited and get_revision_1() returned NULL,
the code incorrectly returned it without switching to boundary emiting
mode. Silly.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This moves the code to set SHOWN on the commit from get_revision_1()
back to get_revision(), so that the bit means what it originally
meant: this commit has been given back to the caller.
Also it fixes the --reverse breakage Dscho pointed out.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes the flag internally used by revision traversal to
decide which commits are indeed boundaries and renames it to
CHILD_SHOWN. builtin-bundle uses the symbol for its
verification, but I think the logic it uses it is wrong. The
flag is still useful but it is local to the git-bundle, so it is
renamed to PREREQ_MARK.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This cleans up the boundary processing in the commit walker. It
- rips out the boundary logic from the commit walker. Placing
"negative" commits in the revs->commits list was Ok if all we
cared about "boundary" was the UNINTERESTING limiting case,
but conceptually it was wrong.
- makes get_revision_1() function to walk the commits and return
the results as if there is no funny postprocessing flags such
as --reverse, --skip nor --max-count.
- makes get_revision() function the postprocessing phase:
If reverse is given, wait for get_revision_1() to give
everything that it would normally give, and then reverse it
before consuming.
If skip is given, skip that many before going further.
If max is given, stop when we gave out that many.
Now that we are about to return one positive commit, mark
the parents of that commit to be potential boundaries
before returning, iff we are doing the boundary processing.
Return the commit.
- After get_revision() finishes giving out all the positive
commits, if we are doing the boundary processing, we look at
the parents that we marked as potential boundaries earlier,
see if they are really boundaries, and give them out.
It loses more code than it adds, even when the new gc_boundary()
function, which is purely for early optimization, is counted.
Note that this patch is purely for eyeballing and discussion
only. It breaks git-bundle's verify logic because the logic
does not use BOUNDARY_SHOW flag for its internal computation
anymore. After we correct it not to attempt to affect the
boundary processing by setting the BOUNDARY_SHOW flag, we can
remove BOUNDARY_SHOW from revision.h and use that bit assignment
for the new CHILD_SHOWN flag.
Signed-off-by: Junio C Hamano <junkio@cox.net>
So when we do
git show v1.4.4..v1.5.0
that's an illogical thing to do, since "git show" is defined to be a
non-revision-walking action, which means the range operator be pointless
and wrong. The fact that we happily accept it (and then _only_ show
v1.5.0, which is the positive end of the range) is quite arguably not very
logical.
We should complain, and say that you can only do "no_walk" with positive
refs. Negative object refs really don't make any sense unless you walk
the obejct list (or you're "git diff" and know about ranges explicitly).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
There were instances of strncmp() that were formatted improperly
(e.g. whitespace around parameter before closing parenthesis)
that caused the earlier mechanical conversion step to miss
them. This step cleans them up.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This mechanically converts strncmp() to use prefixcmp(), but only when
the parameters match specific patterns, so that they can be verified
easily. Leftover from this will be fixed in a separate step, including
idiotic conversions like
if (!strncmp("foo", arg, 3))
=>
if (!(-prefixcmp(arg, "foo")))
This was done by using this script in px.perl
#!/usr/bin/perl -i.bak -p
if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) {
s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|;
}
if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) {
s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|;
}
and running:
$ git grep -l strncmp -- '*.c' | xargs perl px.perl
Signed-off-by: Junio C Hamano <junkio@cox.net>
Now, when saying --max-age=<timestamp>, or --max-count=<n>, together
with --boundary, rev-list prints the boundary commits, i.e. the
commits which are _just_ not shown without --boundary, i.e. their
children are, but they aren't.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Similar to commit eb8381c8, we need to use for_each_reflog() to make
sure we do not miss objects reachable from HEAD reflog.
Signed-off-by: Junio C Hamano <junkio@cox.net>
A short-hand "-g" for "git log --walk-reflogs" and "git
show-branch --reflog" makes it easier to access the reflog
info.
[jc: added -g to show-branch for symmetry]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The option --reverse reverses the order of the commits.
[jc: with comments on rev_info.reverse from Simon 'corecode' Schubert.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When called with "--walk-reflogs", as long as there are reflogs
available, the walker will take this information into account, rather
than the parent information in the commit object.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It used to ignore the return value of the helper function; now, it
expects it to return 0, and stops iteration upon non-zero return
values; this value is then passed on as the return value of
for_each_reflog_ent().
Further, it makes no sense to force the parsing upon the helper
functions; for_each_reflog_ent() now calls the helper function with
old and new sha1, the email, the timestamp & timezone, and the message.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If the user has been using reflog for a long time (e.g. since its
introduction) then it is very likely that an existing branch's
reflog may still mention commits which have long since been pruned
out of the repository.
Rather than aborting with a very useless error message during
git-repack, pack as many valid commits as we can get from the
reflog and let the user know that the branch's reflog contains
already pruned commits. A future 'git reflog expire' (or whatever
it finally winds up being called) can then be performed to expunge
those reflog entries.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds a new option --reflog to pack-objects and revision
machinery; do not bother documenting it for now, since this is
only useful for local repacking.
When the option is passed, objects reachable from reflog entries
are marked as interesting while computing the set of objects to
pack.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds --skip=<n> option to revision traversal machinery.
Documentation and test were added by Robert Fitzsimons.
Signed-off-by: Robert Fitzsimons <robfitz@273k.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a mechanical clean-up of the way *.c files include
system header files.
(1) sources under compat/, platform sha-1 implementations, and
xdelta code are exempt from the following rules;
(2) the first #include must be "git-compat-util.h" or one of
our own header file that includes it first (e.g. config.h,
builtin.h, pkt-line.h);
(3) system headers that are included in "git-compat-util.h"
need not be included in individual C source files.
(4) "git-compat-util.h" does not have to include subsystem
specific header files (e.g. expat.h).
Signed-off-by: Junio C Hamano <junkio@cox.net>
This reverts commit 5761231975.
Feeding symmetric difference to gitk is so useful, and it is the
same for other graphical Porcelains. Rather than forcing them
to pass --no-left-right, making it optional.
Noticed and reported by Jeff King.
When using symmetric differences, I think the user almost always
would want to know which side of the symmetry each commit came
from. So this removes --left-right option from the command
line, and turns it on automatically when a symmetric difference
is used ("git log --merge" counts as a symmetric difference
between HEAD and MERGE_HEAD).
Just in case, a new option --no-left-right is provided to defeat
this, but I do not know if it would be useful.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The output from "symmetric diff", i.e. A...B, does not
distinguish between commits that are reachable from A and the
ones that are reachable from B. In this picture, such a
symmetric diff includes commits marked with a and b.
x---b---b branch B
/ \ /
/ .
/ / \
o---x---a---a branch A
However, you cannot tell which ones are 'a' and which ones are
'b' from the output. Sometimes this is frustrating. This adds
an output option, --left-right, to rev-list.
rev-list --left-right A...B
would show ones reachable from A prefixed with '<' and the ones
reachable from B prefixed with '>'.
When combined with --boundary, boundary commits (the ones marked
with 'x' in the above picture) are shown with prefix '-', so you
would see list that looks like this:
git rev-list --left-right --boundary --pretty=oneline A...B
>bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb 3rd on b
>bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb 2nd on b
<aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 3rd on a
<aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 2nd on a
-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 1st on b
-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 1st on a
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a shorthand for "<rev> --not <rev>^@", i.e. "include
this commit but exclude any of its parents".
When a new file $F is introduced by revision $R, this notation
can be used to find a copy-and-paste from existing file in the
parents of that revision without annotating the ancestry of the
lines that were copied from:
git pickaxe -f -C $R^! -- $F
Signed-off-by: Junio C Hamano <junkio@cox.net>
When getting the list of all unpacked objects by walking the commit history,
we would stop traversal whenever we hit a packed commit. However the fact
that we found a packed commit does not guarantee that all previous commits
are also packed. As a result the commit walkers did not show all reachable
unpacked objects.
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This lets you say:
git log --all-match --author=Linus --committer=Junio --grep=rev-list
to limit commits that was written by Linus, committed by me and
the log message contains word "rev-list".
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds a "int *flag" parameter to resolve_ref() and makes
for_each_ref() family to call callback function with an extra
"int flag" parameter. They are used to give two bits of
information (REF_ISSYMREF and REF_ISPACKED) about the ref.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a long overdue fix to the API for for_each_ref() family
of functions. It allows the callers to specify a callback data
pointer, so that the caller does not have to use static
variables to communicate with the callback funciton.
The updated for_each_ref() family takes a function of type
int (*fn)(const char *, const unsigned char *, void *)
and a void pointer as parameters, and calls the function with
the name of the ref and its SHA-1 with the caller-supplied void
pointer as parameters.
The commit updates two callers, builtin-name-rev.c and
builtin-pack-refs.c as an example.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Now we can tell the built-in grep to grep only in head or in
body, use that to update --author, --committer, and --grep.
Unfortunately, to make --and, --not and other grep boolean
expressions useful, as in:
# Things written by Junio committed and by Linus and log
# does not talk about diff.
git log --author=Junio --and --committer=Linus \
--grep-not --grep=diff
we will need to do another round of built-in grep core
enhancement, because grep boolean expressions are designed to
work on one line at a time.
Signed-off-by: Junio C Hamano <junkio@cox.net>
I know that I'd prefer a rule where
"--author=^Junio"
would result in the grep-pattern being "^author Junio", but without the
initial '^' it would be "^author .*Junio".
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds three options to setup_revisions(), which lets you
filter resulting commits by the author name, the committer name
and the log message with regexp.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is from a suggestion by Linus, just to mark the locations where we
need to modify to actually implement the filtering.
We do not have any actual filtering code yet.
Signed-off-by: Junio C Hamano <junkio@cox.net>