git-blame(1)
============

NAME
----
git-blame - Show what revision and author last modified each line of a file

SYNOPSIS
--------
[verse]
'git blame' [-c] [-b] [-l] [--root] [-t] [-f] [-n] [-s] [-e] [-p] [-w] [--incremental]
	    [-L <range>] [-S <revs-file>] [-M] [-C] [-C] [-C] [--since=<date>]
	    [--ignore-rev <rev>] [--ignore-revs-file <file>]
	    [--progress] [--abbrev=<n>] [<rev> | --contents <file> | --reverse <rev>..<rev>]
	    [--] <file>

DESCRIPTION
-----------

Annotates each line in the given file with information from the revision which
last modified the line. Optionally, start annotating from the given revision.

When specified one or more times, `-L` restricts annotation to the requested
lines.

The origin of lines is automatically followed across whole-file
renames (currently there is no option to turn the rename-following
off). To follow lines moved from one file to another, or to follow
lines that were copied and pasted from another file, etc., see the
`-C` and `-M` options.

The report does not tell you anything about lines which have been deleted or
replaced; you need to use a tool such as 'git diff' or the "pickaxe"
interface briefly mentioned in the following paragraph.

Apart from supporting file annotation, Git also supports searching the
development history for when a code snippet occurred in a change. This makes it
possible to track when a code snippet was added to a file, moved or copied
between files, and eventually deleted or replaced. It works by searching for
a text string in the diff. A small example of the pickaxe interface
that searches for `blame_usage`:

-----------------------------------------------------------------------------
$ git log --pretty=oneline -S'blame_usage'
5040f17eba15504bad66b14a645bddd9b015ebb7 blame -S <ancestry-file>
ea4c7f9bf69e781dd0cd88d2bccb2bf5cc15c9a7 git-blame: Make the output
-----------------------------------------------------------------------------

OPTIONS
-------
include::blame-options.txt[]

-c::
	Use the same output mode as linkgit:git-annotate[1] (Default: off).

--score-debug::
	Include debugging information related to the movement of
	lines between files (see `-C`) and lines moved within a
	file (see `-M`). The first number listed is the score.
	This is the number of alphanumeric characters detected
	as having been moved between or within files. This must be above
	a certain threshold for 'git blame' to consider those lines
	of code to have been moved.

-f::
--show-name::
	Show the filename in the original commit. By default
	the filename is shown if there is any line that came from a
	file with a different name, due to rename detection.

-n::
--show-number::
	Show the line number in the original commit (Default: off).

-s::
	Suppress the author name and timestamp from the output.

-e::
--show-email::
	Show the author email instead of author name (Default: off).
	This can also be controlled via the `blame.showEmail` config
	option.

-w::
	Ignore whitespace when comparing the parent's version and
	the child's to find where the lines came from.

--abbrev=<n>::
	Instead of using the default 7+1 hexadecimal digits as the
	abbreviated object name, use <n>+1 digits. Note that 1 column
	is used for a caret to mark the boundary commit.


THE PORCELAIN FORMAT
--------------------

In this format, each line is output after a header; the
header has, at a minimum, a first line containing:

- 40-byte SHA-1 of the commit the line is attributed to;
- the line number of the line in the original file;
- the line number of the line in the final file;
- on a line that starts a group of lines from a different
  commit than the previous one, the number of lines in this
  group. On subsequent lines this field is absent.

This header line is followed by the following information
at least once for each commit:

- the author name ("author"), email ("author-mail"), time
  ("author-time"), and time zone ("author-tz"); similarly
  for committer.
- the filename in the commit that the line is attributed to.
- the first line of the commit log message ("summary").

The contents of the actual line are output after the above
header, prefixed by a TAB. This is to allow adding more
header elements later.

The porcelain format generally suppresses commit information that has
already been seen. For example, two lines that are blamed on the same
commit will both be shown, but the details for that commit will be shown
only once. This is more efficient, but may require the reader to keep
more state. The `--line-porcelain` option can be used to output full
commit information for each line, allowing simpler (but less efficient)
usage like:

	# count the number of lines attributed to each author
	git blame --line-porcelain file |
	sed -n 's/^author //p' |
	sort | uniq -c | sort -rn
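
As a rough sketch of the extra state a reader of plain `--porcelain`
output has to keep, the following shell function remembers the author of
each commit the first time its details appear, and then counts content
lines per author. The function name `count_authors_porcelain` is made up
for this illustration, and the awk program assumes object names of at
least 40 hexadecimal digits; it is not meant as a reference parser.

```shell
# Hypothetical helper: read `git blame --porcelain` output on stdin and
# print "<count> <author>" lines, most frequent author first.
count_authors_porcelain() {
	awk '
	# Content line (prefixed by TAB): one more line for the current commit.
	/^\t/ { count[sha]++; next }
	# Header line: <sha> <orig-line> <final-line> [<group-size>]
	length($1) >= 40 && $1 ~ /^[0-9a-f]+$/ { sha = $1; next }
	# Commit details are printed only the first time a commit is seen.
	/^author / { author[sha] = substr($0, 8); next }
	END {
		for (s in count) total[author[s]] += count[s]
		for (a in total) print total[a], a
	}
	' | sort -rn
}
```

It could be used as `git blame --porcelain file | count_authors_porcelain`,
and should report the same per-author totals as the `--line-porcelain`
pipeline above.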


SPECIFYING RANGES
-----------------

Unlike 'git blame' and 'git annotate' in older versions of git, the extent
of the annotation can be limited to both line ranges and revision
ranges. The `-L` option, which limits annotation to a range of lines, may be
specified multiple times.

When you are interested in finding the origin for
lines 40-60 for file `foo`, you can use the `-L` option like so
(they mean the same thing -- both ask for 21 lines starting at
line 40):

	git blame -L 40,60 foo
	git blame -L 40,+21 foo

You can also use a regular expression to specify the line range:

	git blame -L '/^sub hello {/,/^}$/' foo

which limits the annotation to the body of the `hello` subroutine.
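
Since `-L` may be given more than once, several ranges of the same file
can be annotated in one invocation. The following sketch sets up a
throwaway repository to show this; the function name, the file contents,
and the identity passed with `-c` are all invented for the example.

```shell
# Hypothetical demo: blame two separate line ranges of a throwaway file
# and print how many lines were annotated.
blame_two_ranges_demo() (
	tmp=$(mktemp -d) || exit 1
	trap 'rm -rf "$tmp"' EXIT
	cd "$tmp" || exit 1
	git init -q . || exit 1
	seq 1 100 >foo
	git add foo || exit 1
	git -c user.name='A U Thor' -c user.email=author@example.com \
		commit -q -m 'add foo' || exit 1
	# Two -L options: lines 40-60 (21 lines) and lines 90-92 (3 lines).
	git blame -L 40,60 -L 90,92 foo | awk 'END { print NR }'
)
```

Calling `blame_two_ranges_demo` should print `24`, one output line for
each annotated line in the two ranges.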

When you are not interested in changes older than version
v2.6.18, or changes older than 3 weeks, you can use revision
range specifiers similar to 'git rev-list':

	git blame v2.6.18.. -- foo
	git blame --since=3.weeks -- foo

When revision range specifiers are used to limit the annotation,
lines that have not changed since the range boundary (either the
commit v2.6.18 or the most recent commit that is more than 3
weeks old in the above example) are blamed on that range
boundary commit.

A particularly useful application is checking whether an added file has
lines created by copy-and-paste from existing files. Sometimes this
indicates that the developer was being sloppy and did not
refactor the code properly. You can first find the commit that
introduced the file with:

	git log --diff-filter=A --pretty=short -- foo

and then annotate the change between the commit and its
parents, using `commit^!` notation:

	git blame -C -C -f $commit^! -- foo


INCREMENTAL OUTPUT
------------------

When called with the `--incremental` option, the command outputs the
result as it is built. The output generally will talk about
lines touched by more recent commits first (i.e. the lines will
be annotated out of order) and is meant to be used by
interactive viewers.

The output format is similar to the Porcelain format, but it
does not contain the actual lines from the file that is being
annotated.

. Each blame entry always starts with a line of:

	<40-byte hex sha1> <sourceline> <resultline> <num_lines>
+
Line numbers count from 1.

. The first time that a commit shows up in the stream, it has various
  other information about it printed out with a one-word tag at the
  beginning of each line describing the extra commit information (author,
  email, committer, dates, summary, etc.).

. Unlike the Porcelain format, the filename information is always
  given and terminates the entry:

	"filename" <whitespace-quoted-filename-goes-here>
+
and thus it is easy to parse with a line- and word-oriented
parser (which should be quite natural for most scripting languages).
+
[NOTE]
For people who do parsing: to make it more robust, just ignore any
lines between the first and last one ("<sha1>" and "filename" lines)
where you do not recognize the tag words (or care about that particular
one) at the beginning of the "extended information" lines. That way, if
there is ever added information (like the commit encoding or extended
commit commentary), a blame viewer will not care.
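
The parsing advice above can be sketched as a small shell function. It
treats the `<sha1>` line as the start of an entry and the "filename"
line as its end, and skips every tag line in between whether or not it
is recognized. The name `parse_blame_incremental` and the output layout
are assumptions made for this illustration, and the awk program assumes
object names of at least 40 hexadecimal digits.

```shell
# Hypothetical helper: read `git blame --incremental` output on stdin and
# print one "<sha> <sourceline> <resultline> <num_lines> <filename>" line
# per blame entry.
parse_blame_incremental() {
	awk '
	# Entry start: <sha> <sourceline> <resultline> <num_lines>
	!in_entry && NF == 4 && length($1) >= 40 && $1 ~ /^[0-9a-f]+$/ {
		sha = $1; src = $2; res = $3; num = $4; in_entry = 1; next
	}
	# The "filename" line always comes last and terminates the entry.
	in_entry && $1 == "filename" {
		print sha, src, res, num, substr($0, 10)
		in_entry = 0; next
	}
	# Every other tag line, known or not, is deliberately skipped.
	'
}
```

A viewer could feed it with
`git blame --incremental file | parse_blame_incremental` and look up the
per-commit details separately if it needs them.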


MAPPING AUTHORS
---------------

include::mailmap.txt[]


SEE ALSO
--------
linkgit:git-annotate[1]

GIT
---
Part of the linkgit:git[1] suite