When running git describe --dirty the index should be refreshed. Previously
the cached index would cause describe to think that the index was dirty when,
in reality, it was just stale.
The issue was exposed by python setuptools which hardlinks files into another
directory when building a distribution.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running git describe --dirty the index should be refreshed. Previously
the cached index would cause describe to think that the index was dirty when,
in reality, it was just stale.
The issue was exposed by python setuptools which hardlinks files into another
directory when building a distribution.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default of 7 comes from fairly early in git development, when
seven hex digits was a lot (it covers about 250+ million hash
values). Back then I thought that 65k revisions was a lot (it was what
we were about to hit in BK), and each revision tends to be about 5-10
new objects or so, so a million objects was a big number.
These days, the kernel isn't even the largest git project, and even
the kernel has about 220k revisions (_much_ bigger than the BK tree
ever was) and we are approaching two million objects. At that point,
seven hex digits is still unique for a lot of them, but when we're
talking about just two orders of magnitude difference between number
of objects and the hash size, there _will_ be collisions in truncated
hash values. It's no longer even close to unrealistic - it happens all
the time.
We should both increase the default abbrev that was unrealistically
small, _and_ add a way for people to set their own default per-project
in the git config file.
This is the first step to first make it configurable; the default of 7
is not raised yet.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For the find_exact_renames() function, this allows us to pass the
diff_options structure pointer to the low-level routines. We will use
that to distinguish between the "rename" and "copy" cases.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that struct commit.util is not used until after we've checked that
the argument doesn't exactly match a tag, we can wait until then to
look up the commits for each tag.
This avoids a lot of I/O on --exact-match queries in repositories with
many tags. For example, 'git describe --exact-match HEAD' becomes
about 12 times faster on a cold cache (3.2s instead of 39s) in a
linux-2.6 repository with 2000 packed tags. That is a huge win for the
interactivity of the __git_ps1 shell prompt helper when on a detached
HEAD.
Signed-off-by: Anders Kaseorg <andersk@ksplice.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
describe is currently forced to look up the commit at each tag in
order to store the struct commit_name pointers in struct commit.util.
For --exact-match queries, those lookups are wasteful. In preparation
for removing them, put the commit_names into a hash table, indexed by
commit SHA1, that can be used to quickly check for exact matches.
Signed-off-by: Anders Kaseorg <andersk@ksplice.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now add_to_known_names overwrites commit_names in place when multiple
tags point to the same commit. This will make it easier to store
commit_names in a hash table.
Signed-off-by: Anders Kaseorg <andersk@ksplice.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Don't waste time checking for dangling refs; they wouldn't affect the
output of 'git describe' anyway. Although this does not gain much
performance by itself, it does in conjunction with the next commits.
Signed-off-by: Anders Kaseorg <andersk@ksplice.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add commit_list prefix to insert_by_date function and to sort_by_date,
so it's clear that these functions refer to commit_list structure.
Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This shrinks the top-level directory a bit, and makes it much more
pleasant to use auto-completion on the thing. Instead of
[torvalds@nehalem git]$ em buil<tab>
Display all 180 possibilities? (y or n)
[torvalds@nehalem git]$ em builtin-sh
builtin-shortlog.c builtin-show-branch.c builtin-show-ref.c
builtin-shortlog.o builtin-show-branch.o builtin-show-ref.o
[torvalds@nehalem git]$ em builtin-shor<tab>
builtin-shortlog.c builtin-shortlog.o
[torvalds@nehalem git]$ em builtin-shortlog.c
you get
[torvalds@nehalem git]$ em buil<tab> [type]
builtin/ builtin.h
[torvalds@nehalem git]$ em builtin [auto-completes to]
[torvalds@nehalem git]$ em builtin/sh<tab> [type]
shortlog.c shortlog.o show-branch.c show-branch.o show-ref.c show-ref.o
[torvalds@nehalem git]$ em builtin/sho [auto-completes to]
[torvalds@nehalem git]$ em builtin/shor<tab> [type]
shortlog.c shortlog.o
[torvalds@nehalem git]$ em builtin/shortlog.c
which doesn't seem all that different, but not having that annoying
break in "Display all 180 possibilities?" is quite a relief.
NOTE! If you do this in a clean tree (no object files etc), or using an
editor that has auto-completion rules that ignores '*.o' files, you
won't see that annoying 'Display all 180 possibilities?' message - it
will just show the choices instead. I think bash has some cut-off
around 100 choices or something.
So the reason I see this is that I'm using an odd editory, and thus
don't have the rules to cut down on auto-completion. But you can
simulate that by using 'ls' instead, or something similar.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
4d23660 (describe: when failing, tell the user about options that
work, 2009-10-28) forgot to update the shortcut path where the code
detected and used a possible exact match. This means that an
unannotated tag on HEAD would be used by 'git describe'.
Guard this code path against the new circumstances, where unannotated
tags can be present in ->util even if we're not actually planning to
use them.
While there, also add some tests for --all.
Reported by 'yashi' on IRC.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users seem to call git-describe without reading the manpage, and then
wonder why it doesn't work with unannotated tags by default.
Make a minimal effort towards seeing if there would have been
unannotated tags, and tell the user. Specifically, we say that --tags
could work if we found any unannotated tags. If not, we say that
--always would have given results.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the --dirty option, git describe works on HEAD but append s"-dirty"
iff the contents of the work tree differs from HEAD. E.g.
$ git describe --dirty
v1.6.5-15-gc274db7
$ echo >> Makefile
$ git describe --dirty
v1.6.5-15-gc274db7-dirty
The --dirty option can also be used to specify what is appended, instead
of the default string "-dirty".
$ git describe --dirty=.mod
v1.6.5-15-gc274db7.mod
Many build scripts use `git describe` to produce a version number based on
the description of HEAD (on which the work tree is based) + saying that if
the build contains uncommitted changes. This patch helps the writing of
such scripts since `git describe --dirty` does directly the intended thing.
Three possiblities were considered while discussing this new feature:
1. Describe the work tree by default and describe HEAD only if "HEAD" is
explicitly specified
Pro: does the right thing by default (both for users and for scripts)
Pro: other git commands that works on the work tree by default
Con: breaks existing scripts used by the Linux kernel and other projects
2. Use --worktree instead of --dirty
Pro: does what it says: "git describe --worktree" describes the work tree
Con: other commands do not require a --worktree option when working
on the work tree (it often is the default mode for them)
Con: unusable with an optional value: "git describe --worktree=.mod"
is quite unintuitive.
3. Use --dirty as in this patch
Pro: makes sense to specify an optional value (what the dirty mark is)
Pro: does not have any of the big cons of previous alternatives
* does not break scripts
* is not inconsistent with other git commands
This patch takes the third approach.
Signed-off-by: Jean Privat <jean@pryen.org>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes a regression introduce by d68dc34 (git-describe: Die early if
there are no possible descriptions, 2009-08-06).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of the static variable that was used to prevent loading all
the refnames multiple times by moving that code out of describe(),
simply making sure it is only run once that way.
Also change the error message that is shown in case no refnames are
found to not include a hash any more, as the error condition is not
specific to any particular revision.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we find no refs that may be used for git-describe with the current
options, then die early instead of pointlessly walking the whole
history.
In git.git with all the tags dropped, this makes "git describe" go down
from 0.244 to 0.003 seconds for me. This is especially noticeable with
"git submodule status" which calls describe with increasing levels of
allowed refs to be matched. For a submodule without tags, this means
that it walks the whole history in the submodule twice (first annotated,
then plain tags), just to find out that it can't describe the commit
anyway.
Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To give OPT_FILENAME the prefix, we pass the prefix to parse_options()
which passes the prefix to parse_options_start() which sets the prefix
member of parse_opts_ctx accordingly. If there isn't a prefix in the
calling context, passing NULL will suffice.
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Essentially; s/type* /type */ as per the coding guidelines.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 212945d4 ("Teach git-describe to verify annotated tag names
before output") git-describe learned how to output a warning if
an annotated tag object was matched but its internal name doesn't
match the local ref name.
However, "git describe --all" causes the local ref name to be
prefixed with "tags/", so we need to skip over this prefix before
comparing the local ref name with the name recorded inside of the
tag object.
Patch-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the caller supplies --tags they want the lightweight, unannotated
tags to be searched for a match. If a lightweight tag is closer
in the history, it should be matched, even if an annotated tag is
reachable further back in the commit chain.
The same applies with --all when matching any other type of ref.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Acked-By: Uwe Kleine-König <ukleinek@strlen.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you misuse a git command, you are shown the usage string.
But this is currently shown in the dashed form. So if you just
copy what you see, it will not work, when the dashed form
is no longer supported.
This patch makes git commands show the dash-less version.
For shell scripts that do not specify OPTIONS_SPEC, git-sh-setup.sh
generates a dash-less usage string now.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we match a lightweight (non-annotated tag) as the name to
output and --long was requested we do not have a tag, nor do
we have a tagged object to display. Instead we must use the
object we were passed as input for the long format display.
Reported-by: Mark Burton <markb@ordern.com>
Backtraced-by: Mikael Magnusson <mikachu@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The <pattern> given "git describe --match" was used only to filter tag
objects, and not to filter lightweight tags. This fixes it.
[jc: made the log to clarify this is a bugfix, not an enhancement, with
additional test]
Signed-off-by: Michael Dressel <MichaelTiloDressel@t-online.de>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is implausible for lookup_tag() to return NULL in this particular
codepath but we should protect ourselves against a broken repository
better.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When your refs are packed, "git-describe" can find the tag that is the
best match without ever parsing the tag itself. But lookup_tag() in
display_name() says "I've never seen it", creates an empty shell, and
returns it. We need to make sure that we actually have parsed the tag
data into it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some callers may find it useful if "git describe" always gave back a
string that can be used as a shorter name for a commit object, rather than
checking its exit status (while squelching its error message, which could
potentially talk about more grave errors that should not be squelched) and
implementing a fallback themselves.
This teaches describe/name-rev a new option, --always, to use an
abbreviated object name when no tags or refs to use is found.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If an annotated tag describes a commit we want to favor the name
listed in the body of the tag, rather than whatever name it has
been stored under locally. By doing so it is easier to converse
about tags with others, even if the tags happen to be fetched to
a different name than it was given by its creator.
To avoid confusion when a tag is stored under a different name
(and thus is not readable via git-rev-parse --verify, etc.) we show
a warning message if the name of the tag does not match the ref
we found it under and if that tag was also selected for output.
For example:
$ git tag -a -m "i am a test" testtag
$ mv .git/refs/tags/testtag .git/refs/tags/bobbytag
$ ./git-describe HEAD
warning: tag 'testtag' is really 'bobbytag' here
testtag
$ git tag -d testtag
error: tag 'testtag' not found.
$ git tag -d bobbytag
Deleted tag 'bobbytag'
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is useful when you want to see parts of the commit object name
in "describe" output, even when the commit in question happens to be
a tagged version. Instead of just emitting the tag name, it will
describe such a commit as v1.2-0-deadbeef (0th commit since tag v1.2
that points at object deadbeef....).
Signed-off-by: Santi Béjar <sbejar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes scripts want (or need) the annotated tag name that exactly
matches a specific commit, or no tag at all. In such cases it can be
difficult to determine if the output of `git describe $commit` is a
real tag name or a tag+abbreviated commit. A common idiom is to run
git-describe twice:
if test $(git describe $commit) = $(git describe --abbrev=0 $commit)
...
but this is a huge waste of time if the caller is just going to pick a
different method to describe $commit or abort because it is not exactly
an annotated tag.
Setting the maximum number of candidates to 0 allows the caller to ask
for only a tag that directly points at the supplied commit, or to have
git-describe abort if no such item exists.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we aren't going to use a ref there is no reason for us to open
its object from the object database. This avoids opening any of
the head commits reachable from refs/heads/ unless they are also
reachable through the commit we have been asked to describe and
we need to walk through it to find a tag.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By using the peeled ref information inside of the packed-refs file we
can avoid opening tag objects to obtain the commits they reference.
This speeds up git-describe when there are a large number of tags
in the repository as we have less objects to parse before we can
start commit matching.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rework get_rev_name to return NULL rather than "undefined" when a
reference is undefined. If --undefined is passed (default), git-name-rev
prints "undefined" for the name, else it die()s.
Make git-describe use --no-undefined when calling git-name-rev so
that --contains behavior matches the standard git-describe one.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Often users want to know not which tagged version a commit came
after, but which tagged version a commit is contained within.
This latter task is the job of git-name-rev, but most users are
looking to git-describe to do the job.
Junio suggested we make `git describe --contains` run the correct
tool, `git name-rev`, and that's exactly what we do here. The output
of name-rev was adjusted slightly through the new --name-only option,
allowing describe to execv into name-rev and maintain its current
output format.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This mechanically converts strncmp() to use prefixcmp(), but only when
the parameters match specific patterns, so that they can be verified
easily. Leftover from this will be fixed in a separate step, including
idiotic conversions like
if (!strncmp("foo", arg, 3))
=>
if (!(-prefixcmp(arg, "foo")))
This was done by using this script in px.perl
#!/usr/bin/perl -i.bak -p
if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) {
s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|;
}
if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) {
s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|;
}
and running:
$ git grep -l strncmp -- '*.c' | xargs perl px.perl
Signed-off-by: Junio C Hamano <junkio@cox.net>
My prior change to git-describe attempts to print the distance
between the input commit and the best matching tag, but this distance
was usually only an estimate as we always aborted revision walking
as soon as we overflowed the configured limit on the number of
possible tags (as set by --candidates).
Displaying an estimated distance is not very useful and can just be
downright confusing. Most users (heck, most Git developers) don't
immediately understand why this distance differs from the output
of common tools such as `git rev-list | wc -l`. Even worse, the
estimated distance could change in the future (including decreasing
despite no rebase occuring) if we find more possible tags earlier
on during traversal. (This could happen if more tags are merged
into the branch between queries.) These factors basically make an
estimated distance useless.
Fortunately we are usually most of the way through an accurate
distance computation by the time we abort (due to reaching the
current --candidates limit). This means we can simply finish
counting out the revisions still in our commit queue to present
the accurate distance at the end. The number of commits remaining
in the commit queue is probably less than the number of commits
already traversed, so finishing out the count is not likely to take
very long. This final distance will then always match the output of
`git rev-list | wc -l`.
We can easily reduce the total number of commits that need to be
walked at the end by stopping as soon as all of the commits in the
commit queue are flagged as being merged into the already selected
best possible tag. If that's true then there are no remaining
unseen commits which can contribute to our best possible tag's
depth counter, so further traversal is useless.
Basic testing on my Mac OS X system shows there is no noticable
performance difference between this accurate distance counting
version of git-describe and the prior version of git-describe,
at least when run on git.git.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If you get two different describes at different
times from a non-rewinding branch and they both come up with the same
tag name, you can tell which is the 'newer' one by distance. This is
rather common in practice, so its incredibly useful.
[jc: still needs documentation and fixups when traversal gives up
early.]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When on a non-tag commit, git-describe normally outputs descriptions of
the form
v1.0.0-g1234567890
Some scripts (for example the update hook script) might just want to
know the name of the nearest tag, so they then have to do
x=$(git-describe HEAD | sed 's/-g*//')
This is costly, but more importantly is fragile as it is relying on the
output format of git-describe, which we would then have to maintain
forever.
This patch adds support for setting the --abbrev option to zero. In
that case git-describe does as it always has, but outputs only the
nearest found tag instead of a completely unique name. This means that
scripts would not have to parse the output format and won't need
changing if the git-describe suffix is ever changed.
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio added the found variable to enforce commit date order when two
tags have the same distance from the requested commit. Except it is
unnecessary as match_cnt is already used to record how many possible
tags have been identified thus far.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Currently we don't use the util field of struct commit but we want
fast access to the highest priority name that references any given
commit object during our matching loop. A really simple approach
is to just store the name directly in the util field.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We really want to always favor an annotated tag over a lightweight
tag when describing a commit. Unfortunately git-describe wasn't
doing this as it was favoring the depth attribute of a possible_tag
over the priority. Now priority is the highest sort and we only
consider a lightweight tag if no annotated tags were identified.
Rather than searching for the minimum tag using a simple loop we
now sort them using a stable sort algorithm, this way the possible
tags display in order if --debug gets used. The stable sort helps
to preseve the inherit topology/date order that we obtain during
our search loop.
This fix allows the tests in t6120-describe.sh to pass.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
My prior version of git-describe ran very slowly on even reasonably
sized projects like git.git and linux.git as it tended to identify
a large number of possible tags and then needed to generate the
revision list for each of those tags to sort them and select the
best tag to describe the input commit.
All we really need is the number of commits in the input revision
which are not in the tag. We can generate these counts during
the revision walking and tag matching loop by assigning a color to
each tag and coloring the commits as we walk them. This limits us
to identifying no more than 26 possible tags, as there is limited
space available within the flags field of struct commit.
The limitation of 26 possible tags is hopefully not going to be a
problem in real usage, as most projects won't create 26 maintenance
releases and merge them back into a development trunk after the
development trunk was tagged with a release candidate tag. If that
does occur git-describe will start to revert to its old behavior of
using the newer maintenance release tag to describe the development
trunk, rather than the development trunk's own tag. The suggested
workaround would be to retag the development trunk's tip.
However since even 26 possible tags can take a while to generate a
description for on some projects I'm defaulting the limit to 10 but
offering the user --candidates to increase the number of possible
matches if they need a more accurate result. I specifically chose
10 for the default as it seems unlikely projects will have more
than 10 maintenance releases merged into a development trunk before
retagging the development trunk, and it seems to perform about the
same on linux.git as v1.4.4.4 git-describe.
A large amount of debugging information was also added during
the development of this change, so I've left it in to be toggled
on with --debug. It may be useful to the end user to help them
understand why git-describe took one particular tag over another.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If a project has a really huge number of tags (such as several
thousand tags) then we are likely to have nearly a hundred tags in
some buckets. Scanning those buckets as linked lists could take
a large amount of time if done repeatedly during history traversal.
Since we are searching for a unique commit SHA1 we can sort all
tags by commit SHA1 and perform a binary search within the bucket.
Once we identify a particular tag as matching this commit we walk
backwards within the bucket matches to make sure we pick up the
highest priority tag for that commit, as the binary search may
have landed us in the middle of a set of tags which point at the
same commit.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If a project has a very large number of tags then git-describe
will spend a good part of its time looping over the tags testing
them one at a time to determine if it matches a given commit.
For 10 tags this is not a big deal, but for hundreds of tags the
time could become considerable if we don't find an exact match for
the input commit and we need to walk back along the history chain.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Several people have suggested that its always better to describe
a commit using an annotated tag, and to only use a lightweight tag
if absolutely no annotated tag matches the input commit.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>