When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.
This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).
The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).
The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds an alternative implementation of show_dirstat(), called
show_dirstat_by_line(), which uses the more expensive diffstat analysis
(as opposed to show_dirstat()'s own (relatively inexpensive) analysis)
to derive the numbers from which the --dirstat output is computed.
The alternative implementation is controlled by the new "lines" parameter
to the --dirstat option (or the diff.dirstat config variable).
For binary files, the diffstat analysis counts bytes instead of lines,
so to prevent binary files from dominating the dirstat results, the
byte counts for binary files are divided by 64 before being compared to
their textual/line-based counterparts. This is a stupid and ugly - but
very cheap - heuristic.
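As a rough illustration of the heuristic, the per-file weight could be modeled like this. The function names and the input tuple layout are hypothetical, and rounding the division up (so that a tiny binary change does not vanish) is an assumption of this sketch rather than something stated above.

```python
from collections import defaultdict


def dirstat_damage(added, deleted, is_binary):
    """Weight of one file, given its diffstat counts."""
    damage = added + deleted
    if is_binary:
        # diffstat counts bytes for binary files; treat 64 bytes as one "line".
        # Rounding up is an assumption of this sketch.
        damage = (damage + 63) // 64
    return damage


def damage_by_directory(files):
    """Sum the weights per containing directory ('' for the top level)."""
    totals = defaultdict(int)
    for path, added, deleted, is_binary in files:
        directory = path.rsplit("/", 1)[0] + "/" if "/" in path else ""
        totals[directory] += dirstat_damage(added, deleted, is_binary)
    return dict(totals)
```

For example, a binary file with 6400 changed bytes weighs as much as a text file with 100 changed lines.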
In linux-2.6.git, running the three different --dirstat modes:
time git diff v2.6.20..v2.6.30 --dirstat=changes > /dev/null
vs.
time git diff v2.6.20..v2.6.30 --dirstat=lines > /dev/null
vs.
time git diff v2.6.20..v2.6.30 --dirstat=files > /dev/null
yields the following average runtimes on my machine:
- "changes" (default): ~6.0 s
- "lines": ~9.6 s
- "files": ~0.1 s
So, as expected, there's a considerable performance hit (~60%) by going
through the full diffstat analysis as compared to the default "changes"
analysis (obviously, "files" is much faster than both). As such, the
"lines" mode is probably only useful if you really need the --dirstat
numbers to be consistent with the numbers returned from the other
--*stat options.
The patch also includes documentation and tests for the new dirstat mode.
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only the first digit after the decimal point is kept, as the dirstat
calculations all happen in permille.
Selftests verifying floating-point percentage input have been added.
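A minimal sketch of the truncation to permille, written in Python for illustration (the function name is hypothetical and the real parser is C); digits after the first decimal are simply dropped, not rounded:

```python
def percent_to_permille(text):
    """Convert a percentage string such as '12.34' to an integer permille."""
    whole, _, frac = text.partition(".")
    if not whole.isdecimal() or (frac and not frac.isdecimal()):
        raise ValueError(f"not a percentage: {text!r}")
    permille = int(whole) * 10
    if frac:
        # Only the first digit after the decimal point is kept.
        permille += int(frac[0])
    return permille
```

So '12.34' becomes 123 permille, and '0.59' becomes 5 permille.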
Improved-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new diff.dirstat config variable takes the same arguments as
'--dirstat=<args>', and specifies the default arguments for --dirstat.
The config is obviously overridden by --dirstat arguments passed on the
command line.
When not specified, the --dirstat defaults are 'changes,noncumulative,3'.
The patch also adds several tests verifying the interaction between the
diff.dirstat config variable, and the --dirstat command line option.
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of having multiple interconnected dirstat-related options, teach
the --dirstat option itself to accept all behavior modifiers as parameters.
- Preserve the current --dirstat=<limit> (where <limit> is an integer
specifying a cut-off percentage)
- Add --dirstat=cumulative, replacing --cumulative
- Add --dirstat=files, replacing --dirstat-by-file
- Also add --dirstat=changes and --dirstat=noncumulative for specifying the
current default behavior. These allow the user to reset other --dirstat
parameters (e.g. 'cumulative' and 'files') occurring earlier on the
command line.
The deprecated options (--cumulative and --dirstat-by-file) are still
functional, although they have been removed from the documentation.
Allow multiple parameters to be separated by commas, e.g.:
--dirstat=files,10,cumulative
Update the documentation accordingly, and add testcases verifying the
behavior of the new syntax.
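The "later parameters reset earlier ones" behavior can be modeled as a left-to-right update of one option record. The following Python sketch is illustrative only: the names are hypothetical, the default limit of 3 is the one quoted elsewhere in this series, and forms not listed above are left out.

```python
DEFAULTS = {"mode": "changes", "cumulative": False, "percent": 3}


def apply_dirstat_params(options, params):
    """Apply comma-separated parameters in order; the last one of a kind wins."""
    for p in params.split(","):
        if p in ("changes", "files"):
            options["mode"] = p
        elif p in ("cumulative", "noncumulative"):
            options["cumulative"] = (p == "cumulative")
        elif p.isdecimal():
            options["percent"] = int(p)
        else:
            raise ValueError(f"unknown dirstat parameter: {p!r}")


def parse_args(argv):
    """Fold all dirstat-related arguments into one option record."""
    options = dict(DEFAULTS)
    for arg in argv:
        if arg.startswith("--dirstat="):
            apply_dirstat_params(options, arg[len("--dirstat="):])
        elif arg == "--cumulative":        # deprecated spelling
            options["cumulative"] = True
        elif arg == "--dirstat-by-file":   # deprecated spelling
            options["mode"] = "files"
    return options
```

Here '--dirstat=files,cumulative --dirstat=changes,noncumulative' ends up back at the defaults, which is what the 'changes' and 'noncumulative' parameters are for.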
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The expected output from --dirstat=0 is to include any directory with
changes, even if those changes contribute a minuscule portion of the total
changes. However, currently, directories that contribute less than 0.1% are
not included, since their 'permille' value is 0, and there is an
'if (permille)' check in gather_dirstat() that causes them to be ignored.
This test is obviously intended to exclude directories that contribute no
changes whatsoever, but in this case, it hits too broadly. The correct
check is against 'this_dir' from which the permille is calculated. Only if
this value is 0 does the directory truly contribute no changes, and should
be skipped from the output.
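The difference between the two checks shows up with integer arithmetic. In this Python sketch the two function names and the 'limit' parameter are hypothetical; 'this_dir' and 'permille' follow the description above:

```python
def reported_before(this_dir, total, limit=0):
    """Old check: skip the directory when its permille rounds down to 0."""
    permille = this_dir * 1000 // total
    return bool(permille) and permille >= limit


def reported_after(this_dir, total, limit=0):
    """New check: skip the directory only when it has no changes at all."""
    permille = this_dir * 1000 // total
    return this_dir > 0 and permille >= limit
```

A directory with 1 change out of 5000 has a permille of 0, so the old check drops it even with a limit of 0, while the new check keeps it.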
This patch fixes this issue, and updates the corresponding testcases to
expect the new behavior.
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, t4013 is the only selftest that exercises the --dirstat machinery,
but it only does a superficial verification of --dirstat's output.
This patch adds a new selftest - t4047-diff-dirstat.sh - which prepares a
commit containing:
- unchanged files, changed files and files with rearranged lines
- copied files, moved files, and unmoved files
It then verifies the correct dirstat output for that commit in the following
dirstat modes:
- --dirstat
- -X
- --dirstat=0
- -X0
- --cumulative
- --dirstat-by-file
- (plus combinations of the above)
Each of the above tests is also run with:
- no rename detection
- rename detection (-M)
- expensive copy detection (-C -C)
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The names and e-mails are sanitized by fmt_ident() when creating commits,
so that they contain neither "<" nor ">", and the "committer" and "author"
lines in the commit object will always be in the form:
("author" | "committer") name SP "<" email ">" SP timestamp SP zone
When parsing the email part out, the current code looks for SP starting
from the end of the email part, but the author could obfuscate the address
as "author at example dot com".
We should instead look for SP followed by "<", to match the logic of the
side that formats these lines.
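A sketch of parsing along these lines, in Python for illustration (the function name is hypothetical and this is not the patched C code). Because the name cannot contain "<", the first SP followed by "<" marks the end of the name, whatever the email part looks like:

```python
def split_ident_line(line):
    """Split an 'author'/'committer' line into its five fields."""
    kind, _, rest = line.partition(" ")
    name_end = rest.find(" <")          # SP followed by "<" ends the name
    if name_end < 0:
        raise ValueError("no ' <' in ident line")
    email_end = rest.find(">", name_end + 2)
    if email_end < 0:
        raise ValueError("no '>' in ident line")
    name = rest[:name_end]
    email = rest[name_end + 2:email_end]
    timestamp, zone = rest[email_end + 1:].split()
    return kind, name, email, int(timestamp), zone
```

An obfuscated address such as "author at example dot com" is then returned whole instead of being cut at one of its spaces.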
Signed-off-by: Josh Stone <jistone@redhat.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply parameter expansion. Also use here document to save
test results instead of appending each line with ">>".
Signed-off-by: Mathias Lafeldt <misfire@debugon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 9d8a5a5 (diffcore-rename: refactor "too many candidates" logic,
2011-01-06), diffcore_rename() initializes num_src but does not use it
anymore. "-Wunused-but-set-variable" in gcc-4.6 complains about this.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/rename-degrade-cc-to-c:
diffcore-rename: fall back to -C when -C -C busts the rename limit
diffcore-rename: record filepair for rename src
diffcore-rename: refactor "too many candidates" logic
builtin/diff.c: remove duplicated call to diff_result_code()
* mz/rebase: (34 commits)
rebase: define options in OPTIONS_SPEC
Makefile: do not install sourced rebase scripts
rebase: use @{upstream} if no upstream specified
rebase -i: remove unnecessary state rebase-root
rebase -i: don't read unused variable preserve_merges
git-rebase--am: remove unnecessary --3way option
rebase -m: don't print exit code 2 when merge fails
rebase -m: remember allow_rerere_autoupdate option
rebase: remember strategy and strategy options
rebase: remember verbose option
rebase: extract code for writing basic state
rebase: factor out sub command handling
rebase: make -v a tiny bit more verbose
rebase -i: align variable names
rebase: show consistent conflict resolution hint
rebase: extract am code to new source file
rebase: extract merge code to new source file
rebase: remove $branch as synonym for $orig_head
rebase -i: support --stat
rebase: factor out call to pre-rebase hook
...
* en/merge-recursive:
merge-recursive: tweak magic band-aid
merge-recursive: When we detect we can skip an update, actually skip it
t6022: New test checking for unnecessary updates of files in D/F conflicts
t6022: New test checking for unnecessary updates of renamed+modified files
* jh/dirstat:
--dirstat: In case of renames, use target filename instead of source filename
Teach --dirstat not to completely ignore rearranged lines within a file
--dirstat-by-file: Make it faster and more correct
--dirstat: Describe non-obvious differences relative to --stat or regular diff
'git rebase' uses 'git merge' to preserve merges (-p). This preserves
the original merge commit correctly, except when the original merge
commit was created by 'git merge --no-ff'. In this case, 'git rebase'
will fail to preserve the merge, because during 'git rebase', 'git
merge' will simply fast-forward and skip the commit. For example:
           B
          / \
         A---M
        /
---o---O---P---Q
If we try to rebase M onto P, we lose the merge commit and this happens:
             A---B
            /
---o---O---P---Q
To correct this, we simply do a "no fast-forward" on all merge commits
when rebasing. By the time we decide to run 'git merge' inside
'git rebase', we know there was a merge originally, so 'git merge'
should always create a merge commit regardless of what the merged
branches look like. This way, when we rebase M onto P in the above
example, we get:
               B
              / \
             A---M
            /
---o---O---P---Q
Signed-off-by: Andrew Wong <andrew.kw.w@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In v1.7.4-rc0~11^2~2 (bash: get --pretty=m<tab> completion to work
with bash v4, 2010-12-02) we started to use _get_comp_words_by_ref()
to access completion-related variables. That was a large change, and to
make it easily reviewable, we invoked _get_comp_words_by_ref() in each
completion function and systematically replaced every occurrence of
bash's completion-related variables ($COMP_WORDS and $COMP_CWORD) with
variables set by _get_comp_words_by_ref().
This has the downside that _get_comp_words_by_ref() is invoked several
times during a single completion. The worst offender is perhaps 'git
log mas<TAB>': during the completion of 'master'
_get_comp_words_by_ref() is invoked no less than six times.
However, the variables $prev, $cword, and $words provided by
_get_comp_words_by_ref() are not modified in any of the completion
functions, and the previous commit ensures that the $cur variable is
not modified as well. This makes it possible to invoke
_get_comp_words_by_ref() to get those variables only once in our
toplevel completion functions _git() and _gitk(), and all other
completion functions will inherit them.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since v1.7.4-rc0~11^2~2 (bash: get --pretty=m<tab> completion to work
with bash v4, 2010-12-02) we use _get_comp_words_by_ref() to access
completion-related variables, and the $cur variable holds the word
containing the current cursor position in all completion functions.
This $cur variable is left unchanged in most completion functions;
there are only four functions modifying its value, namely __gitcomp(),
__git_complete_revlist_file(), __git_complete_remote_or_refspec(), and
_git_config().
If this variable were never modified, then it would allow us a nice
optimisation and cleanup. Therefore, this patch assigns $cur to
another local variable and uses that for later modifications in those
four functions.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* rr/doc-content-type:
Documentation: Allow custom diff tools to be specified in 'diff.tool'
Documentation: Add diff.<driver>.* to config
Documentation: Move diff.<driver>.* from config.txt to diff-config.txt
Documentation: Add filter.<driver>.* to config
* rj/sparse:
sparse: Fix some "symbol not declared" warnings
sparse: Fix errors due to missing target-specific variables
sparse: Fix an "symbol 'merge_file' not decared" warning
sparse: Fix an "symbol 'format_subject' not declared" warning
sparse: Fix some "Using plain integer as NULL pointer" warnings
sparse: Fix an "symbol 'cmd_index_pack' not declared" warning
Makefile: Use cgcc rather than sparse in the check target
* mg/reflog-with-options:
reflog: fix overriding of command line options
t/t1411: test reflog with formats
builtin/log.c: separate default and setup of cmd_log_init()
Reading the diff-family and config man pages one may think that the
color.diff and color.ui settings apply to all diff commands. Make it
clearer that they do not apply to the plumbing variants
diff-{files,index,tree}.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit a8f3e2219 introduced the strbuf_grow() call to strbuf_setlen() to
ensure that there was at least one byte available to write the
mandatory trailing NUL, even for previously unallocated strbufs.
Then b315c5c0 added strbuf_slopbuf for the same reason, only globally for
all uses of strbufs.
Thus the strbuf_grow() call can be removed now. This keeps readers of
strbuf.h from mistakenly thinking that strbuf_setlen() can be used to
extend a strbuf.
The following assert() needs to be changed to cope with the fact that
sb->alloc can now be zero, which is OK as long as len is also zero. As
suggested by Junio, use the chance to convert it to a die() with a short
explanatory message. The pattern of 'die("BUG: ...")' is already used in
strbuf.c.
This was the only assert() in strbuf.[ch], so assert.h doesn't have to be
included anymore either.
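The invariant being checked can be modeled as follows. This is a Python stand-in for the C struct, for illustration only: the class, its grow() method and the exact wording of the message are hypothetical, and only the rule "alloc may be zero as long as len is also zero" comes from the text above.

```python
class StrBuf:
    """Toy model of a strbuf: tracks only 'alloc' and 'len'."""

    def __init__(self):
        self.alloc = 0  # 0 means nothing allocated yet (shared slop buffer)
        self.len = 0

    def grow(self, extra):
        """Make room for 'extra' more bytes plus the trailing NUL."""
        needed = self.len + extra + 1
        if needed > self.alloc:
            self.alloc = needed

    def setlen(self, length):
        """Set the length without allocating; never extends the buffer."""
        limit = self.alloc - 1 if self.alloc else 0
        if length > limit:
            raise RuntimeError("BUG: setlen beyond buffer")  # wording is made up
        self.len = length
```

On a fresh buffer, setlen(0) is fine and setlen(1) is a bug; growing first is the caller's job.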
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Provide an environment variable GIT_PREFIX which contains the subdirectory
from which a !alias was called (i.e. 'git rev-parse --show-prefix'), since
such aliases cd to the top-level directory before they are executed.
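A script run through such an alias might use the variable as sketched below. This is a hypothetical Python helper, not part of the patch; it assumes the prefix uses forward slashes and is empty or unset when the alias is invoked from the top level.

```python
import os
import posixpath


def path_from_invocation_dir(arg, environ=None):
    """Map a path given relative to the invocation directory to one
    relative to the top-level directory, where the alias actually runs."""
    if environ is None:
        environ = os.environ
    prefix = environ.get("GIT_PREFIX", "")
    return posixpath.normpath(posixpath.join(prefix, arg))
```

With GIT_PREFIX set to 'sub/dir/', the argument 'file.txt' resolves to 'sub/dir/file.txt' and '../x' resolves to 'sub/x'.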
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If bashcompinit has not already been autoloaded, do so
automatically, as it is required to properly parse the
git-completion file with ZSH.
Helped-by: Felipe Contreras
Signed-off-by: Marius Storm-Olsen <mstormo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If somebody has a name that includes an rfc822 special, we
will output it literally in the "From:" header. This is
usually OK, but certain characters (like ".") are supposed
to be enclosed in double-quotes in a mail header.
In practice, whether this matters may depend on your MUA.
Some MUAs will happily take in:
From: Foo B. Bar <author@example.com>
without quotes, and properly quote the "." when they send
the actual mail. Others may not, or may screw up harder
things like:
From: Foo "The Baz" Bar <author@example.com>
For example, mutt will strip the quotes, thinking they are
actual syntactic rfc822 quotes.
So let's quote properly, and then (if necessary) we still
apply rfc2047 encoding on top of that, which should make all
MUAs happy.
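The quoting step can be sketched like this, in Python for illustration (function names are hypothetical). The rfc2047 encoding that may be applied afterwards is left out of the sketch.

```python
# The "specials" of rfc822: ( ) < > [ ] : ; @ \ , . "
RFC822_SPECIALS = set('()<>[]:;@\\,."')


def quote_display_name(name):
    """Wrap the name in double quotes if it contains an rfc822 special."""
    if not any(ch in RFC822_SPECIALS for ch in name):
        return name
    # Inside a quoted string, backslash and double quote must be escaped.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def from_header(name, email):
    """Build a From: header line with a properly quoted display name."""
    return f"From: {quote_display_name(name)} <{email}>"
```

The two names from the examples above then come out as "Foo B. Bar" in quotes, and as "Foo \"The Baz\" Bar" with the inner quotes escaped.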
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For projects that do not release official archives, gitweb's snapshot
feature would be an excellent alternative, but without the '-n'
('--no-name') argument, gzip includes a timestamp in output which results
in different files. Because some systems hash/checksum downloaded files
to ensure integrity of the tarball (e.g. FreeBSD), it is desirable to
produce tarballs in a reproducible way for that purpose.
Whilst '--no-name' is more descriptive, the long version of the flag is
not supported on all systems. In particular, OpenBSD does not appear to
support it.
Supply '-n' to gzip to exclude timestamp from output and produce identical
output every time.
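The effect of the embedded timestamp can be demonstrated with Python's gzip module, used here only as a stand-in for the gzip command (gitweb itself is not Python, and the helper name is hypothetical). Passing an empty filename and mtime=0 leaves both the name and the timestamp out of the header, comparable to what '-n' does:

```python
import gzip
import io


def gzip_bytes(data, mtime):
    """Compress data in memory with an explicit header timestamp."""
    buf = io.BytesIO()
    # filename="" keeps the original-name field out of the header.
    with gzip.GzipFile(filename="", mode="wb", fileobj=buf, mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


payload = b"same tarball contents"
with_time_a = gzip_bytes(payload, mtime=1)
with_time_b = gzip_bytes(payload, mtime=2)
no_time_a = gzip_bytes(payload, mtime=0)
no_time_b = gzip_bytes(payload, mtime=0)
```

The two outputs with different timestamps differ in bytes 4 to 7 of the header, while the two outputs without a timestamp are byte-for-byte identical and therefore hash to the same value.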
Signed-off-by: Fraser Tweedale <frase@frase.id.au>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When --count is used with --cherry-mark, omit the patch equivalent
commits from the count for left and right commits and print the count of
equivalent commits separately.
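The counting rule can be illustrated with the markers that --left-right and --cherry-mark put in front of each commit ('<' for left, '>' for right, '=' for patch-equivalent). This Python sketch uses a hypothetical function name and made-up commit names, and says nothing about how the command prints the three numbers:

```python
def count_marked(lines):
    """Count left, right and patch-equivalent commits from marked lines."""
    counts = {"left": 0, "right": 0, "equivalent": 0}
    for line in lines:
        marker = line[:1]
        if marker == "<":
            counts["left"] += 1
        elif marker == ">":
            counts["right"] += 1
        elif marker == "=":
            # Equivalent commits are counted separately, not as left/right.
            counts["equivalent"] += 1
    return counts
```

For the input ['<a1', '=b2', '>c3', '>d4', '=e5'] this gives 1 left, 2 right and 2 equivalent commits.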
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark subcommand names as 'subcommand' to make them stand out.
Signed-off-by: Valentin Haenel <valentin.haenel@gmx.de>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The options '--use-log-author' and '--add-author-from' apply to subcommands
other than 'fetch' as well, so move them from the 'fetch' section to
the more general 'options' section.
Signed-off-by: Valentin Haenel <valentin.haenel@gmx.de>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option '--add-author-from' is used in 'commit-diff', 'set-tree', and
'dcommit' subcommands.
Signed-off-by: Valentin Haenel <valentin.haenel@gmx.de>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>