The diff-delta code can exhibit O(m*n) behavior with some pathological
data sets where most hash entries end up in the same hash bucket.
The latest code rework reduced the block size making it particularly
vulnerable to this issue, but the issue was always there and can be
triggered regardless of the block size.
This patch does two things:
1) the hashing has been reworked to offer a better distribution to
attenuate the problem a bit, and
2) a limit is imposed on the number of entries that can exist in the
same hash bucket.
Because of the above the code is a bit more expensive on average, but
the problematic samples used to diagnose the issue are now orders of
magnitude less expensive to process with only a slight loss in
compression.
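A minimal sketch of the second point, capping how many entries one hash bucket may hold. The names build_index and MAX_BUCKET_ENTRIES, the cap of 64, and the use of zlib.crc32 as a stand-in hash are all assumptions made for illustration; the commit message states neither the real limit nor the real hash function.

```python
import zlib
from collections import defaultdict

BLOCK_SIZE = 16          # illustrative block size
MAX_BUCKET_ENTRIES = 64  # hypothetical cap, not the value used by the patch


def build_index(data: bytes, block_size: int = BLOCK_SIZE,
                max_entries: int = MAX_BUCKET_ENTRIES) -> dict:
    """Map a hash of each block to the offsets where it occurs, capping each bucket."""
    index = defaultdict(list)
    for offset in range(0, len(data) - block_size + 1, block_size):
        key = zlib.crc32(data[offset:offset + block_size])  # stand-in hash
        bucket = index[key]
        # Extra entries are dropped, so a lookup never walks more than
        # max_entries candidates even when every block hashes alike.
        if len(bucket) < max_entries:
            bucket.append(offset)
    return index
```

With highly repetitive input, every block lands in one bucket; the cap bounds the matching cost at the price of missing some candidate matches, which is the slight loss in compression mentioned above.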
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Indexing based on adler32 has a match precision based on the block size
(currently 16). Lowering the block size would produce smaller deltas
but the indexing memory and computing cost increases significantly.
For optimal delta result the indexing block size should be 3 with an
increment of 1 (instead of 16 and 16). With such low params the adler32
becomes a clear overhead increasing the time for git-repack by a factor
of 3. And with such small blocks the adler32 is not very useful as the
whole of the block bits can be used directly.
This patch replaces the adler32 with an open coded index value based on
3 characters directly. This gives sufficient bits for hashing and
allows for optimal delta with reasonable CPU cycles.
The resulting packs are 6% smaller on average. The increase in CPU time
is about 25%. But this cost is now hidden by the delta reuse patch
while the saving on data transfers is always there.
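To illustrate the idea of using the block bits directly instead of a checksum, here is a small sketch. The bit layout (a plain 24-bit packing) and the names index_value and build_index are assumptions; the commit message does not describe the exact open-coded expression.

```python
def index_value(data: bytes, pos: int) -> int:
    """Pack the 3 bytes at data[pos:pos+3] into one 24-bit integer."""
    b0, b1, b2 = data[pos], data[pos + 1], data[pos + 2]
    return (b0 << 16) | (b1 << 8) | b2


def build_index(data: bytes) -> dict:
    """Index every 3-byte window (block size 3, increment 1)."""
    index = {}
    for pos in range(len(data) - 2):
        index.setdefault(index_value(data, pos), []).append(pos)
    return index
```

Because three bytes fit in 24 bits, the value identifies the block exactly, so no separate checksum computation is needed.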
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
In the conversion to Getopt::Long, the -S / --rev-list parameter stopped
working. We need to tell Getopt::Long that it is a string.
As a bonus, the open() now does some useful error handling.
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Earlier we set up the window to never scroll
horizontally, which made it harder to use on a narrow screen.
This patch allows the scrollbar to be used as needed by Gtk.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Commit 8fcf1ad9c6 has a
combination of double cast and Andreas' switch to using
unsigned long ... just the latter is sufficient (and a lot less
ugly than using the double cast).
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* lt/fix-apply:
git-am: --whitespace=x option.
git-apply: war on whitespace -- finishing touches.
git-apply --whitespace=nowarn
apply --whitespace: configuration option.
apply: squelch excessive errors and --whitespace=error-all
apply --whitespace fixes and enhancements.
The war on trailing whitespace
* lt/apply:
git-am: --whitespace=x option.
git-apply: war on whitespace -- finishing touches.
git-apply --whitespace=nowarn
apply --whitespace: configuration option.
apply: squelch excessive errors and --whitespace=error-all
apply --whitespace fixes and enhancements.
The war on trailing whitespace
Moving a directory ending in a slash was not working as the
destination was not calculated correctly.
E.g. in the git repo,
git-mv t/ Documentation
gave the error
Error: destination 'Documentation' already exists
To get rid of this problem, strip trailing slashes from all arguments.
The comment in cg-mv made me curious about this issue; Pasky, thanks!
As a result, the workaround in cg-mv is not needed any more.
Also, another bug was shown by cg-mv. When moving files outside of
a subdirectory, it typically calls git-mv with something like
git-mv Documentation/git.txt Documentation/../git-mv.txt
which triggers the following error from git-update-index:
Ignoring path Documentation/../git-mv.txt
The result is a moved file, removed from git revisioning, but not
added again. To fix this, the paths have to be normalized so that they
do not have ".." in the middle. This was already done in git-mv, but
only for a better visual appearance :(
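The two fixes can be sketched as follows. This is only an illustration in Python, not the code of git-mv itself; normalize_arg is a hypothetical helper name.

```python
import posixpath


def normalize_arg(path: str) -> str:
    """Strip trailing slashes and collapse '..' components in a path."""
    stripped = path.rstrip("/") or "/"   # keep a lone "/" intact
    # normpath works purely on the string; it does not consult the filesystem.
    return posixpath.normpath(stripped)
```

For the examples above, normalize_arg("t/") gives "t" and normalize_arg("Documentation/../git-mv.txt") gives "git-mv.txt". A purely lexical collapse like this can be wrong when a path component is a symlink.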
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes "git-mv -h" to output the usage without the need
to be in a git repository.
Additionally:
- fix confusing error message when only one arg was given
- fix typo in error message
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Combined diffs don't null terminate things in the same way as standard
diffs. This is presumably wrong.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from 6baf0484ef commit)
For some reason, combined diffs don't honour the --full-index flag when
emitting patches. Fix this.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from e70c6b3574 commit)
We did not check if we have the same file on both sides when
computing break score. This is usually not a problem, but if
the user said --find-copies-harder with -B, we ended up trying a
delta between the same data even when we know the SHA1 hashes of
both sides match.
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from aeecd23ae2 commit)
Eclipse CVS clients have an odd way of perusing the top level of
the repository, by calling update on module "". So reproduce cvs'
odd behaviour in the interest of compatibility.
It makes it much easier to get a checkout when using Eclipse.
A few things to satisfy Eclipse's strange habits as a cvs client:
- Implement Questionable
- Aliased rlog to log, but more work may be needed
- Add a space after the U that indicates updated
This is passed down to git-apply to override the built-in
default and per-repository configuration at runtime.
Signed-off-by: Junio C Hamano <junkio@cox.net>
We did not check if we have the same file on both sides when
computing break score. This is usually not a problem, but if
the user said --find-copies-harder with -B, we ended up trying a
delta between the same data even when we know the SHA1 hashes of
both sides match.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When on Darwin platforms don't include Fink or DarwinPorts
into the link path unless the related library directory
is actually present. The linker on MacOS 10.4 complains
if it is given a directory which does not exist.
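The check amounts to filtering the candidate library directories by existence before turning them into linker flags. A sketch of that logic, with link_flags as a hypothetical helper name (the real change lives in the build configuration, not in Python):

```python
import os


def link_flags(candidate_dirs):
    """Return -L flags only for library directories that actually exist."""
    return ["-L" + d for d in candidate_dirs if os.path.isdir(d)]
```

A caller might pass something like ["/sw/lib", "/opt/local/lib"]; those two paths are assumed here as the usual Fink and DarwinPorts locations and are not taken from the commit message.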
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This changes the default --whitespace policy to nowarn when we
are only getting --stat, --summary etc. IOW when not applying
the patch. When applying the patch, the default is warn (spit
out warning message but apply the patch).
Signed-off-by: Junio C Hamano <junkio@cox.net>
Andrew insists --whitespace=warn should be the default, and I
tend to agree. This introduces --whitespace=warn, so if your
project policy is more lenient, you can squelch them by having
apply.whitespace=nowarn in your configuration file.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The new configuration option apply.whitespace can take one of
"warn", "error", "error-all", or "strip". When git-apply is run
to apply the patch to the index, the configured value is used as
the default if there is no command line --whitespace option.
Andrew can now tell people who feed him git trees to update to
this version and say:
git repo-config apply.whitespace error
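The resulting precedence can be sketched as below: command line first, then apply.whitespace when actually applying, then the built-in default (warn when applying, nowarn otherwise). The function name and argument names are hypothetical, and this is a reading of the commit messages rather than the git-apply source.

```python
VALID_POLICIES = {"nowarn", "warn", "error", "error-all", "strip"}


def _validated(policy):
    if policy not in VALID_POLICIES:
        raise ValueError("unknown whitespace policy: " + policy)
    return policy


def resolve_whitespace_policy(cli_option=None, config_value=None, applying=True):
    """Pick the policy: --whitespace, then apply.whitespace, then the default."""
    if cli_option is not None:
        return _validated(cli_option)
    # The configuration is only consulted when the patch is being applied.
    if applying and config_value is not None:
        return _validated(config_value)
    return "warn" if applying else "nowarn"
```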
Signed-off-by: Junio C Hamano <junkio@cox.net>
By default, this makes --whitespace=warn, error, and strip warn
about only the first 5 additions of trailing whitespace. A new
option --whitespace=error-all can be used to view all of them
before applying.
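A rough sketch of the reporting limit. The helper names are hypothetical, and the input is assumed to be the lines of a patch hunk, each ending in a newline.

```python
MAX_REPORTED = 5  # the limit named in the commit message


def trailing_whitespace_lines(patch_lines):
    """Return 1-based numbers of added lines that end in a space or tab."""
    hits = []
    for number, line in enumerate(patch_lines, start=1):
        if line.startswith("+") and line.rstrip("\n").endswith((" ", "\t")):
            hits.append(number)
    return hits


def lines_to_report(patch_lines, policy):
    """All offenders under error-all, otherwise only the first MAX_REPORTED."""
    hits = trailing_whitespace_lines(patch_lines)
    return hits if policy == "error-all" else hits[:MAX_REPORTED]
```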
Signed-off-by: Junio C Hamano <junkio@cox.net>
In addition to fixing obvious command line parsing bugs in the
previous round, this changes the following:
* Adds "--whitespace=strip". This applies after stripping the
new trailing whitespaces introduced to the patch.
* The output error message format is changed to say
"patch-filename:linenumber:contents of the line". This makes
it similar to typical compiler error message format, and
helps C-x ` (next-error) in Emacs compilation buffer.
* --whitespace=error and --whitespace=warn do not stop at the
first error. We might want to limit the output to say first
20 such lines to prevent cluttering, but on the other hand if
you are willing to hand-fix after inspecting them, getting
everything with a single run might be easier to work with.
After all, somebody has to do the clean-up work somewhere.
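The first two items can be sketched together. check_and_strip is a hypothetical helper; whether the reported contents include the leading "+" and whether the line number counts patch lines are assumptions made for this illustration. Input lines are assumed to end in a newline.

```python
def check_and_strip(patch_filename, patch_lines, strip=False):
    """Report added lines with trailing whitespace; optionally strip it."""
    messages, output = [], []
    for number, line in enumerate(patch_lines, start=1):
        body = line.rstrip("\n")
        if body.startswith("+") and body.endswith((" ", "\t")):
            # compiler-style "file:line:contents" so editors can jump to it
            messages.append("%s:%d:%s" % (patch_filename, number, body))
            if strip:
                body = body.rstrip(" \t")
        output.append(body + "\n")
    return messages, output
```

Every offending line is collected rather than stopping at the first one, matching the third item.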
Signed-off-by: Junio C Hamano <junkio@cox.net>
On Sat, 25 Feb 2006, Andrew Morton wrote:
>
> I'd suggest a) git will simply refuse to apply such a patch unless given a
> special `forcing' flag, b) even when thus forced, it will still warn and c)
> with a different flag, it will strip-then-apply, without generating a
> warning.
This doesn't do the "strip-then-apply" thing, but it allows you to make
git-apply generate a warning or error on extraneous whitespace.
Use --whitespace=warn to warn, and (surprise, surprise) --whitespace=error
to make it a fatal error to have whitespace at the end.
Totally untested, of course. But it compiles, so it must be fine.
HOWEVER! Note that this literally will check every single patch-line with
"+" at the beginning. Which means that if you fix a simple typo, and the
line had a space at the end before, and you didn't remove it, that's still
considered a "new line with whitespace at the end", even though obviously
the line wasn't really new.
I assume this is what you wanted, and there isn't really any sane
alternatives (you could make the warning activate only for _pure_
additions with no deletions at all in that hunk, but that sounds a bit
insane).
Linus
Thanks to Nicolas Vilz <niv@iaglans.de> for noticing this.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When the user specifies a username -> Full Name <email@addr.es> map
file with the -A option, save a copy of that file as
$git_dir/svn-authors. When running git-svnimport with an existing GIT
directory, use $git_dir/svn-authors (if it exists) unless a file was
explicitly specified with -A.
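The selection logic described above, sketched in Python for illustration (git-svnimport itself is not written in Python, and select_authors_file is a hypothetical name):

```python
import os
import shutil


def select_authors_file(git_dir, explicit_file=None):
    """Return the authors file to use, saving a copy of an explicit one."""
    saved = os.path.join(git_dir, "svn-authors")
    if explicit_file is not None:
        shutil.copyfile(explicit_file, saved)  # remember it for later runs
        return explicit_file
    # No -A given: fall back to the saved copy if a previous run left one.
    return saved if os.path.exists(saved) else None
```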
Signed-off-by: Karl Hasselström <kha@treskal.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-cvsimport uses a username => Full Name <email@addr.es> mapping
file with this syntax:
kha=Karl Hasselström <kha@treskal.com>
Since there is no reason to use another format for git-svnimport, use
the same format.
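A sketch of parsing that format. The regular expression and the parse_authors name are assumptions for illustration; lines that do not match are silently skipped here, which may differ from what the import tools do.

```python
import re

AUTHOR_LINE = re.compile(r"^([^=\s]+)\s*=\s*(.*?)\s*<([^>]*)>\s*$")


def parse_authors(lines):
    """Parse 'user=Full Name <email>' lines into {user: (name, email)}."""
    authors = {}
    for line in lines:
        match = AUTHOR_LINE.match(line)
        if match:
            user, name, email = match.groups()
            authors[user] = (name, email)
    return authors
```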
Signed-off-by: Karl Hasselström <kha@treskal.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
As a rule, interface branches to different SCMs should never be modified
directly by the user. They are used exclusively for talking to the
foreign SCM.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Get the encoding information from the repository and convert the text to
utf-8 before passing it to gtk.TextBuffer.set_text, which works only with
utf-8.
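The conversion step can be sketched like this, written for a current Python rather than the version the tool targeted. to_utf8 is a hypothetical helper, and how the encoding name is read from the repository is left out.

```python
def to_utf8(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Re-encode commit text from the repository's encoding to UTF-8."""
    # Undecodable bytes are replaced rather than raising, so display never fails.
    text = raw.decode(encoding, errors="replace")
    return text.encode("utf-8")
```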
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If the second line of the commit message isn't empty, git-format-patch
needs to add an empty line in order to generate a properly formatted
mail. Otherwise git-rebase drops the rest of the commit message.
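A sketch of the rule, with hypothetical helper names: the first line is taken as the subject, and exactly one blank line is emitted before the rest whether or not the commit message already had one.

```python
def split_subject_and_body(message: str):
    """Return (subject, body), treating the first line as the subject."""
    lines = message.split("\n")
    subject, rest = lines[0], lines[1:]
    # Drop the separator if the author already left one.
    if rest and rest[0] == "":
        rest = rest[1:]
    return subject, "\n".join(rest)


def format_mail_body(message: str) -> str:
    """Always emit exactly one blank line between subject and body."""
    subject, body = split_subject_and_body(message)
    return subject + "\n\n" + body if body else subject + "\n"
```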
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Combined diffs don't null terminate things in the same way as standard
diffs. This is presumably wrong.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>