* np/delta:
diff-delta: allow reusing of the reference buffer index
diff-delta: bound hash list length to avoid O(m*n) behavior
diff-delta: produce optimal pack data
Merge branch 'kh/svnimport'
Merge branch 'js/refs'
annotate: fix -S parameter to take a string
annotate: Add a basic set of test cases.
annotate: handle \No newline at end of file.
gitview: Use horizontal scroll bar in the tree view
When a reference buffer is used multiple times then its index can be
computed only once and reused multiple times. This patch adds an extra
pointer to a pointer argument (from_index) to diff_delta() for this.
If from_index is NULL then everything is like before.
If from_index is non-NULL and *from_index is NULL then the index is
created and its location stored in *from_index. In this case the caller
is responsible for freeing the memory pointed to by *from_index.
If both from_index and *from_index are non-NULL then the index is reused
as is.
This currently saves about 10% of CPU time to repack the git archive.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The diff-delta code can exhibit O(m*n) behavior with some pathological
data sets where most hash entries end up in the same hash bucket.
The latest code rework reduced the block size making it particularly
vulnerable to this issue, but the issue was always there and can be
triggered regardless of the block size.
This patch does two things:
1) the hashing has been reworked to offer a better distribution to
attenuate the problem a bit, and
2) a limit is imposed to the number of entries that can exist in the
same hash bucket.
Because of the above the code is a bit more expensive on average, but
the problematic samples used to diagnose the issue are now orders of
magnitude less expensive to process with only a slight loss in
compression.
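A rough Python sketch of the second point, capping the number of entries
per bucket. The names build_bounded_index() and HASH_LIMIT, and the
limit of 4, are assumptions made for this example; buckets are keyed
directly by block content here rather than by a real hash function.

```python
# Hypothetical sketch of capping hash bucket length; not git's C code.
HASH_LIMIT = 4  # assumed cap, chosen only for this example


def build_bounded_index(ref: bytes, block: int = 3,
                        limit: int = HASH_LIMIT) -> dict:
    """Index every block of ref, keeping at most `limit` offsets per bucket."""
    index = {}
    for off in range(len(ref) - block + 1):
        bucket = index.setdefault(ref[off:off + block], [])
        if len(bucket) < limit:
            bucket.append(off)  # entries beyond the cap are dropped
    return index


pathological = b"a" * 1000  # every block lands in the same bucket
index = build_bounded_index(pathological)
longest = max(len(bucket) for bucket in index.values())
```

With the cap in place a lookup never walks more than `limit` entries,
at the price of possibly missing a match among the dropped ones, which
is consistent with the slight loss in compression noted above.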
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Indexing based on adler32 has a match precision based on the block size
(currently 16). Lowering the block size would produce smaller deltas
but the indexing memory and computing cost increases significantly.
For an optimal delta result the indexing block size should be 3 with an
increment of 1 (instead of 16 and 16). With such low params the adler32
becomes a clear overhead, increasing the time for git-repack by a factor
of 3. And with such small blocks the adler32 is not very useful, as the
whole of the block bits can be used directly.
This patch replaces the adler32 with an open coded index value based on
3 characters directly. This gives sufficient bits for hashing and
allows for optimal delta with reasonable CPU cycles.
The resulting packs are 6% smaller on average. The increase in CPU time
is about 25%. But this cost is now hidden by the delta reuse patch
while the saving on data transfers is always there.
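As an illustration only, an index value built directly from 3 characters
might look like the Python sketch below. The exact bit layout used by
the patch is not stated here, so the 24-bit packing in block_value() is
an assumption, and both function names are hypothetical.

```python
# Hypothetical sketch: index value taken directly from 3 bytes, no checksum.
def block_value(data: bytes, off: int) -> int:
    """Pack data[off:off+3] into a 24-bit integer."""
    return (data[off] << 16) | (data[off + 1] << 8) | data[off + 2]


def index_blocks(ref: bytes) -> dict:
    """Index ref with a block size of 3 and an increment of 1."""
    index = {}
    for off in range(len(ref) - 2):
        index.setdefault(block_value(ref, off), []).append(off)
    return index


idx = index_blocks(b"abcabc")
```

Because the value is just the block's own bits, computing it costs a few
shifts per position instead of a checksum pass.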
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The Eclipse client uses cvs update when that menu option is triggered.
And doesn't like the standard cvs update response. Give it *exactly* what
it wants.
And hope the other clients don't lose the plot too badly.
Signed-off-by: Junio C Hamano <junkio@cox.net>
In the conversion to Getopt::Long, the -S / --rev-list parameter stopped
working. We need to tell Getopt::Long that it is a string.
As a bonus, the open() now does some useful error handling.
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Earlier we set up the window to never scroll
horizontally, which made it harder to use on a narrow screen.
This patch allows the scrollbar to be used as needed by Gtk.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Initial checkouts were failing to create Entries files under Eclipse.
Eclipse was waiting for two non-standard directory-resets to prepare for a new
directory from the server.
This patch is tricky, because the same directory resets tend to confuse other
clients. It's taken a bit of fiddling to get the commandline cvs client and
Eclipse to get a good, clean checkout.
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Commit 8fcf1ad9c6 has a
combination of double cast and Andreas' switch to using
unsigned long ... just the latter is sufficient (and a lot less
ugly than using the double cast).
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We can show commit objects with human readable dates using
various --pretty options, but there was no way to do so with
tags. This introduces two such ways:
$ git-cat-file -p v1.2.3
shows the tag object with tagger dates in human readable format.
$ git-verify-tag --verbose v1.2.3
uses it to show the contents of the tag object as well as doing
GPG verification.
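For illustration, turning the raw tagger timestamp into a readable date
could be sketched in Python as follows. humanize_tagger() is a
hypothetical helper, and the date layout it prints is an assumption, not
necessarily the exact format git-cat-file -p emits.

```python
from datetime import datetime, timedelta, timezone


def humanize_tagger(line: str) -> str:
    """Replace the trailing '<epoch> <+/-HHMM>' with a readable date."""
    head, epoch, tz = line.rsplit(" ", 2)
    sign = -1 if tz[0] == "-" else 1
    offset = timedelta(hours=int(tz[1:3]), minutes=int(tz[3:5])) * sign
    when = datetime.fromtimestamp(int(epoch), timezone(offset))
    return f"{head} {when.strftime('%a %b %d %H:%M:%S %Y')} {tz}"


raw = "tagger A U Thor <author@example.com> 3600 -0100"
shown = humanize_tagger(raw)
```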
Signed-off-by: Junio C Hamano <junkio@cox.net>
* lt/fix-apply:
git-am: --whitespace=x option.
git-apply: war on whitespace -- finishing touches.
git-apply --whitespace=nowarn
apply --whitespace: configuration option.
apply: squelch excessive errors and --whitespace=error-all
apply --whitespace fixes and enhancements.
The war on trailing whitespace
* lt/apply:
git-am: --whitespace=x option.
git-apply: war on whitespace -- finishing touches.
git-apply --whitespace=nowarn
apply --whitespace: configuration option.
apply: squelch excessive errors and --whitespace=error-all
apply --whitespace fixes and enhancements.
The war on trailing whitespace
Moving a directory ending in a slash was not working as the
destination was not calculated correctly.
E.g. in the git repo,
git-mv t/ Documentation
gave the error
Error: destination 'Documentation' already exists
To get rid of this problem, strip trailing slashes from all arguments.
The comment in cg-mv made me curious about this issue; Pasky, thanks!
As a result, the workaround in cg-mv is not needed any more.
Also, another bug was shown by cg-mv. When moving files outside of
a subdirectory, it typically calls git-mv with something like
git-mv Documentation/git.txt Documentation/../git-mv.txt
which triggers the following error from git-update-index:
Ignoring path Documentation/../git-mv.txt
The result is a moved file, removed from git revisioning, but not
added again. To fix this, the paths have to be normalized to not have ".."
in the middle. This was already done in git-mv, but only for
a better visual appearance :(
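The two fixes can be sketched in Python as below. normalize_arg() is a
hypothetical helper, not the code in git-mv; it relies on
posixpath.normpath(), which works purely on the path string and does not
consult the filesystem.

```python
import posixpath


def normalize_arg(path: str) -> str:
    """Strip trailing slashes and collapse '..' components."""
    stripped = path.rstrip("/") or "/"  # keep a bare "/" intact
    return posixpath.normpath(stripped)


moved_dir = normalize_arg("t/")
moved_file = normalize_arg("Documentation/../git-mv.txt")
```

With both arguments normalized this way, the path handed on to
git-update-index no longer contains "..".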
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes "git-mv -h" to output the usage without the need
to be in a git repository.
Additionally:
- fix confusing error message when only one arg was given
- fix typo in error message
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Combined diffs don't null terminate things in the same way as standard
diffs. This is presumably wrong.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from 6baf0484ef commit)
For some reason, combined diffs don't honour the --full-index flag when
emitting patches. Fix this.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from e70c6b3574 commit)
We did not check if we have the same file on both sides when
computing break score. This is usually not a problem, but if
the user said --find-copies-harder with -B, we ended up trying a
delta between the same data even when we know the SHA1 hashes of
both sides match.
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from aeecd23ae2 commit)
This ports the following options from rev-list based git-log
implementation:
* -<n>, -n<n>, and -n <n>. I am still wondering if we want
this natively supported by setup_revisions(), which already
takes --max-count. We may want to move them in the next
round. Also I am not sure if we can get away with not
setting revs->limited when we set max-count. The latest
rev-list.c and revision.c in this series do not, so I left
them as they are.
* --pretty and --pretty=<fmt>.
* --abbrev=<n> and --no-abbrev.
The previous commit already handles time-based limiters
(--since, --until and friends). The remaining things that
rev-list based git-log happens to do are not useful for pure
log-viewing purposes, and are not ported:
* --bisect (obviously).
* --header. I am actually in favor of doing the NUL
terminated record format, but rev-list based one always
passed --pretty, which defeated this option. Maybe next
round.
* --parents. I cannot think of a reason a log viewer wants
this. The flag is primarily for feeding squashed history
via pipe to downstream tools.
Signed-off-by: Junio C Hamano <junkio@cox.net>
* lt/rev-list:
Rip out merge-order and make "git log <paths>..." work again.
Tie it all together: "git log"
Introduce trivial new pager.c helper infrastructure
git-rev-list libification: rev-list walking
blame.c #include's epoch.h; it needed to be killed.
Well, assuming breaking --merge-order is fine, here's a patch (on top of
the other ones) that makes
git log <filename>
actually work, as far as I can tell.
I didn't add the logic for --before/--after flags, but that should be
pretty trivial, and is independent of this anyway.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Conflicts:
Documentation/git-cvsserver.txt
git-cvsserver.perl
Originally Martin's tree was based on "next", which meant that all
the other things that I am not ready to push out to "master" were
contained in it. His changes looked good, and I wanted to have them
in "master".
So, here is what I did:
- fetch Martin's tree into a temporary topic branch.
$ git fetch $URL $remote:ml/cvsserver
$ git checkout ml/cvsserver
- rebase it on top of "master".
$ git rebase --onto master next
- pull that master into "next", recording Martin's head as well.
$ git pull --append . master
Since I have apply.whitespace=strip in my configuration file, the
rebased cvsserver changes have trailing whitespaces introduced by
Martin's tree cleansed out. Hence the above conflicts.
The reason I made this octopus is to make sure that next time Martin
pulls from my "next" branch, it results in a fast forward. There is
no reason to force him to do the same conflict resolution I did with this
merge.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Since git-checkout-index is often used from scripts which
may have a stream of filenames they wish to check out, it is
more convenient to use --stdin than xargs. On platforms
where fork performance is currently sub-optimal and
the length of a command line is limited (*cough* Cygwin
*cough*) running a single git-checkout-index process for
a large number of files beats spawning it multiple times
from xargs.
File names are still accepted on the command line if
--stdin is not supplied. Nothing is performed if no files
are supplied on the command line or by stdin.
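The argument handling described above can be sketched in Python.
collect_paths() is a hypothetical helper written for this example, not
the C code in git-checkout-index, and it assumes one filename per line
on stdin.

```python
import io


def collect_paths(argv, stdin):
    """Return paths from stdin when --stdin is given, else from argv."""
    if "--stdin" in argv:
        return [line.rstrip("\n") for line in stdin if line.strip()]
    return [arg for arg in argv if not arg.startswith("--")]


from_stream = collect_paths(["--stdin"], io.StringIO("a.c\nb.c\n"))
from_args = collect_paths(["a.c", "b.c"], io.StringIO(""))
nothing = collect_paths([], io.StringIO(""))
```

A single process consuming the whole stream avoids the repeated process
startup that xargs would incur.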
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Eclipse CVS clients have an odd way of perusing the top level of
the repository, by calling update on module "". So reproduce cvs'
odd behaviour in the interest of compatibility.
It makes it much easier to get a checkout when using Eclipse.
A few things to satisfy Eclipse's strange habits as a cvs client:
- Implement Questionable
- Aliased rlog to log, but more work may be needed
- Add a space after the U that indicates updated
This is passed down to git-apply to override the built-in
default and per-repository configuration at runtime.
Signed-off-by: Junio C Hamano <junkio@cox.net>