In order to make sure the cloned repository is good, we run "rev-list
--objects --not --all $new_refs" on the repository. This is expensive
on large repositories. This patch attempts to mitigate the impact in
this special case.
In the "good" clone case, we only have one pack. If all of the
following are met, we can be sure that all objects reachable from the
new refs exist, which is the intention of running "rev-list ...":
- all refs point to an object in the pack
- there are no dangling pointers in any object in the pack
- no objects in the pack point to objects outside the pack
The second and third checks can be done with the help of index-pack as
a slight variation of --strict check (which introduces a new condition
for the shortcut: pack transfer must be used and the number of objects
large enough to call index-pack). The first is checked in
check_everything_connected after we get an "ok" from index-pack.
"index-pack + new checks" is still faster than the current "index-pack
+ rev-list", which is the whole point of this patch. If any of the
conditions fail, we fall back to the good old but expensive "rev-list
..". In that case it's even more expensive because we have to pay for
the new checks in index-pack. But that should only happen when the
other side is either buggy or malicious.
Cloning linux-2.6 over file://
before after
real 3m25.693s 2m53.050s
user 5m2.037s 4m42.396s
sys 0m13.750s 0m16.574s
A more realistic test with ssh:// over wireless
before after
real 11m26.629s 10m4.213s
user 5m43.196s 5m19.444s
sys 0m35.812s 0m37.630s
This shortcut is not applied to shallow clones, partly because shallow
clones should have no more objects than a usual fetch and the cost of
rev-list is acceptable, partly to avoid dealing with corner cases when
grafting is involved.
This shortcut does not apply to unpack-objects code path either
because the number of objects must be small in order to trigger that
code path.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 38a4556 (index-pack: start learning to emulate
"verify-pack -v", 2011-06-03) added a "delta_depth" counter
to each "struct object_entry". Initially, all object entries
have their depth set to 0; in resolve_delta, we then set the
depth of each delta to "base + 1". Base entries never have
their depth touched, and remain at 0.
To ensure that all depths start at 0, that commit changed
calls to xmalloc the object_entry list into calls to
xcalloc. However, it forgot that we grow the list with
xrealloc later. These extra entries are used when we add an
object from elsewhere to complete a thin pack. If we add a
non-delta object, its depth value will just be uninitialized
heap data.
This patch fixes it by zero-initializing entries we add to
the objects list via the xrealloc.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The threaded parts of index-pack increment the number of resolved
deltas in nr_resolved_deltas guarded by counter_mutex. However, the
per-thread outer loop accessed nr_resolved_deltas without any locks.
This is not wrong as such, since it doesn't matter all that much
whether we get an outdated value. However, unless someone proves that
this one lock makes all the performance difference, it would be much
cleaner to guard _all_ accesses to the variable with the lock.
The only such use is display_progress() in the threaded section (all
others are in the conclude_pack() callchain outside the threaded
part). To make it obvious that it cannot deadlock, move it out of
work_mutex.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Reviewed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
deepest_delta is a global variable but is updated without protection
in resolve_delta(), a multithreaded function. Add a new mutex for it,
but only protect and update when it's actually used (i.e. show_stat is
non-zero).
Another variable that will not be updated is delta_depth in "struct
object_entry" as it's only useful when show_stat is 1. Putting it in
"if (show_stat)" makes it clearer.
The local variable "stat" is renamed to "show_stat" after moving to
global scope because the name "stat" conflicts with stat(2) syscall.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The translation of "completed with %d local objects" is put in a
48-byte buffer, which may be enough for English but not true for any
translations. Convert it to use strbuf (i.e. no hard limit on
translation length).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the unpack_data function is given a consume() callback,
it unpacks only 64K of the input at a time, feeding it to
git_inflate along with a 64K output buffer. However,
because we are inflating, there is a good chance that the
output buffer will fill before consuming all of the input.
In this case, we need to loop on git_inflate until we have
fed the whole input buffer, feeding each chunk of output to
the consume buffer.
The current code does not do this, and as a result, will
fail the loop condition and trigger a fatal "serious inflate
inconsistency" error in this case.
While we're rearranging the loop, let's get rid of the
extra last_out pointer. It is meant to point to the
beginning of the buffer that we feed to git_inflate, but in
practice this is always the beginning of our same 64K
buffer, because:
1. At the beginning of the loop, we are feeding the
buffer.
2. At the end of the loop, if we are using a consume()
function, we reset git_inflate's pointer to the
beginning of the buffer. If we are not using a
consume() function, then we do not care about the value
of last_out at all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Cygwin implementation of pread() is not thread-safe since, just
like the emulation provided by compat/pread.c, it uses a sequence of
seek-read-seek calls. In order to avoid failues due to thread-safety
issues, commit b038a61 disables threading when NO_PREAD is defined.
(ie when using the emulation code in compat/pread.c).
We introduce a new build variable, NO_THREAD_SAFE_PREAD, which allows
use to disable the threaded index-pack code on cygwin, in addition to
the above NO_PREAD case.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When putting whole objects in core is unavoidable, try match object
type and size first before actually inflating.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows caller to consume large inflated object with a fixed
amount of memory.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
unpack_raw_entry() will not allocate and return decompressed blobs if
they are larger than core.bigFileThreshold. sha1_object() may not be
called on those objects because there's no actual content.
sha1_object() is called later on those objects, where we can safely
use get_data_from_pack() to retrieve blob content for checking.
However we always do that when we definitely need the blob
content. And we often don't.
There are two cases when we may need object content. The first case is
when we find an in-repo blob with the same SHA-1. We need to do
collision test, byte-on-byte. If this test is on, the blob must be
loaded on memory (i.e. no streaming). Normally (e.g. in
fetch/pull/clone) this does not happen because git avoid to send
objects that client already has.
The other case is when --strict is specified and the object in
question is not a blob, which can't happen in reality becase we deal
with large _blobs_ here.
Note: --verify (or git-verify-pack) a pack from current repository
will trigger collision test on every object in the pack, which
effectively disables this patch. This could be easily worked around by
setting GIT_DIR to an imaginary place with no packs.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
NO_PREAD simulates pread() as a sequence of seek, read, seek in
compat/pread.c. The simulation is not thread-safe because another
thread could move the file offset away in the middle of pread
operation. Do not allow threading in that case.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This puts delta resolving on each base on a separate thread, one base
cache per thread. Per-thread data is grouped in struct thread_local.
When running with nr_threads == 1, no pthreads calls are made. The
system essentially runs in non-thread mode.
An experiment on a Xeon 24 core machine with git.git shows that
performance does not increase proportional to the number of cores. So
by default, we use maximum 3 cores. Some numbers with --threads from 1
to 16:
1..4
real 0m8.003s 0m5.307s 0m4.321s 0m3.830s
user 0m7.720s 0m8.009s 0m8.133s 0m8.305s
sys 0m0.224s 0m0.372s 0m0.360s 0m0.360s
5..8
real 0m3.727s 0m3.604s 0m3.332s 0m3.369s
user 0m9.361s 0m9.817s 0m9.525s 0m9.769s
sys 0m0.584s 0m0.624s 0m0.540s 0m0.560s
9..12
real 0m3.036s 0m3.139s 0m3.177s 0m2.961s
user 0m8.977s 0m10.205s 0m9.737s 0m10.073s
sys 0m0.596s 0m0.680s 0m0.684s 0m0.680s
13..16
real 0m2.985s 0m2.894s 0m2.975s 0m2.971s
user 0m9.825s 0m10.573s 0m10.833s 0m11.361s
sys 0m0.788s 0m0.732s 0m0.904s 0m1.016s
On an Intel dual core and linux-2.6.git
1..4
real 2m37.789s 2m7.963s 2m0.920s 1m58.213s
user 2m28.415s 2m52.325s 2m50.176s 2m41.187s
sys 0m7.808s 0m11.181s 0m11.224s 0m10.731s
Thanks Ramsay Jones for troubleshooting and support on MinGW platform.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The second pass in parse_pack_objects() are split into
resolve_deltas(). The final phase, fixing thin pack or just seal the
pack, is now in conclude_pack() function. Main pack processing is now
a sequence of these functions:
- parse_pack_objects() reads through the input pack
- resolve_deltas() makes sure all deltas can be resolved
- conclude_pack() seals the output pack
- write_idx_file() writes companion index file
- final() moves the pack/index to proper place
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Revese the order of delta applying so that by the time a delta is
applied, its base is either non-delta or already inflated.
get_base_data() is still recursive, but because base's data is always
ready, the inner get_base_data() call never has any chance to call
itself again.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Current find_unresolved_deltas() links all bases together in a form of
tree, using struct base_data, with prev_base pointer to point to
parent node. Then it traverses down from parent to children in
recursive manner with all base_data allocated on stack.
To eliminate recursion, we simply need to put all on heap
(parse_pack_objects and fix_unresolved_deltas). After that, it's
simple non-recursive depth-first traversal loop. Each node also
maintains its own state (ofs and ref indices) to iterate over all
children nodes.
So we process one node:
- if it returns a new (child) node (a parent base), we link it to our
tree, then process the new node.
- if it returns nothing, the node is done, free it. We go back to
parent node and resume whatever it's doing.
and do it until we have no nodes to process.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Solaris the system headers define the "tmpfile" name, which'll
cause Git compiled with Sun Studio 12 Update 1 to whine about us
redefining the name:
"pack-write.c", line 76: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC)
"sha1_file.c", line 2455: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC)
"fast-import.c", line 858: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC)
"builtin/index-pack.c", line 175: warning: name redefined by pragma redefine_extname declared static: tmpfile (E_PRAGMA_REDEFINE_STATIC)
Just renaming the "tmpfile" variable to "tmp_file" in the relevant
places is the easiest way to fix this.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When receive-pack & fetch-pack are run and store the pack obtained over
the wire to a local repository, they internally run the index-pack command
with the --strict option. Make sure that we reject incoming packfile that
records objects twice to avoid spreading such a damage.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The size of objects we read from the repository and data we try to put
into the repository are represented in "unsigned long", so that on larger
architectures we can handle objects that weigh more than 4GB.
But the interface defined in zlib.h to communicate with inflate/deflate
limits avail_in (how many bytes of input are we calling zlib with) and
avail_out (how many bytes of output from zlib are we ready to accept)
fields effectively to 4GB by defining their type to be uInt.
In many places in our code, we allocate a large buffer (e.g. mmap'ing a
large loose object file) and tell zlib its size by assigning the size to
avail_in field of the stream, but that will truncate the high octets of
the real size. The worst part of this story is that we often pass around
z_stream (the state object used by zlib) to keep track of the number of
used bytes in input/output buffer by inspecting these two fields, which
practically limits our callchain to the same 4GB limit.
Wrap z_stream in another structure git_zstream that can express avail_in
and avail_out in unsigned long. For now, just die() when the caller gives
a size that cannot be given to a single zlib call. In later patches in the
series, we would make git_inflate() and git_deflate() internally loop to
give callers an illusion that our "improved" version of zlib interface can
operate on a buffer larger than 4GB in one go.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use
of deflateInit2 in remote-curl.c to tell the library to use gzip header
and trailer in git_deflate_init_gzip().
There is only one caller that cares about the status from deflateEnd().
Introduce git_deflate_end_gently() to let that sole caller retrieve the
status and act on it (i.e. die) for now, but we would probably want to
make inflate_end/deflate_end die when they ran out of memory and get
rid of the _gently() kind.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The histogram produced by "verify-pack -v" always had an artificial
limit of 50, but index-pack knows what the maximum delta depth is, so
we do not have to limit it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "index-pack" machinery already has almost enough knowledge to produce
the same output as "verify-pack -v". Fill small gaps in its bookkeeping,
and teach it to show what it knows.
Add a few more command line options that do not have to be advertised to
the end users. They will be used internally when verify-pack calls this.
The eventual goal is to remove verify-pack implementation and redo it as a
thin wrapper around the index-pack, so that we can remove the rather
expensive packed_object_info_detail() API.
This still does not do the delta-chain-depth histogram yet but that part
is easy.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a helper function that takes the type of an object and
tell if it is a delta, as we seem to use this check in many places.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* load_file() returns a void pointer but is using 0 for the return
value
* builtin/receive-pack.c forgot to include builtin.h
* packet_trace_prefix can be marked static
* ll_merge takes a pointer for its last argument, not an int
* crc32 expects a pointer as the second argument but Z_NULL is defined
to be 0 (see 38f4d13 sparse fix: Using plain integer as NULL pointer,
2006-11-18 for more info)
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix warnings from 'make check'.
- These files don't include 'builtin.h' causing sparse to complain that
cmd_* isn't declared:
builtin/clone.c:364, builtin/fetch-pack.c:797,
builtin/fmt-merge-msg.c:34, builtin/hash-object.c:78,
builtin/merge-index.c:69, builtin/merge-recursive.c:22
builtin/merge-tree.c:341, builtin/mktag.c:156, builtin/notes.c:426
builtin/notes.c:822, builtin/pack-redundant.c:596,
builtin/pack-refs.c:10, builtin/patch-id.c:60, builtin/patch-id.c:149,
builtin/remote.c:1512, builtin/remote-ext.c:240,
builtin/remote-fd.c:53, builtin/reset.c:236, builtin/send-pack.c:384,
builtin/unpack-file.c:25, builtin/var.c:75
- These files have symbols which should be marked static since they're
only file scope:
submodule.c:12, diff.c:631, replace_object.c:92, submodule.c:13,
submodule.c:14, trace.c:78, transport.c:195, transport-helper.c:79,
unpack-trees.c:19, url.c:3, url.c:18, url.c:104, url.c:117, url.c:123,
url.c:129, url.c:136, thread-utils.c:21, thread-utils.c:48
- These files redeclare symbols to be different types:
builtin/index-pack.c:210, parse-options.c:564, parse-options.c:571,
usage.c:49, usage.c:58, usage.c:63, usage.c:72
- These files use a literal integer 0 when they really should use a NULL
pointer:
daemon.c:663, fast-import.c:2942, imap-send.c:1072, notes-merge.c:362
While we're in the area, clean up some unused #includes in builtin files
(mostly exec_cmd.h).
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a struct definitions, unlike functions, the prevailing style is for
the opening brace to go on the same line as the struct name, like so:
struct foo {
int bar;
char *baz;
};
Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many
matches as 'struct [a-z_]*$'.
Linus sayeth:
Heretic people all over the world have claimed that this inconsistency
is ... well ... inconsistent, but all right-thinking people know that
(a) K&R are _right_ and (b) K&R are right.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A pack v2 .idx file usually records offset using 64-bit representation
only when the offset does not fit within 31-bit, but you can handcraft
your .idx file to record smaller offset using 64-bit, storing all zero
in the upper 4-byte. By inspecting the original idx file when running
index-pack --verify, encode such low offsets that do not need to be in
64-bit but are encoded using 64-bit just like the original idx file so
that we can still validate the pack/idx pair by comparing the idx file
recomputed with the original.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Given an existing .pack file and the .idx file that describes it,
this new mode of operation reads and re-index the packfile and makes
sure the existing .idx file matches the result byte-for-byte.
All the objects in the .pack file are validated during this operation as
well. Unlike verify-pack, which visits each object described in the .idx
file in the SHA-1 order, index-pack efficiently exploits the delta-chain
to avoid rebuilding the objects that are used as the base of deltified
objects over and over again while validating the objects, resulting in
much quicker verification of the .pack file and its .idx file.
This version however cannot verify a .pack/.idx pair with a handcrafted v2
index that uses 64-bit offset representation for offsets that would fit
within 31-bit. You can create such an .idx file by giving a custom offset
to --index-version option to the command.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove two globals, pack_idx_default version and pack_idx_off32_limit,
and place them in a pack_idx_option structure. Allow callers to pass
it to write_idx_file() as a parameter.
Adjust all callers to the API change.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Entries in the delta_base array are only grouped by the bytepattern in
the delta_base union, some of which have 20-byte object name of the base
object (i.e. base for REF_DELTA objects), while others have sizeof(off_t)
bytes followed by enough NULs to fill 20-byte. The loops to iterate
through a range inside this array still needs to inspect the type of the
delta, and skip over false hits.
Group the entries also by type to eliminate the potential of false hits.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove some stray usage of other bracket types and asterisks for the
same purpose.
Signed-off-by: Štěpán Němec <stepnem@gmail.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed integer overflow is not defined in C, so do not depend on it.
This fixes a problem with GCC 4.4.0 and -O3 where the optimizer would
consider "consumed_bytes > consumed_bytes + bytes" as a constant
expression, and never execute the die()-call.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
index-pack already runs a repository search unconditionally; running
such a search earlier is not risky and ensures GIT_DIR will be set
correctly if the configuration needs to be accessed from
run_builtin().
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without this, attempting to index a pack containing objects that have been
replaced results in a fatal error that looks like:
fatal: SHA1 COLLISION FOUND WITH <replaced-object> !
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Acked-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now setup_git_directory_gently behaves sanely even from subdirs of
.git, so simplify index-pack by no longer protecting against that.
This reverts commit a672ea6ac5
(excluding tests).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This shrinks the top-level directory a bit, and makes it much more
pleasant to use auto-completion on the thing. Instead of
[torvalds@nehalem git]$ em buil<tab>
Display all 180 possibilities? (y or n)
[torvalds@nehalem git]$ em builtin-sh
builtin-shortlog.c builtin-show-branch.c builtin-show-ref.c
builtin-shortlog.o builtin-show-branch.o builtin-show-ref.o
[torvalds@nehalem git]$ em builtin-shor<tab>
builtin-shortlog.c builtin-shortlog.o
[torvalds@nehalem git]$ em builtin-shortlog.c
you get
[torvalds@nehalem git]$ em buil<tab> [type]
builtin/ builtin.h
[torvalds@nehalem git]$ em builtin [auto-completes to]
[torvalds@nehalem git]$ em builtin/sh<tab> [type]
shortlog.c shortlog.o show-branch.c show-branch.o show-ref.c show-ref.o
[torvalds@nehalem git]$ em builtin/sho [auto-completes to]
[torvalds@nehalem git]$ em builtin/shor<tab> [type]
shortlog.c shortlog.o
[torvalds@nehalem git]$ em builtin/shortlog.c
which doesn't seem all that different, but not having that annoying
break in "Display all 180 possibilities?" is quite a relief.
NOTE! If you do this in a clean tree (no object files etc), or using an
editor that has auto-completion rules that ignores '*.o' files, you
won't see that annoying 'Display all 180 possibilities?' message - it
will just show the choices instead. I think bash has some cut-off
around 100 choices or something.
So the reason I see this is that I'm using an odd editory, and thus
don't have the rules to cut down on auto-completion. But you can
simulate that by using 'ls' instead, or something similar.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This required some fairly trivial packfile function 'const' cleanup,
since the builtin commands get a const char *argv[] array.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no need for "git <command> -h" to depend on being inside
a repository.
Reported by Gerfried Fuchs through http://bugs.debian.org/462557
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some compilers (including at least MSVC) support NORETURN
on function declarations, but only before the function-name.
This patch makes it possible to define NORETURN to something
meaningful for those compilers.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Change calls to die(..., strerror(errno)) to use the new die_errno().
In the process, also make slight style adjustments: at least state
_something_ about the function that failed (instead of just printing
the pathname), and put paths in single quotes.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few remaining ones, but this fixes the trivial ones. It boils
down to two main issues that sparse complains about:
- warning: Using plain integer as NULL pointer
Sparse doesn't like you using '0' instead of 'NULL'. For various good
reasons, not the least of which is just the visual confusion. A NULL
pointer is not an integer, and that whole "0 works as NULL" is a
historical accident and not very pretty.
A few of these remain: zlib is a total mess, and Z_NULL is just a 0.
I didn't touch those.
- warning: symbol 'xyz' was not declared. Should it be static?
Sparse wants to see declarations for any functions you export. A lack
of a declaration tends to mean that you should either add one, or you
should mark the function 'static' to show that it's in file scope.
A few of these remain: I only did the ones that should obviously just
be made static.
That 'wt_status_submodule_summary' one is debatable. It has a few related
flags (like 'wt_status_use_color') which _are_ declared, and are used by
builtin-commit.c. So maybe we'd like to export it at some point, but it's
not declared now, and not used outside of that file, so 'static' it is in
this patch.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shifting 'unsigned char' or 'unsigned short' left can result in sign
extension errors, since the C integer promotion rules means that the
unsigned char/short will get implicitly promoted to a signed 'int' due to
the shift (or due to other operations).
This normally doesn't matter, but if you shift things up sufficiently, it
will now set the sign bit in 'int', and a subsequent cast to a bigger type
(eg 'long' or 'unsigned long') will now sign-extend the value despite the
original expression being unsigned.
One example of this would be something like
unsigned long size;
unsigned char c;
size += c << 24;
where despite all the variables being unsigned, 'c << 24' ends up being a
signed entity, and will get sign-extended when then doing the addition in
an 'unsigned long' type.
Since git uses 'unsigned char' pointers extensively, we actually have this
bug in a couple of places.
I may have missed some, but this is the result of looking at
git grep '[^0-9 ][ ]*<<[ ][a-z]' -- '*.c' '*.h'
git grep '<<[ ]*24'
which catches at least the common byte cases (shifting variables by a
variable amount, and shifting by 24 bits).
I also grepped for just 'unsigned char' variables in general, and
converted the ones that most obviously ended up getting implicitly cast
immediately anyway (eg hash_name(), encode_85()).
In addition to just avoiding 'unsigned char', this patch also tries to use
a common idiom for the delta header size thing. We had three different
variations on it: "& 0x7fUL" in one place (getting the sign extension
right), and "& ~0x80" and "& 0x7f" in two other places (not getting it
right). Apart from making them all just avoid using "unsigned char" at
all, I also unified them to then use a simple "& 0x7f".
I considered making a sparse extension which warns about doing implicit
casts from unsigned types to signed types, but it gets rather complex very
quickly, so this is just a hack.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing out a loose object or a pack (index), move_temp_to_file() is
called to finalize the resulting file. These files (loose files and packs)
should all have permission mode 0444 (modulo adjust_shared_perm()).
Therefore, instead of doing chmod(foo, 0444) explicitly from each callsite
(or even forgetting to chmod() at all), do the chmod() call from within
move_temp_to_file().
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
http-push.c::finish_request():
request is initialized by the for loop
index-pack.c::free_base_data():
b is initialized by the for loop
merge-recursive.c::process_renames():
move compare to narrower scope, and remove unused assignments to it
remove unused variable renames2
xdiff/xdiffi.c::xdl_recs_cmp():
remove unused variable ec
xdiff/xemit.c::xdl_emit_diff():
xche is always overwritten
Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a repository created with git older than f49fb35 (git-init-db: create
"pack" subdirectory under objects, 2005-06-27), objects/pack/ directory is
not created upon initialization. It was Ok because subdirectories are
created as needed inside directories init-db creates, and back then,
packfiles were recent invention.
After the said commit, new codepaths started relying on the presense of
objects/pack/ directory in the repository. This was exacerbated with
8b4eb6b (Do not perform cross-directory renames when creating packs,
2008-09-22) that moved the location temporary pack files are created from
objects/ directory to objects/pack/ directory, because moving temporary to
the final location was done carefully with lazy leading directory creation.
Many packfile related operations in such an old repository can fail
mysteriously because of this.
This commit introduces two helper functions to make things work better.
- odb_mkstemp() is a specialized version of mkstemp() to refactor the
code and teach it to create leading directories as needed;
- odb_pack_keep() refactors the code to create a ".keep" file while
create leading directories as needed.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Programs that use git_config need to find the global configuration.
When runtime prefix computation is enabled, this requires that
git_extract_argv0_path() is called early in the program's main().
This commit adds the necessary calls.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>