Lots of die() calls did not actually report the kind of error, which
can leave the user confused as to the real problem. Use die_errno()
where we check a system/library call that sets errno on failure, or
one of the following that wrap such calls:
Function Passes on error from
-------- --------------------
odb_pack_keep open
read_ancestry fopen
read_in_full xread
strbuf_read xread
strbuf_read_file open or strbuf_read_file
strbuf_readlink readlink
write_in_full xwrite
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change calls to die(..., strerror(errno)) to use the new die_errno().
In the process, also make slight style adjustments: at least state
_something_ about the function that failed (instead of just printing
the pathname), and put paths in single quotes.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few remaining ones, but this fixes the trivial ones. It boils
down to two main issues that sparse complains about:
- warning: Using plain integer as NULL pointer
Sparse doesn't like you using '0' instead of 'NULL'. For various good
reasons, not the least of which is just the visual confusion. A NULL
pointer is not an integer, and that whole "0 works as NULL" is a
historical accident and not very pretty.
A few of these remain: zlib is a total mess, and Z_NULL is just a 0.
I didn't touch those.
- warning: symbol 'xyz' was not declared. Should it be static?
Sparse wants to see declarations for any functions you export. A lack
of a declaration tends to mean that you should either add one, or you
should mark the function 'static' to show that it's in file scope.
A few of these remain: I only did the ones that should obviously just
be made static.
That 'wt_status_submodule_summary' one is debatable. It has a few related
flags (like 'wt_status_use_color') which _are_ declared, and are used by
builtin-commit.c. So maybe we'd like to export it at some point, but it's
not declared now, and not used outside of that file, so 'static' it is in
this patch.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This new "read_replace_refs" global variable is set to 1 by
default, so that replace refs are used by default. But
reachability traversal and packing commands ("cmd_fsck",
"cmd_prune", "cmd_pack_objects", "upload_pack",
"cmd_unpack_objects") set it to 0, as they must work with the
original DAG.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To give OPT_FILENAME the prefix, we pass the prefix to parse_options()
which passes the prefix to parse_options_start() which sets the prefix
member of parse_opts_ctx accordingly. If there isn't a prefix in the
calling context, passing NULL will suffice.
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the callers of this function except only one pass NULL to its last
parameter, ignore_packed.
Introduce has_sha1_kept_pack() function that has the function signature
and the semantics of this function, and convert the sole caller that does
not pass NULL to call this new function.
All other callers and has_sha1_pack() lose the ignore_packed parameter.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fsck" used to validate only loose objects that are local and nothing
else by default. This is not just too little when a repository is
borrowing objects from other object stores, but also caused the
connectivity check to mistakenly declare loose objects borrowed from them
to be missing.
The rationale behind the default mode that validates only loose objects is
because these objects are still young and more unlikely to have been
pushed to other repositories yet. That holds for loose objects borrowed
from alternate object stores as well.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default we looked at all refs but not HEAD. The only thing that made
fsck not lose sight of commits that are only reachable from a detached
HEAD was the reflog for the HEAD.
This fixes it, with a new test.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to the man page, if "git fsck" is passed one or more heads, it
should verify connectivity and validity of only objects reachable from the
heads it is passed.
However, since 5ac0a20 (Make builtin-fsck.c use parse_options.,
2007-10-15) the command behaved as if no heads were passed, when given
only one argument.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new inline function is_dot_or_dotdot is used to check if the
directory name is either "." or "..". It returns a non-zero value if
the given string is "." or "..". It's applicable to a lot of Git
source code.
Signed-off-by: Alexander Potashev <aspotashev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The logic to mark all objects that are reachable from tips of refs were
implemented as a set of recursive functions. In a repository with a deep
enough history, this can easily eat up all the available stack space.
Restructure the code to require less stackspace by using an object array
to keep track of the objects that still need to be processed.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
So that full filesystem conditions or permissions problems won't go
unnoticed.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As Shawn pointed out, not all temporary file creation routines can
ensure that the generated temporary file is of a certain length.
e.g. Java's createTempFile(prefix, suffix). So just depend on the
prefix 'tmp_obj_' for detection.
Update prune, and fix the "fix" introduced by a08c53a1 :)
Signed-off-by: Brandon "appendixless" Casey <casey@nrlssc.navy.mil>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Not all temporary file creation routines will ensure 14 bytes are
used to generate the temporary file name. In C Git this may be
true, but alternate implementations such as jgit are not always
able to generate a temporary file name with a specific prefix and
also ensure the file name length is 14 bytes long.
Since temporary files in a directory we are fsck'ing should be
uncommon (as they are short lived only long enough for an active
writer to finish writing the file and rename it) we shouldn't see
these show up very often. Always using a prefixcmp() call and
ignoring the length opens up room for other implementations to use
different name generation schemes.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 5723fe7e, temporary objects are now created in their final destination
directories, rather than in .git/objects/. Teach fsck to recognize and
ignore the temporary objects it encounters, and teach prune to remove them.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you misuse a git command, you are shown the usage string.
But this is currently shown in the dashed form. So if you just
copy what you see, it will not work, when the dashed form
is no longer supported.
This patch makes git commands show the dash-less version.
For shell scripts that do not specify OPTIONS_SPEC, git-sh-setup.sh
generates a dash-less usage string now.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is called when verify_pack() has its verbose argument set, and
verbose in this context makes sense only for the actual 'git verify-pack'
command. Therefore let's move show_pack_info() to builtin-verify-pack.c
instead and remove useless verbose argument from verify_pack().
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_commit ignores parent commits with certain errors
(eg. a non commit object is already loaded under the sha1 of
the parent). To make fsck reports such errors, it has to compare
the nummer of parent commits returned by parse commit with the
number of parent commits in the object or in the graft/shallow file.
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A zero commit date could be caused by:
* a missing author line
* a missing commiter line
* a malformed email address in the commiter line
* a malformed commit date
Simply reporting it as zero commit date is missleading.
Additionally, it upgrades the message to an error (instead of an printf).
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This converts the index explicitly on read and write to its on-disk
format, allowing the in-core format to contain more flags, and be
simpler.
In particular, the in-core format is now host-endian (as opposed to the
on-disk one that is network endian in order to be able to be shared
across machines) and as a result we can dispense with all the
htonl/ntohl on accesses to the cache_entry fields.
This will make it easier to make use of various temporary flags that do
not exist in the on-disk format.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since having non-commits in branches is a no-no, and just means you cannot
commit on them, let's make fsck tell you when a branch is bad.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The typo was introduced by 5ac0a2063e
(Make builtin-fsck.c use parse_options.)
Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Acked-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When looking for a lost blob, it is much nicer to be able to grep
through .git/lost-found/other/* than to write an inefficient loop
over the file names. So write the contents of the dangling blobs,
not their object names.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make every builtin-*.c file #include "builtin.h".
Also takes care of some declaration/definition mismatches.
Signed-off-by: Peter Hagervall <hager@cs.umu.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With this option, dangling objects are not only reported, but also
written to .git/lost-found/commit/ or .git/lost-found/other/. This
option implies '--full' and '--no-reflogs'.
'git fsck --lost-found' is meant as a replacement for git-lost-found.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time. There are a few files that need
to have trailing whitespaces (most notably, test vectors). The results
still passes the test, and build result in Documentation/ area is unchanged.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With --verbose, it gets really chatty now.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this particular location of fsck the index should have already
been opened by verify_pack, which is called just before we get
here and loop through the object names. However, just in case a
future version of that function does not use the index file we'll
double-check its open before we access the num_objects field.
Better safe now than sorry later.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Released versions of fast-import have been able to create a tree that
contains files or subtrees that contain no name. Unfortunately these
trees aren't valid, but people may have actually tried to create them
due to bugs in import-tars.perl or their own fast-import frontend.
We now look for this unusual condition and warn the user if at
least one of their tree objects contains the problem.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Detached HEAD is just a normal state of a repository. Do not
say anything about it.
Do not give worrying "error:" messages when we let the user know
that the HEAD points at nothing (i.e. yet to be born branch),
nor we do not have any default refs to start following the
objects chain. Reword them as "notice:".
Signed-off-by: Junio C Hamano <junkio@cox.net>
Since the subprojects don't necessarily even exist in the current tree,
much less in the current git repository (they are totally independent
repositories), we do not want to try to follow the chain from one git
repository to another through a gitlink.
This involves teaching fsck to ignore references to gitlink objects from
a tree and from the current index.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The coming index format change doesn't allow for the number of objects
to be determined from the size of the index file directly. Instead, Let's
initialize a field in the packed_git structure with the object count when
the index is validated since the count is always known at that point.
While at it let's reorder some struct packed_git fields to avoid padding
due to needed 64-bit alignment for some of them.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Prior to 1.5.0 the git-lost-found utility was useful to locate
commits that were not referenced by any ref. These were often
amends, or resets, or tips of branches that had been deleted.
Being able to locate a 'lost' commit and recover it by creating a
new branch was a useful feature in those days.
Unfortunately 1.5.0 added the reflogs to the reachability analysis
performed by git-fsck, which means that most commits users would
consider to be lost are still reachable through a reflog. So most
(or all!) commits are reachable, and nothing gets output from
git-lost-found.
Now git-fsck can be told to ignore reflogs during its reachability
analysis, making git-lost-found useful again to locate commits
that are no longer referenced by a ref itself, but may still be
referenced by a reflog.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Let's avoid the open coded pack index reference in pack-object and use
nth_packed_object_sha1() instead. This will help encapsulating index
format differences in one place.
And while at it there is no reason to copy SHA1's over and over while a
direct pointer to it in the index will do just fine.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes slightly more lines than it adds, but the real reason for
doing this is that future optimizations will require more setup of the
tree descriptor, and so we want to do it in one place.
Also renamed the "desc.buf" field to "desc.buffer" just to trigger
compiler errors for old-style manual initializations, making sure I
didn't miss anything.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
As we permit up to 2^32-1 objects in a single packfile we cannot
use a signed int to represent the object offset within a packfile,
after 2^31-1 objects we will start seeing negative indexes and
error out or compute bad addresses within the mmap'd index.
This is a minor cleanup that does not introduce any significant
logic changes. It is roach free.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-fsck always exited with status 0, which was a bit sloppy.
This makes it exit with a non-zero status when errors are
found. The error code is an OR'ed result of:
1 if corrupted objects are found.
2 if objects that are ought to be reachable are missing or corrupt.
For example, it would exit with 1 in a repository with an
unreachable corrupt object. If a tree object of the HEAD commit
is corrupt, you would get 3.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When "git fsck" without --full found a loose object missing
because it was broken, it mistakenly thought it was not parsed
because we found it in one of the packs. Back when this code
was written, we did not have a way to explicitly check if we
have the object in pack, but we do now.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This mechanically converts strncmp() to use prefixcmp(), but only when
the parameters match specific patterns, so that they can be verified
easily. Leftover from this will be fixed in a separate step, including
idiotic conversions like
if (!strncmp("foo", arg, 3))
=>
if (!(-prefixcmp(arg, "foo")))
This was done by using this script in px.perl
#!/usr/bin/perl -i.bak -p
if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) {
s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|;
}
if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) {
s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|;
}
and running:
$ git grep -l strncmp -- '*.c' | xargs perl px.perl
Signed-off-by: Junio C Hamano <junkio@cox.net>
The earlier change df391b192 to rename fsck-objects to fsck broke
fsck-objects. This should fix it again.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This separates the connectivity check into separate codepaths,
one for reachable objects and the other for unreachable ones,
while adding a lot of comments to explain what is going on.
When checking an unreachable object, unlike a reachable one, we
do not have to complain if it does not exist (we used to
complain about a missing blob even when the only thing that
references it is a tree that is dangling). Also we do not have
to check and complain about objects that are referenced by an
unreachable object.
This makes the messages from fsck-objects a lot less noisy and
more useful.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes another problem that Andy's case showed: git-fsck-objects
reports nonsensical results for corrupt objects.
There were actually two independent and confusing problems:
- when we had a zero-sized file and used map_sha1_file, mmap() would
return EINVAL, and git-fsck-objects would report that as an insane and
confusing error. I don't know when this was introduced, it might have
been there forever.
- when "parse_object()" returned NULL, fsck would say "object not found",
which can be very confusing, since obviously the object might "exist",
it's just unparseable because it's totally corrupt.
So this just makes "xmmap()" return NULL for a zero-sized object (which is
a valid thing pointer, exactly the same way "malloc()" can return NULL for
a zero-sized allocation). That fixes the first problem (but we could have
fixed it in the caller too - I don't personally much care whichever way it
goes, but maybe somebody should check that the NO_MMAP case does
something sane in this case too?).
And the second problem is solved by just making the error message slightly
clearer - the failure to parse an object may be because it's missing or
corrupt, not necessarily because it's not "found".
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It used to ignore the return value of the helper function; now, it
expects it to return 0, and stops iteration upon non-zero return
values; this value is then passed on as the return value of
for_each_reflog_ent().
Further, it makes no sense to force the parsing upon the helper
functions; for_each_reflog_ent() now calls the helper function with
old and new sha1, the email, the timestamp & timezone, and the message.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>