When declaring a structure with a flexible array member, instead
of defaulting to the c99 syntax for non-gnu compilers (which
burned people with older compilers), default to the traditional
and more portable "member[1]; /* more */" syntax.
At the same time, other c99 compilers should be able to take
advantage of the modern syntax to flexible array members without
being gcc. Check __STDC_VERSION__ for that.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
... since all system headers are pulled in via git-compat-util.h
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apart from the error in the condition (&& should actually be ||), the
construct
#if !defined(A) || !A
leads to a syntax error in the C preprocessor if A is indeed not defined.
Tested-by: David Symonds <dsymonds@gmail.com>
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
options flags:
~~~~~~~~~~~~~
PARSE_OPT_NONEG allow the caller to disallow the negated option to exists.
option types:
~~~~~~~~~~~~
OPTION_BIT: ORs (or NANDs) a mask.
OPTION_SET_INT: force the value to be set to this integer.
OPTION_SET_PTR: force the value to be set to this pointer.
helper:
~~~~~~
HAS_MULTI_BITS (in git-compat-util.h) is a bit-hack to check if an
unsigned integer has more than one bit set, useful to check if conflicting
options have been used.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
strchrnul() was introduced in glibc in April 1999 and included in
glibc-2.1. Checking for that version means the majority of all git
users would get to use the optimized version in glibc. Of the
remaining few some might get to use a slightly slower version
than necessary but probably not slower than what we have today.
Unfortunately, __GLIBC_PREREQ() macro was not available in glibc 2.1.1
which was short lived but already supported strchrnul(). Odd minority
users of that library needs to live with our compatibility inline version.
Rediffed-against-next-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As suggested by Pierre Habouzit, add strchrnul(). It's a useful GNU
extension and can simplify string parser code. There are several
places in git that can be converted to strchrnul(); as a trivial
example, this patch introduces its usage to builtin-fetch--tool.c.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Usually you cannot revert a merge because you do not know which
side of the merge should be considered the mainline (iow, what
change to reverse).
With this patch, cherry-pick and revert learn -m (--mainline)
option that lets you specify the parent number (starting from 1)
of the mainline, so that you can:
git revert -m 1 $merge
to reverse the changes introduced by the $merge commit relative
to its first parent, and:
git cherry-pick -m 2 $merge
to replay the changes introduced by the $merge commit relative
to its second parent.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Solaris 9 doesn't have mkdtemp() so we need to emulate it for the
rsync transport implementation. Since Solaris 9 is lacking this
function we can also reasonably assume it is not available on
Solaris 8 either. The new Makfile definition NO_MKDTEMP can be
set to enable the git compat version.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
A lot of places in git's code use code like:
char *res;
len = ... find length of an interesting segment in src ...;
res = xmalloc(len + 1);
memcpy(res, src, len);
res[len] = '\0';
return res;
A new function xmemdupz() captures the allocation, copy and NUL
termination. Existing xstrndup() is reimplemented in terms of
this new function.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
memmem() is a nice GNU extension for searching a length limited string
in another one.
This compat version is based on the version found in glibc 2.2 (GPL 2);
I only removed the optimization of checking the first char by hand, and
generally tried to keep the code simple. We can add it back if memcmp
shows up high in a profile, but for now I prefer to keep it (almost
trivially) simple.
Since I don't really know which platforms beside those with a glibc
have their own memmem(), I used a heuristic: if NO_STRCASESTR is set,
then NO_MEMMEM is set, too.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a wrapper for mkstemp() that performs error checking and
calls die() when an error occur.
Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This defines xdup() and xfdopen() in git-compat-util.h to give
us error-catching variants of them without cluttering the code
too much.
Signed-off-by: Jim Meyering <jim@meyering.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function converts the value of h_errno (last error of name
resolver library, see netdb.h).
One of systems which supposedly do not have the function is SunOS.
POSIX does not mandate its presence.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using git name-rev on my kernel tree I triggered a malloc()
corruption warning from glibc.
apw@pinky$ git log --pretty=one $N/base.. | git name-rev --stdin
*** glibc detected *** malloc(): memory corruption: 0x0bff8950 ***
Aborted
This comes from name_rev() which is building the name of the revision
in a malloc'd string, which it sprintf's into:
char *new_name = xmalloc(len + 8);
[...]
sprintf(new_name, "%.*s~%d^%d", len, tip_name,
generation, parent_number);
This allocation is only sufficient if the generation number is
less than 5 digits, in my case generation was 13432. In reality
parent_number can be up to 16 so that also can require two digits,
reducing us to 3 digits before we are at risk of blowing this
allocation.
This patch introduces a decimal_length() which approximates the
number of digits a type may hold, it produces the following:
Type Longest Value Len Est
---- ------------- --- ---
unsigned char 256 3 4
unsigned short 65536 5 6
unsigned long 4294967296 10 11
unsigned long long 18446744073709551616 20 21
char -128 4 4
short -32768 6 6
long -2147483648 11 11
long long -9223372036854775808 20 21
This is then used to size the new_name.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This also improves the implementation to match how strndup is
specified (by GNU): if the length given is longer than the string,
only the string's length is allocated and copied, but the string need
not be null-terminated if it is at least as long as the given length.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Tim Ansell discovered his Debian server didn't permit git-daemon to
use as much memory as it needed to handle cloning a project with
a 128 MiB packfile. Filtering the strace provided by Tim of the
rev-list child showed this gem of a sequence:
open("./objects/pack/pack-*.pack", O_RDONLY|O_LARGEFILE <unfinished ...>
<... open resumed> ) = 5
OK, so the packfile is fd 5...
mmap2(NULL, 33554432, PROT_READ, MAP_PRIVATE, 5, 0 <unfinished ...>
<... mmap2 resumed> ) = 0xb5e2d000
and we mapped one 32 MiB window from it at position 0...
mmap2(NULL, 31020635, PROT_READ, MAP_PRIVATE, 5, 0x6000 <unfinished ...>
<... mmap2 resumed> ) = -1 ENOMEM (Cannot allocate memory)
And we asked for another window further into the file. But got
denied. In Tim's case this was due to a resource limit on the
git-daemon process, and its children.
Now where are we in the code? We're down inside use_pack(),
after we have called unuse_one_window() enough times to make sure
we stay within our allowed maximum window size. However since we
didn't unmap the prior window at 0xb5e2d000 we aren't exceeding
the current limit (which probably was just the defaults).
But we're actually down inside xmmap()...
So we release the window we do have (by calling release_pack_memory),
assuming there is some memory pressure...
munmap(0xb5e2d000, 33554432 <unfinished ...>
<... munmap resumed> ) = 0
close(5 <unfinished ...>
<... close resumed> ) = 0
And that was the last window in this packfile. So we closed it.
Way to go us. Our xmmap did not expect release_pack_memory to
close the fd its about to map...
mmap2(NULL, 31020635, PROT_READ, MAP_PRIVATE, 5, 0x6000 <unfinished ...>
<... mmap2 resumed> ) = -1 EBADF (Bad file descriptor)
And so the Linux kernel happily tells us f' off.
write(2, "fatal: ", 7 <unfinished ...>
<... write resumed> ) = 7
write(2, "Out of memory? mmap failed: Bad "..., 47 <unfinished ...>
<... write resumed> ) = 47
And we report the bad file descriptor error, and not the ENOMEM,
and die, claiming we are out of memory. But actually that mmap
should have succeeded, as we had enough memory for that window,
seeing as how we released the prior one.
Originally when I developed the sliding window mmap feature I had
this exact same bug in fast-import, and I dealt with it by handing
in the struct packed_git* we want to open the new window for, as the
caller wasn't prepared to reopen the packfile if unuse_one_window
closed it. The same is true here from xmmap, but the caller doesn't
have the struct packed_git* handy. So I'm using the file descriptor
instead to perform the same test.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep.c (strtoul_ui): Move function definition from here, to...
* git-compat-util.h (strtoul_ui): ...here, with an added "base" parameter.
* builtin-grep.c (cmd_grep): Update use of strtoul_ui to include base, "10".
* builtin-update-index.c (read_index_info): Diagnose an invalid mode integer
that is out of range or merely larger than INT_MAX.
(cmd_update_index): Use strtoul_ui, not sscanf.
* convert-objects.c (write_subdirectory): Likewise.
Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep.c (strtoul_ui): Move function definition from here, to...
* git-compat-util.h (strtoul_ui): ...here, with an added "base" parameter.
* builtin-grep.c (cmd_grep): Update use of strtoul_ui to include base, "10".
* builtin-update-index.c (read_index_info): Diagnose an invalid mode integer
that is out of range or merely larger than INT_MAX.
(cmd_update_index): Use strtoul_ui, not sscanf.
* convert-objects.c (write_subdirectory): Likewise.
Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch introduces the MSB() macro to obtain the desired number of
most significant bits from a given variable independently of the variable
type.
It is then used to better implement the overflow test on the OBJ_OFS_DELTA
base offset variable with the property of always working correctly
regardless of the type/size of that variable.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes a problem reported by Randal Schwartz:
>I finally tracked down all the (albeit inconsequential) errors I was getting
>on both OpenBSD and OSX. It's the warn() function in usage.c. There's
>warn(3) in BSD-style distros. It'd take a "great rename" to change it, but if
>someone with better C skills than I have could do that, my linker and I would
>appreciate it.
It was annoying to me, too, when I was doing some mergetool testing on
Mac OS X, so here's a fix.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Randal L. Schwartz" <merlyn@stonehenge.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4.
This implies that we are able to access and work on files whose
maximum length is around 2^63-1 bytes, but we can only malloc or
mmap somewhat less than 2^32-1 bytes of memory.
On such a system an implicit conversion of off_t to size_t can cause
the size_t to wrap, resulting in unexpected and exciting behavior.
Right now we are working around all gcc warnings generated by the
-Wshorten-64-to-32 option by passing the off_t through xsize_t().
In the future we should make xsize_t on such problematic platforms
detect the wrapping and die if such a file is accessed.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Not all platforms have declared 'unsigned long' to be a 64 bit value,
but we want to support a 64 bit packfile (or close enough anyway)
in the near future as some projects are getting large enough that
their packed size exceeds 4 GiB.
By using off_t, the POSIX type that is declared to mean an offset
within a file, we support whatever maximum file size the underlying
operating system will handle. For most modern systems this is up
around 2^60 or higher.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The function at issue being initgroups() from the <grp.h> header
file. On Cygwin, setting _XOPEN_SOURCE suppresses the definition
of initgroups(), which causes the warning while compiling daemon.c.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Glibc uses the same size for int and off_t by default.
In order to support large pack sizes (>2GB) we force Glibc to a 64bit off_t.
Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Solaris 8 was pre-c99, and they weren't willing to commit to
the strtoumax definition according to /usr/include/inttypes.h.
This adds NO_STRTOUMAX and NO_STRTOULL for ancient systems.
If NO_STRTOUMAX is defined, the routine in compat/strtoumax.c
will be used instead. That routine passes its arguments to
strtoull unless NO_STRTOULL is defined. If NO_STRTOULL, then
the routine uses strtoul (unsigned long).
Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
Acked-by: Shawn O Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Older Solaris machines lack stdint.h but have inttypes.h.
The standard has inttypes.h including stdint.h, so at worst
this pollutes the namespace a bit.
Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Plain integer types without a fixed size can vary between platforms. Even
though all common platforms use 32-bit ints, there is no guarantee that
this won't change at some point. Furthermore, specifying an integer type
with explicit size makes the definition of structures more obvious.
Signed-off-by: Simon 'corecode' Schubert <corecode@fs.ei.tum.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Currently the pack .idx file format uses 32-bit unsigned integers
for the fan-out table and the object offsets. We had previously
defined these as 'unsigned int', but not every system will define
that type to be a 32 bit value. To ensure maximum portability we
should always use 'uint32_t'.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
AIX 5.3 seems to need _ALL_SOURCE for struct addrinfo, but that
introduces a struct list in grp.h.
Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If a frontend wants to use a mark per file revision and per commit
and is doing a truly huge import (such as a 32 GiB SVN repository)
we may need more than 2**32 unique mark values, especially if the
frontend is unable (or unwilling) to recycle mark values. For mark
idnums we should use the largest unsigned integer type available,
hoping that will be at least 64 bits when we are compiled as a 64
bit executable. This way we may consume huge amounts of memory
storing our mark table, but we'll at least be able to process
the entire import without failing.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This fixes another problem that Andy's case showed: git-fsck-objects
reports nonsensical results for corrupt objects.
There were actually two independent and confusing problems:
- when we had a zero-sized file and used map_sha1_file, mmap() would
return EINVAL, and git-fsck-objects would report that as an insane and
confusing error. I don't know when this was introduced, it might have
been there forever.
- when "parse_object()" returned NULL, fsck would say "object not found",
which can be very confusing, since obviously the object might "exist",
it's just unparseable because it's totally corrupt.
So this just makes "xmmap()" return NULL for a zero-sized object (which is
a valid thing pointer, exactly the same way "malloc()" can return NULL for
a zero-sized allocation). That fixes the first problem (but we could have
fixed it in the caller too - I don't personally much care whichever way it
goes, but maybe somebody should check that the NO_MMAP case does
something sane in this case too?).
And the second problem is solved by just making the error message slightly
clearer - the failure to parse an object may be because it's missing or
corrupt, not necessarily because it's not "found".
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Using cygwin with cygwin.dll before 1.5.22 the system call pread() is buggy.
This patch introduces NO_PREAD. If NO_PREAD is set git uses a sequence of
lseek()/xread()/lseek() to emulate pread.
Signed-off-by: Stefan-W. Hahn <stefan.hahn@s-hahn.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is shorter and easier to read, and also makes sure the
constant expression does not overflow integer range.
Signed-off-by: Junio C Hamano <junkio@cox.net>
If we have a 64 bit address space we can easily afford to commit
a larger amount of virtual address space to pack file access.
So on these platforms we should increase the default settings of
core.packedGit{Limit,WindowSize} to something that will better
handle very large projects.
Thanks to Andy Whitcroft for pointing out that we can safely
increase these defaults on such systems.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
In some cases we did not even bother to check the return value of
mmap() and just assume it worked. This is bad, because if we are
out of virtual address space the kernel returned MAP_FAILED and we
would attempt to dereference that address, segfaulting without any
real error output to the user.
We are replacing all calls to mmap() with xmmap() and moving all
MAP_FAILED checking into that single location. If a mmap call
fails we try to release enough least-recently-used pack windows
to possibly succeed, then retry the mmap() attempt. If we cannot
mmap even after releasing pack memory then we die() as none of our
callers have any reasonable recovery strategy for a failed mmap.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If we are about to fail because this process has run out of memory we
should first try to automatically control our appetite for address
space by releasing enough least-recently-used pack windows to gain
back enough memory such that we might actually be able to meet the
current allocation request.
This should help users who have fairly large repositories but are
working on systems with relatively small virtual address space.
Many times we see reports on the mailing list of these users running
out of memory during various Git operations. Dynamically decreasing
the amount of pack memory used when the demand for heap memory is
increasing is an intelligent solution to this problem.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If the compiler has asked us to disable use of mmap() on their
platform then we are forced to use git_mmap and its emulation via
pread. In this case large (e.g. 32 MiB) windows for pack access
are simply too big as a command will wind up reading a lot more
data than it will ever need, significantly reducing response time.
To prevent a high latency when NO_MMAP has been selected we now
use a default of 1 MiB for core.packedGitWindowSize. Credit goes
to Linus and Junio for recommending this more reasonable setting.
[jc: upcased the name of the symbolic constant, and made another
hardcoded constant into a symbolic constant while at it. ]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This minor cleanup was suggested by Johannes Schindelin.
The mmap is still fake in the sense that we don't support PROT_WRITE
or MAP_SHARED with external modification at all, but that hasn't
stopped us from using mmap() thoughout the Git code.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Like the existing error() function the new warn() function can be
used to describe a situation that probably should not be occuring,
but which the user (and Git) can continue to work around without
running into too many problems.
An example situation is a bad commit SHA1 found in a reflog.
Attempting to read this record out of the reflog isn't really an
error as we have skipped over it in the past.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
For Mac OS X 10.4, _XOPEN_SOURCE defines _POSIX_C_SOURCE which
hides many symbols from the program.
Breakage noticed and initial analysis provided by Randal
L. Schwartz.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a mechanical clean-up of the way *.c files include
system header files.
(1) sources under compat/, platform sha-1 implementations, and
xdelta code are exempt from the following rules;
(2) the first #include must be "git-compat-util.h" or one of
our own header file that includes it first (e.g. config.h,
builtin.h, pkt-line.h);
(3) system headers that are included in "git-compat-util.h"
need not be included in individual C source files.
(4) "git-compat-util.h" does not have to include subsystem
specific header files (e.g. expat.h).
Signed-off-by: Junio C Hamano <junkio@cox.net>
Like xmalloc and xrealloc xstrdup dies with a useful message if
the native strdup() implementation returns NULL rather than a
valid pointer.
I just tried to use xstrdup in new code and found it to be missing.
However I expected it to be present as xmalloc and xrealloc are
already commonly used throughout the code.
[jc: removed the part that deals with last_XXX, which I am
finding more and more dubious these days.]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
According to sys/paramh.h it's a "BSD name" for values defined in
<limits.h>. Besides PATH_MAX seems to be more commonly used.
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
As Fredrik points out the current interface of has_extension() is
potentially confusing. Its parameters include both a nul-terminated
string and a length-limited string.
This patch drops the length argument, requiring two nul-terminated
strings; all callsites are updated. I checked that all of them indeed
provide nul-terminated strings. Filenames need to be nul-terminated
anyway if they are to be passed to open() etc. The performance penalty
of the additional strlen() is negligible compared to the system calls
which inevitably surround has_extension() calls.
Additionally, change has_extension() to use size_t inside instead of
int, as that is the exact type strlen() returns and memcmp() expects.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The little helper has_extension() documents through its name what we are
trying to do and makes sure we don't forget the underrun check.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>