Commit Graph

59 Commits (3ffcd08688a0e5f5de808c1077d10552d1f84e70)

Author SHA1 Message Date
Junio C Hamano b6691092d7 Add streaming filter API
This introduces an API to plug custom filters to an input stream.

The caller gets get_stream_filter("path") to obtain an appropriate
filter for the path, and then uses it when opening an input stream
via open_istream().  After that, the caller can read from the stream
with read_istream(), and close it with close_istream(), just like an
unfiltered stream.

This only adds a "null" filter that is a pass-thru filter, but later
changes can add LF-to-CRLF and other filters, and the callers of the
streaming API do not have to change.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-26 16:47:15 -07:00
Junio C Hamano de6182db67 streaming_write_entry(): support files with holes
One typical use of a large binary file is to hold a sparse on-disk hash
table with a lot of holes. Help preserving the holes with lseek().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-20 23:16:53 -07:00
Junio C Hamano dd8e912190 streaming_write_entry(): use streaming API in write_entry()
When the output to a path does not have to be converted, we can read from
the object database from the streaming API and write to the file in the
working tree, without having to hold everything in the memory.

The ident, auto- and safe- crlf conversions inherently require you to read
the whole thing before deciding what to do, so while it is technically
possible to support them by using a buffer of an unbound size or rewinding
and reading the stream twice, it is less practical than the traditional
"read the whole thing in core and convert" approach.

Adding streaming filters for the other conversions on top of this should
be doable by tweaking the can_bypass_conversion() function (it should be
renamed to can_filter_stream() when it happens). Then the streaming API
can be extended to wrap the git_istream streaming_write_entry() opens on
the underlying object in another git_istream that reads from it, filters
what is read, and let the streaming_write_entry() read the filtered
result. But that is outside the scope of this series.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-20 18:46:58 -07:00
Junio C Hamano fd5db55d8b write_entry(): separate two helper functions out
In the write-out codepath, a block of code determines what file in the
working tree to write to, and opens an output file descriptor to it.

After writing the contents out to the file, another block of code runs
fstat() on the file descriptor when appropriate.

Separate these blocks out to open_output_fd() and fstat_output()
helper functions.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-20 18:38:54 -07:00
Nguyễn Thái Ngọc Duy d43e90732b entry.c: remove "checkout-index" from error messages
Back then when entry.c was part of checkout-index (or checkout-cache
at that time [1]). It makes sense to print the command name in error
messages. Nowadays entry.c is in libgit and can be used by any
commands, printing "git checkout-index: blah" does no more than
confusion. The error messages without it still give enough information.

[1] 12dccc1 (Make fiel checkout function available to the git library - 2005-06-05)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-29 14:03:07 -08:00
Junio C Hamano 56eb8b43eb Merge branch 'jc/symbol-static'
* jc/symbol-static:
  date.c: mark file-local function static
  Replace parse_blob() with an explanatory comment
  symlinks.c: remove unused functions
  object.c: remove unused functions
  strbuf.c: remove unused function
  sha1_file.c: remove unused function
  mailmap.c: remove unused function
  utf8.c: mark file-local function static
  submodule.c: mark file-local function static
  quote.c: mark file-local function static
  remote-curl.c: mark file-local function static
  read-cache.c: mark file-local functions static
  parse-options.c: mark file-local function static
  entry.c: mark file-local function static
  http.c: mark file-local functions static
  pretty.c: mark file-local function static
  builtin-rev-list.c: mark file-local function static
  bisect.c: mark file-local function static
2010-01-20 14:37:25 -08:00
Junio C Hamano 73d66323ac Merge branch 'nd/sparse'
* nd/sparse: (25 commits)
  t7002: test for not using external grep on skip-worktree paths
  t7002: set test prerequisite "external-grep" if supported
  grep: do not do external grep on skip-worktree entries
  commit: correctly respect skip-worktree bit
  ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
  tests: rename duplicate t1009
  sparse checkout: inhibit empty worktree
  Add tests for sparse checkout
  read-tree: add --no-sparse-checkout to disable sparse checkout support
  unpack-trees(): ignore worktree check outside checkout area
  unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
  unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
  unpack-trees.c: generalize verify_* functions
  unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
  Introduce "sparse checkout"
  dir.c: export excluded_1() and add_excludes_from_file_1()
  excluded_1(): support exclude files in index
  unpack-trees(): carry skip-worktree bit over in merged_entry()
  Read .gitignore from index if it is skip-worktree
  Avoid writing to buffer in add_excludes_from_file_1()
  ...

Conflicts:
	.gitignore
	Documentation/config.txt
	Documentation/git-update-index.txt
	Makefile
	entry.c
	t/t7002-grep.sh
2010-01-13 11:58:34 -08:00
Junio C Hamano 61b97df7d9 entry.c: mark file-local function static
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-12 01:06:08 -08:00
Nguyễn Thái Ngọc Duy 56cac48c35 ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
Previously CE_MATCH_IGNORE_VALID flag is used by both valid and
skip-worktree bits. While the two bits have similar behaviour, sharing
this flag means "git update-index --really-refresh" will ignore
skip-worktree while it should not. Instead another flag is
introduced to ignore skip-worktree bit, CE_MATCH_IGNORE_VALID only
applies to valid bit.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-14 14:03:58 -08:00
Junio C Hamano da02ca508b check_path(): allow symlinked directories to checkout-index --prefix
Merlyn noticed that Documentation/install-doc-quick.sh no longer correctly
removes old installed documents when the target directory has a leading
path that is a symlink.  It turns out that "checkout-index --prefix" was
broken by recent b6986d8 (git-checkout: be careful about untracked
symlinks, 2009-07-29).

I suspect has_symlink_leading_path() could learn the third parameter
(prefix that is allowed to be symlinked directories) to allow us to retire
a similar function has_dirs_only_path().

Another avenue of fixing this I considered was to get rid of base_dir and
base_dir_len from "struct checkout", and instead make "git checkout-index"
when run with --prefix mkdir the leading path and chdir in there.  It
might be the best longer term solution to this issue, as the base_dir
feature is used only by that rather obscure codepath as far as I know.

But at least this patch should fix this breakage.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-18 03:32:45 -07:00
Linus Torvalds b6986d8a75 git-checkout: be careful about untracked symlinks
This fixes the case where an untracked symlink that points at a directory
with tracked paths confuses the checkout logic, demostrated in t6035.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-29 20:24:28 -07:00
Thomas Rast 0721c314a5 Use die_errno() instead of die() when checking syscalls
Lots of die() calls did not actually report the kind of error, which
can leave the user confused as to the real problem.  Use die_errno()
where we check a system/library call that sets errno on failure, or
one of the following that wrap such calls:

  Function              Passes on error from
  --------              --------------------
  odb_pack_keep         open
  read_ancestry         fopen
  read_in_full          xread
  strbuf_read           xread
  strbuf_read_file      open or strbuf_read_file
  strbuf_readlink       readlink
  write_in_full         xwrite

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-27 11:14:53 -07:00
Thomas Rast d824cbba02 Convert existing die(..., strerror(errno)) to die_errno()
Change calls to die(..., strerror(errno)) to use the new die_errno().

In the process, also make slight style adjustments: at least state
_something_ about the function that failed (instead of just printing
the pathname), and put paths in single quotes.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-27 11:14:53 -07:00
Alex Riesen 691f1a28bf replace direct calls to unlink(2) with unlink_or_warn
This helps to notice when something's going wrong, especially on
systems which lock open files.

I used the following criteria when selecting the code for replacement:
- it was already printing a warning for the unlink failures
- it is in a function which already printing something or is
  called from such a function
- it is in a static function, returning void and the function is only
  called from a builtin main function (cmd_)
- it is in a function which handles emergency exit (signal handlers)
- it is in a function which is obvously cleaning up the lockfiles

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-29 18:37:41 -07:00
Johannes Sixt 34779c535c Windows: Skip fstat/lstat optimization in write_entry()
Commit e4c72923 (write_entry(): use fstat() instead of lstat() when file
is open, 2009-02-09) introduced an optimization of write_entry().
Unfortunately, we cannot take advantage of this optimization on Windows
because there is no guarantee that the time stamps are updated before the
file is closed:

  "The only guarantee about a file timestamp is that the file time is
   correctly reflected when the handle that makes the change is closed."

(http://msdn.microsoft.com/en-us/library/ms724290(VS.85).aspx)

The failure of this optimization on Windows can be observed most easily by
running a 'git checkout' that has to update several large files. In this
case, 'git checkout' will report modified files, but infact only the
timestamps were incorrectly recorded in the index, as can be verified by a
subsequent 'git diff', which shows no change.

Dmitry Potapov reports the same fix needs on Cygwin; this commit contains
his updates for that.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-20 12:14:02 -07:00
Kjetil Barvik e4c7292353 write_entry(): use fstat() instead of lstat() when file is open
Currently inside write_entry() we do an lstat(path, &st) call on a
file which have just been opened inside the exact same function.  It
should be better to call fstat(fd, &st) on the file while it is open,
and it should be at least as fast as the lstat() method.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09 20:59:26 -08:00
Kjetil Barvik 4857c761e3 write_entry(): cleanup of some duplicated code
The switch-cases for S_IFREG and S_IFLNK was so similar that it will
be better to do some cleanup and use the common parts of it.

And the entry.c file should now be clean for 'gcc -Wextra' warnings.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09 20:59:26 -08:00
Kjetil Barvik 81a9aa60a1 create_directories(): remove some memcpy() and strchr() calls
Remove the call to memcpy() and strchr() for each path component
tested, and instead add each path component as we go forward inside
the while-loop.

Impact: small optimisation

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09 20:59:26 -08:00
Kjetil Barvik 571998921d lstat_cache(): swap func(length, string) into func(string, length)
Swap function argument pair (length, string) into (string, length) to
conform with the commonly used order inside the GIT source code.

Also, add a note about this fact into the coding guidelines.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09 20:59:26 -08:00
Junio C Hamano 0990e7aaaa Merge branch 'kb/lstat-cache'
* kb/lstat-cache:
  lstat_cache(): introduce clear_lstat_cache() function
  lstat_cache(): introduce invalidate_lstat_cache() function
  lstat_cache(): introduce has_dirs_only_path() function
  lstat_cache(): introduce has_symlink_or_noent_leading_path() function
  lstat_cache(): more cache effective symlink/directory detection
2009-01-25 17:13:34 -08:00
Kjetil Barvik bad4a54fa6 lstat_cache(): introduce has_dirs_only_path() function
The create_directories() function in entry.c currently calls stat()
or lstat() for each path component of the pathname 'path' each and every
time.  For the 'git checkout' command, this function is called on each
file for which we must do an update (ce->ce_flags & CE_UPDATE), so we get
lots and lots of calls.

To fix this, we make a new wrapper to the lstat_cache() function, and
call the wrapper function instead of the calls to the stat() or the
lstat() functions.  Since the paths given to the create_directories()
function, is sorted alphabetically, the new wrapper would be very
cache effective in this situation.

To support it we must update the lstat_cache() function to be able to
say that "please test the complete length of 'name'", and also to give
it the length of a prefix, where the cache should use the stat()
function instead of the lstat() function to test each path component.

Thanks to Junio C Hamano, Linus Torvalds and Rene Scharfe for valuable
comments to this patch!

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18 13:54:54 -08:00
Alexander Potashev 8ca12c0d62 add is_dot_or_dotdot inline function
A new inline function is_dot_or_dotdot is used to check if the
directory name is either "." or "..". It returns a non-zero value if
the given string is "." or "..". It's applicable to a lot of Git
source code.

Signed-off-by: Alexander Potashev <aspotashev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-11 13:21:57 -08:00
Junio C Hamano 7e44c93558 'git foo' program identifies itself without dash in die() messages
This is a mechanical conversion of all '*.c' files with:

	s/((?:die|error|warning)\("git)-(\S+:)/$1 $2/;

The result was manually inspected and no false positive was found.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-31 09:39:19 -07:00
Linus Torvalds 971f229c50 Fix possible Solaris problem in 'checkout_entry()'
Currently when checking out an entry "path", we try to unlink(2) it first
(because there could be stale file), and if there is a directory there,
try to deal with it (typically we run recursive rmdir).  We ignore the
error return from this unlink because there may not even be any file
there.

However if you are root on Solaris, you can unlink(2) a directory
successfully and corrupt your filesystem.

This moves the code around and check the directory first, and then
unlink(2).  Also we check the error code from it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-18 22:18:57 -07:00
Linus Torvalds 7a51ed66f6 Make on-disk index representation separate from in-core one
This converts the index explicitly on read and write to its on-disk
format, allowing the in-core format to contain more flags, and be
simpler.

In particular, the in-core format is now host-endian (as opposed to the
on-disk one that is network endian in order to be able to be shared
across machines) and as a result we can dispense with all the
htonl/ntohl on accesses to the cache_entry fields.

This will make it easier to make use of various temporary flags that do
not exist in the on-disk format.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Junio C Hamano c78a24986d Merge branch 'jc/maint-add-sync-stat'
* jc/maint-add-sync-stat:
  t2200: test more cases of "add -u"
  git-add: make the entry stat-clean after re-adding the same contents
  ce_match_stat, run_diff_files: use symbolic constants for readability

Conflicts:

	builtin-add.c
2007-11-14 14:15:40 -08:00
Junio C Hamano 4bd5b7dacc ce_match_stat, run_diff_files: use symbolic constants for readability
ce_match_stat() can be told:

 (1) to ignore CE_VALID bit (used under "assume unchanged" mode)
     and perform the stat comparison anyway;

 (2) not to perform the contents comparison for racily clean
     entries and report mismatch of cached stat information;

using its "option" parameter.  Give them symbolic constants.

Similarly, run_diff_files() can be told not to report anything
on removed paths.  Also give it a symbolic constant for that.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10 00:24:51 -08:00
René Scharfe c32f749fec Correct some sizeof(size_t) != sizeof(unsigned long) typing errors
Fix size_t vs. unsigned long pointer mismatch warnings introduced
with the addition of strbuf_detach().

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-22 00:00:40 -04:00
Pierre Habouzit b315c5c081 strbuf change: be sure ->buf is never ever NULL.
For that purpose, the ->buf is always initialized with a char * buf living
in the strbuf module. It is made a char * so that we can sloppily accept
things that perform: sb->buf[0] = '\0', and because you can't pass "" as an
initializer for ->buf without making gcc unhappy for very good reasons.

strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf
anymore.

as a consequence strbuf_detach is _mandatory_ to detach a buffer, copying
->buf isn't an option anymore, if ->buf is going to escape from the scope,
and eventually be free'd.

API changes:
  * strbuf_setlen now always works, so just make strbuf_reset a convenience
    macro.
  * strbuf_detatch takes a size_t* optional argument (meaning it can be
    NULL) to copy the buffer's len, as it was needed for this refactor to
    make the code more readable, and working like the callers.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-29 02:13:33 -07:00
Pierre Habouzit 5ecd293d14 Rewrite convert_to_{git,working_tree} to use strbuf's.
* Now, those functions take an "out" strbuf argument, where they store their
  result if any. In that case, it also returns 1, else it returns 0.
* those functions support "in place" editing, in the sense that it's OK to
  call them this way:
    convert_to_git(path, sb->buf, sb->len, sb);
  When doable, conversions are done in place for real, else the strbuf
  content is just replaced with the new one, transparentely for the caller.

If you want to create a new filter working this way, being the accumulation
of filter1, filter2, ... filtern, then your meta_filter would be:

    int meta_filter(..., const char *src, size_t len, struct strbuf *sb)
    {
        int ret = 0;
        ret |= filter1(...., src, len, sb);
        if (ret) {
            src = sb->buf;
            len = sb->len;
        }
        ret |= filter2(...., src, len, sb);
        if (ret) {
            src = sb->buf;
            len = sb->len;
        }
        ....
        return ret | filtern(..., src, len, sb);
    }

That's why subfilters the convert_to_* functions called were also rewritten
to work this way.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-16 17:30:03 -07:00
Junio C Hamano 1a9d7e9b48 attr.c: read .gitattributes from index as well.
This makes .gitattributes files to be read from the index when
they are not checked out to the work tree.  This is in line with
the way we always allowed low-level tools to operate in sparsely
checked out work tree in a reasonable way.

It swaps the order of new file creation and converting the blob
to work tree representation; otherwise when we are in the middle
of checking out .gitattributes we would notice an empty but
unwritten .gitattributes file in the work tree and will ignore
the copy in the index.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-14 23:19:10 -07:00
Junio C Hamano c1c10a3f27 Merge branch 'maint'
* maint:
  Force listingblocks to be monospaced in manpages
  Do not expect unlink(2) to fail on a directory.
2007-07-18 17:00:36 -07:00
Junio C Hamano fa2e71c9e7 Do not expect unlink(2) to fail on a directory.
When "git checkout-index" checks out path A/B/C, it makes sure A
and A/B are truly directories; if there is a regular file or
symlink at A, we prefer to remove it.

We used to do this by catching an error return from mkdir(2),
and on EEXIST did unlink(2), and when it succeeded, tried
another mkdir(2).

Thomas Glanzmann found out the above does not work on Solaris
for a root user, as unlink(2) was so old fashioned there that it
allowed to unlink a directory.

As pointed out, this still doesn't guarantee that git won't call
"unlink()" on a directory (race conditions etc), but that's
fundamentally true (there is no "funlink()" like there is
"fstat()"), and besides, that is in no way git-specific (ie it's
true of any application that gets run as root).

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-18 00:53:09 -07:00
Junio C Hamano a6080a0a44 War on whitespace
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time.  There are a few files that need
to have trailing whitespaces (most notably, test vectors).  The results
still passes the test, and build result in Documentation/ area is unchanged.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07 00:04:01 -07:00
Martin Waitz 302b9282c9 rename dirlink to gitlink.
Unify naming of plumbing dirlink/gitlink concept:

git ls-files -z '*.[ch]' |
xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;'

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-21 23:34:54 -07:00
Luiz Fernando N. Capitulino efbc583126 entry.c: Use const qualifier for 'struct checkout' parameters
Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-25 13:15:59 -07:00
Luiz Fernando N. Capitulino cc2903fc70 remove_subtree(): Use strerror() when possible
Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-25 13:15:27 -07:00
Junio C Hamano a2d7c6c620 Merge branch 'jc/attr'
* 'jc/attr': (28 commits)
  lockfile: record the primary process.
  convert.c: restructure the attribute checking part.
  Fix bogus linked-list management for user defined merge drivers.
  Simplify calling of CR/LF conversion routines
  Document gitattributes(5)
  Update 'crlf' attribute semantics.
  Documentation: support manual section (5) - file formats.
  Simplify code to find recursive merge driver.
  Counto-fix in merge-recursive
  Fix funny types used in attribute value representation
  Allow low-level driver to specify different behaviour during internal merge.
  Custom low-level merge driver: change the configuration scheme.
  Allow the default low-level merge driver to be configured.
  Custom low-level merge driver support.
  Add a demonstration/test of customized merge.
  Allow specifying specialized merge-backend per path.
  merge-recursive: separate out xdl_merge() interface.
  Allow more than true/false to attributes.
  Document git-check-attr
  Change attribute negation marker from '!' to '-'.
  ...
2007-04-21 17:38:00 -07:00
Alex Riesen ac78e54804 Simplify calling of CR/LF conversion routines
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-20 23:24:34 -07:00
Linus Torvalds f0807e62b4 Teach "git-read-tree -u" to check out submodules as a directory
This actually allows us to check out a supermodule after cloning, although
the submodules themselves will obviously not be checked out, and will just
be empty directories.

Checking out the submodules will be up to higher levels - we may not even
want to!

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-14 03:14:16 -07:00
Johannes Sixt 78a8d641c1 Add core.symlinks to mark filesystems that do not support symbolic links.
Some file systems that can host git repositories and their working copies
do not support symbolic links. But then if the repository contains a symbolic
link, it is impossible to check out the working copy.

This patch enables partial support of symbolic links so that it is possible
to check out a working copy on such a file system.  A new flag
core.symlinks (which is true by default) can be set to false to indicate
that the filesystem does not support symbolic links. In this case, symbolic
links that exist in the trees are checked out as small plain files, and
checking in modifications of these files preserve the symlink property in
the database (as long as an entry exists in the index).

Of course, this does not magically make symbolic links work on such defective
file systems; hence, this solution does not help if the working copy relies
on that an entry is a real symbolic link.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-02 16:58:05 -08:00
Nicolas Pitre 21666f1aae convert object type handling from a string to a number
We currently have two parallel notation for dealing with object types
in the code: a string and a numerical value.  One of them is obviously
redundent, and the most used one requires more stack space and a bunch
of strcmp() all over the place.

This is an initial step for the removal of the version using a char array
found in object reading code paths.  The patch is unfortunately large but
there is no sane way to split it in smaller parts without breaking the
system.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27 01:34:21 -08:00
Linus Torvalds 6c510bee20 Lazy man's auto-CRLF
It currently does NOT know about file attributes, so it does its
conversion purely based on content. Maybe that is more in the "git
philosophy" anyway, since content is king, but I think we should try to do
the file attributes to turn it off on demand.

Anyway, BY DEFAULT it is off regardless, because it requires a

	[core]
		AutoCRLF = true

in your config file to be enabled. We could make that the default for
Windows, of course, the same way we do some other things (filemode etc).

But you can actually enable it on UNIX, and it will cause:

 - "git update-index" will write blobs without CRLF
 - "git diff" will diff working tree files without CRLF
 - "git checkout" will write files to the working tree _with_ CRLF

and things work fine.

Funnily, it actually shows an odd file in git itself:

	git clone -n git test-crlf
	cd test-crlf
	git config core.autocrlf true
	git checkout
	git diff

shows a diff for "Documentation/docbook-xsl.css". Why? Because we have
actually checked in that file *with* CRLF! So when "core.autocrlf" is
true, we'll always generate a *different* hash for it in the index,
because the index hash will be for the content _without_ CRLF.

Is this complete? I dunno. It seems to work for me. It doesn't use the
filename at all right now, and that's probably a deficiency (we could
certainly make the "is_binary()" heuristics also take standard filename
heuristics into account).

I don't pass in the filename at all for the "index_fd()" case
(git-update-index), so that would need to be passed around, but this
actually works fine.

NOTE NOTE NOTE! The "is_binary()" heuristics are totally made-up by yours
truly. I will not guarantee that they work at all reasonable. Caveat
emptor. But it _is_ simple, and it _is_ safe, since it's all off by
default.

The patch is pretty simple - the biggest part is the new "convert.c" file,
but even that is really just basic stuff that anybody can write in
"Teaching C 101" as a final project for their first class in programming.
Not to say that it's bug-free, of course - but at least we're not talking
about rocket surgery here.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-14 11:19:22 -08:00
Linus Torvalds bd3a5b5ee5 Mark places that need blob munging later for CRLF conversion.
Here's a patch that I think we can merge right now. There may be
other places that need this, but this at least points out the
three places that read/write working tree files for git
update-index, checkout and diff respectively. That should cover
a lot of it [jc: git-apply uses an entirely different codepath
both for reading and writing].

Some day we can actually implement it. In the meantime, this
points out a place for people to start. We *can* even start with
a really simple "we do CRLF conversion automatically, regardless
of filename" kind of approach, that just look at the data (all
three cases have the _full_ file data already in memory) and
says "ok, this is text, so let's convert to/from DOS format
directly".

THAT somebody can write in ten minutes, and it would already
make git much nicer on a DOS/Windows platform, I suspect.

And it would be totally zero-cost if you just make it a config
option (but please make it dynamic with the _default_ just being
0/1 depending on whether it's UNIX/Windows, just so that UNIX
people can _test_ it easily).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-13 10:12:37 -08:00
Andy Whitcroft 93822c2239 short i/o: fix calls to write to use xwrite or write_in_full
We have a number of badly checked write() calls.  Often we are
expecting write() to write exactly the size we requested or fail,
this fails to handle interrupts or short writes.  Switch to using
the new write_in_full().  Otherwise we at a minimum need to check
for EINTR and EAGAIN, where this is appropriate use xwrite().

Note, the changes to config handling are much larger and handled
in the next patch in the sequence.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-08 15:44:47 -08:00
Junio C Hamano 85023577a8 simplify inclusion of system header files.
This is a mechanical clean-up of the way *.c files include
system header files.

 (1) sources under compat/, platform sha-1 implementations, and
     xdelta code are exempt from the following rules;

 (2) the first #include must be "git-compat-util.h" or one of
     our own header file that includes it first (e.g. config.h,
     builtin.h, pkt-line.h);

 (3) system headers that are included in "git-compat-util.h"
     need not be included in individual C source files.

 (4) "git-compat-util.h" does not have to include subsystem
     specific header files (e.g. expat.h).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-20 09:51:35 -08:00
Jonas Fonseca 095c424d08 Use PATH_MAX instead of MAXPATHLEN
According to sys/paramh.h it's a "BSD name" for values defined in
<limits.h>. Besides PATH_MAX seems to be more commonly used.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-26 17:52:58 -07:00
Peter Eriksen 8e44025925 Use blob_, commit_, tag_, and tree_type throughout.
This replaces occurences of "blob", "commit", "tag", and "tree",
where they're really used as type specifiers, which we already
have defined global constants for.

Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-04 00:11:19 -07:00
Shawn Pearce de84f99c12 Add --temp and --stage=all options to checkout-index.
Sometimes it is convient for a Porcelain to be able to checkout all
unmerged files in all stages so that an external merge tool can be
executed by the Porcelain or the end-user.  Using git-unpack-file
on each stage individually incurs a rather high penalty due to the
need to fork for each file version obtained.  git-checkout-index -a
--stage=all will now do the same thing, but faster.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-05 00:58:13 -08:00
Junio C Hamano 5f73076c1a "Assume unchanged" git
This adds "assume unchanged" logic, started by this message in the list
discussion recently:

	<Pine.LNX.4.64.0601311807470.7301@g5.osdl.org>

This is a workaround for filesystems that do not have lstat()
that is quick enough for the index mechanism to take advantage
of.  On the paths marked as "assumed to be unchanged", the user
needs to explicitly use update-index to register the object name
to be in the next commit.

You can use two new options to update-index to set and reset the
CE_VALID bit:

	git-update-index --assume-unchanged path...
	git-update-index --no-assume-unchanged path...

These forms manipulate only the CE_VALID bit; it does not change
the object name recorded in the index file.  Nor they add a new
entry to the index.

When the configuration variable "core.ignorestat = true" is set,
the index entries are marked with CE_VALID bit automatically
after:

 - update-index to explicitly register the current object name to the
   index file.

 - when update-index --refresh finds the path to be up-to-date.

 - when tools like read-tree -u and apply --index update the working
   tree file and register the current object name to the index file.

The flag is dropped upon read-tree that does not check out the index
entry.  This happens regardless of the core.ignorestat settings.

Index entries marked with CE_VALID bit are assumed to be
unchanged most of the time.  However, there are cases that
CE_VALID bit is ignored for the sake of safety and usability:

 - while "git-read-tree -m" or git-apply need to make sure
   that the paths involved in the merge do not have local
   modifications.  This sacrifices performance for safety.

 - when git-checkout-index -f -q -u -a tries to see if it needs
   to checkout the paths.  Otherwise you can never check
   anything out ;-).

 - when git-update-index --really-refresh (a new flag) tries to
   see if the index entry is up to date.  You can start with
   everything marked as CE_VALID and run this once to drop
   CE_VALID bit for paths that are modified.

Most notably, "update-index --refresh" honours CE_VALID and does
not actively stat, so after you modified a file in the working
tree, update-index --refresh would not notice until you tell the
index about it with "git-update-index path" or "git-update-index
--no-assume-unchanged path".

This version is not expected to be perfect.  I think diff
between index and/or tree and working files may need some
adjustment, and there probably needs other cases we should
automatically unmark paths that are marked to be CE_VALID.

But the basics seem to work, and ready to be tested by people
who asked for this feature.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-08 21:54:42 -08:00