Commit Graph

1082 Commits (2bf3501150145d1f05678c20ab8e8d66f849851f)

Author SHA1 Message Date
Junio C Hamano 988f98f61f Merge branch 'jx/clean-interactive'
Add "interactive" mode to "git clean".

The early part to refactor relative path related helper functions
looked sensible.

* jx/clean-interactive:
  test: run testcases with POSIX absolute paths on Windows
  test: add t7301 for git-clean--interactive
  git-clean: add documentation for interactive git-clean
  git-clean: add ask each interactive action
  git-clean: add select by numbers interactive action
  git-clean: add filter by pattern interactive action
  git-clean: use a git-add-interactive compatible UI
  git-clean: add colors to interactive git-clean
  git-clean: show items of del_list in columns
  git-clean: add support for -i/--interactive
  git-clean: refactor git-clean into two phases
  write_name{_quoted_relative,}(): remove redundant parameters
  quote_path_relative(): remove redundant parameter
  quote.c: substitute path_relative with relative_path
  path.c: refactor relative_path(), not only strip prefix
  test: add test cases for relative_path
2013-07-22 11:24:11 -07:00
Junio C Hamano c714f9fd8a Merge branch 'hv/config-from-blob'
Allow configuration data to be read from in-tree blob objects,
which would help working in a bare repository and submodule
updates.

* hv/config-from-blob:
  do not die when error in config parsing of buf occurs
  teach config --blob option to parse config from database
  config: make parsing stack struct independent from actual data source
  config: drop cf validity check in get_next_char()
  config: factor out config file stack management
2013-07-22 11:24:09 -07:00
Junio C Hamano d3aeb31dc4 Merge branch 'nd/const-struct-cache-entry'
* nd/const-struct-cache-entry:
  Convert "struct cache_entry *" to "const ..." wherever possible
2013-07-22 11:24:01 -07:00
Junio C Hamano cb29dfde48 Merge branch 'tr/protect-low-3-fds'
When "git" is spawned in such a way that any of the low 3 file
descriptors is closed, our first open() may yield file descriptor 2,
and writing error message to it would screw things up in a big way.

* tr/protect-low-3-fds:
  git: ensure 0/1/2 are open in main()
  daemon/shell: refactor redirection of 0/1/2 from /dev/null
2013-07-22 11:23:35 -07:00
Junio C Hamano 802f878b86 Merge branch 'jk/in-pack-size-measurement'
"git cat-file --batch-check=<format>" is added, primarily to allow
on-disk footprint of objects in packfiles (often they are a lot
smaller than their true size, when expressed as deltas) to be
reported.

* jk/in-pack-size-measurement:
  pack-revindex: radix-sort the revindex
  pack-revindex: use unsigned to store number of objects
  cat-file: split --batch input lines on whitespace
  cat-file: add %(objectsize:disk) format atom
  cat-file: add --batch-check=<format>
  cat-file: refactor --batch option parsing
  cat-file: teach --batch to stream blob objects
  t1006: modernize output comparisons
  teach sha1_object_info_extended a "disk_size" query
  zero-initialize object_info structs
2013-07-18 12:59:41 -07:00
Thomas Rast 1d999ddd1d daemon/shell: refactor redirection of 0/1/2 from /dev/null
Both daemon.c and shell.c contain logic to open FDs 0/1/2 from
/dev/null if they are not already open.  Move the function in daemon.c
to setup.c and use it in shell.c, too.

While there, remove a 'not' that inverted the meaning of the comment.
The point is indeed to *avoid* messing up.

Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-17 12:50:34 -07:00
Heiko Voigt 1bc888193e teach config --blob option to parse config from database
This can be used to read configuration values directly from git's
database. For example it is useful for reading to be checked out
.gitmodules files directly from the database.

Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-12 09:34:57 -07:00
Nguyễn Thái Ngọc Duy 9c5e6c802c Convert "struct cache_entry *" to "const ..." wherever possible
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is

 - diff-lib.c: setting CE_UPTODATE

 - name-hash.c: setting CE_HASHED

 - preload-index.c, read-cache.c, unpack-trees.c and
   builtin/update-index: obvious

 - entry.c: write_entry() may refresh the checked out entry via
   fill_stat_cache_info(). This causes "non-const struct cache_entry
   *" in builtin/apply.c, builtin/checkout-index.c and
   builtin/checkout.c

 - builtin/ls-files.c: --with-tree changes stagemask and may set
   CE_UPDATE

Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.

So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.

The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:

    diff --git a/cache.h b/cache.h
    index 430d021..1692891 100644
    --- a/cache.h
    +++ b/cache.h
    @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
     #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)

     struct index_state {
    -	struct cache_entry **cache;
    +	const struct cache_entry **cache;
     	unsigned int version;
     	unsigned int cache_nr, cache_alloc, cache_changed;
     	struct string_list *resolve_undo;

will help quickly identify them without bogus warnings.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-09 09:12:48 -07:00
Jeff King 161f00e708 teach sha1_object_info_extended a "disk_size" query
Using sha1_object_info_extended, a caller can find out the
type of an object, its size, and information about where it
is stored. In addition to the object's "true" size, it can
also be useful to know the size that the object takes on
disk (e.g., to generate statistics about which refs consume
space).

This patch adds a "disk_sizep" field to "struct object_info",
and fills it in during sha1_object_info_extended if it is
non-NULL.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-07 10:53:22 -07:00
Junio C Hamano 079424a2cf Merge branch 'mh/ref-races'
"git pack-refs" that races with new ref creation or deletion have
been susceptible to lossage of refs under right conditions, which
has been tightened up.

* mh/ref-races:
  for_each_ref: load all loose refs before packed refs
  get_packed_ref_cache: reload packed-refs file when it changes
  add a stat_validity struct
  Extract a struct stat_data from cache_entry
  packed_ref_cache: increment refcount when locked
  do_for_each_entry(): increment the packed refs cache refcount
  refs: manage lifetime of packed refs cache via reference counting
  refs: implement simple transactions for the packed-refs file
  refs: wrap the packed refs cache in a level of indirection
  pack_refs(): split creation of packed refs and entry writing
  repack_without_ref(): split list curation and entry writing
2013-06-30 15:40:05 -07:00
Jiang Xin e02ca72f70 path.c: refactor relative_path(), not only strip prefix
Original design of relative_path() is simple, just strip the prefix
(*base) from the absolute path (*abs).

In most cases, we need a real relative path, such as: ../foo,
../../bar.  That's why there is another reimplementation
(path_relative()) in quote.c.

Borrow some codes from path_relative() in quote.c to refactor
relative_path() in path.c, so that it could return real relative
path, and user can reuse this function without reimplementing
his/her own.  The function path_relative() in quote.c will be
substituted, and I would use the new relative_path() function when
implementing the interactive git-clean later.

Different results for relative_path() before and after this refactor:

    abs path  base path  relative (original)  relative (refactor)
    ========  =========  ===================  ===================
    /a/b      /a/b       .                    ./
    /a/b/     /a/b       .                    ./
    /a        /a/b/      /a                   ../
    /         /a/b/      /                    ../../
    /a/c      /a/b/      /a/c                 ../c
    /x/y      /a/b/      /x/y                 ../../x/y

    a/b/      a/b/       .                    ./
    a/b/      a/b        .                    ./
    a         a/b        a                    ../
    x/y       a/b/       x/y                  ../../x/y
    a/c       a/b        a/c                  ../c

    (empty)   (null)     (empty)              ./
    (empty)   (empty)    (empty)              ./
    (empty)   /a/b       (empty)              ./
    (null)    (null)     (null)               ./
    (null)    (empty)    (null)               ./
    (null)    /a/b       (segfault)           ./

You may notice that return value "." has been changed to "./".
It is because:

 * Function quote_path_relative() in quote.c will show the relative
   path as "./" if abs(in) and base(prefix) are the same.

 * Function relative_path() is called only once (in setup.c), and
   it will be OK for the return value as "./" instead of ".".

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-26 09:59:00 -07:00
Junio C Hamano 8f0c843aab Merge branch 'nd/traces'
* nd/traces:
  git.txt: document GIT_TRACE_PACKET
  core: use env variable instead of config var to turn on logging pack access
2013-06-20 16:02:28 -07:00
Michael Haggerty 3861253224 add a stat_validity struct
It can sometimes be useful to know whether a path in the
filesystem has been updated without going to the work of
opening and re-reading its content. We trust the stat()
information on disk already to handle index updates, and we
can use the same trick here.

This patch introduces a "stat_validity" struct which
encapsulates the concept of checking the stat-freshness of a
file. It is implemented on top of "struct stat_data" to
reuse the logic about which stat entries to trust for a
particular platform, but hides the complexity behind two
simple functions: check and update.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-20 15:50:17 -07:00
Michael Haggerty c21d39d7c7 Extract a struct stat_data from cache_entry
Add public functions fill_stat_data() and match_stat_data() to work
with it.  This infrastructure will later be used to check the validity
of other types of file.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-20 15:50:17 -07:00
Junio C Hamano dd261b1727 Merge branch 'rs/unpack-trees-plug-leak'
* rs/unpack-trees-plug-leak:
  unpack-trees: free cache_entry array members for merges
  diff-lib, read-tree, unpack-trees: mark cache_entry array paramters const
  diff-lib, read-tree, unpack-trees: mark cache_entry pointers const
  unpack-trees: create working copy of merge entry in merged_entry
  unpack-trees: factor out dup_entry
  read-cache: mark cache_entry pointers const
  cache: mark cache_entry pointers const
2013-06-11 13:30:05 -07:00
Nguyễn Thái Ngọc Duy b12ca9631f core: use env variable instead of config var to turn on logging pack access
5f44324 (core: log offset pack data accesses happened - 2011-07-06)
provides a way to observe pack access patterns via a config
switch. Setting an environment variable looks more obvious than a
config var, especially when you just need to _observe_, and more
inline with other tracing knobs we have.

Document it as it may be useful for remote troubleshooting.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-09 16:07:50 -07:00
Junio C Hamano db400949b3 Merge branch 'jk/fetch-always-update-tracking'
"git fetch origin master" unlike "git fetch origin" or "git fetch"
did not update "refs/remotes/origin/master"; this was an early
design decision to keep the update of remote tracking branches
predictable, but in practice it turns out that people find it more
convenient to opportunisticly update them whenever we have a chance,
and we have been updating them when we run "git push" which already
breaks the original "predictability" anyway.

Now such a fetch does update refs/remotes/origin/master.

* jk/fetch-always-update-tracking:
  fetch: don't try to update unfetched tracking refs
  fetch: opportunistically update tracking refs
  refactor "ref->merge" flag
  fetch/pull doc: untangle meaning of bare <ref>
  t5510: start tracking-ref tests from a known state
2013-06-02 15:57:26 -07:00
René Scharfe 21a6b9fa42 read-cache: mark cache_entry pointers const
ie_match_stat and ie_modified only derefence their struct cache_entry
pointers for reading.  Add const to the parameter declaration here and
do the same for the static helper function used by them, as it's the
same there as well.  This allows callers to pass in const pointers.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02 15:31:12 -07:00
René Scharfe 20d142b48c cache: mark cache_entry pointers const
Add const for pointers that are only dereferenced for reading by the
inline functions copy_cache_entry and ce_mode_from_stat.  This allows
callers to pass in const pointers.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02 15:31:12 -07:00
Junio C Hamano 3e1e7624aa Merge branch 'jc/prune-all'
We used the approxidate() parser for "--expire=<timestamp>" options
of various commands, but it is better to treat --expire=all and
--expire=now a bit more specially than using the current timestamp.
Update "git gc" and "git reflog" with a new parsing function for
expiry dates.

* jc/prune-all:
  prune: introduce OPT_EXPIRY_DATE() and use it
  api-parse-options.txt: document "no-" for non-boolean options
  git-gc.txt, git-reflog.txt: document new expiry options
  date.c: add parse_expiry_date()
2013-05-29 14:23:04 -07:00
Jeff King 900f2814b8 refactor "ref->merge" flag
Each "struct ref" has a boolean flag that is set by the
fetch code to determine whether the ref should be marked as
"not-for-merge" or not when we write it out to FETCH_HEAD.

It would be useful to turn this boolean into a tri-state,
with the third state meaning "do not bother writing it out
to FETCH_HEAD at all". That would let us add extra refs to
the set of refs to be stored (e.g., to store copies of
things we fetched) without impacting FETCH_HEAD.

This patch turns it into an enum that covers the tri-state
case, and hopefully makes the code more explicit and easier
to read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-12 15:23:48 -07:00
Junio C Hamano 4b35b007a6 Merge branch 'lf/read-blob-data-from-index'
Reduce duplicated code between convert.c and attr.c.

* lf/read-blob-data-from-index:
  convert.c: remove duplicate code
  read_blob_data_from_index(): optionally return the size of blob data
  attr.c: extract read_index_data() as read_blob_data_from_index()
2013-04-21 18:39:45 -07:00
Junio C Hamano de91daf5e6 Merge branch 'jn/add-2.0-u-A-sans-pathspec' (early part)
In Git 2.0, "git add -u" and "git add -A" without any pathspec will
update the index for all paths, including those outside the current
directory, making it more consistent with "commit -a".  To help the
migration pain, a warning is issued when the differences between the
current behaviour and the upcoming behaviour matters, i.e. when the
user has local changes outside the current directory.

* 'jn/add-2.0-u-A-sans-pathspec' (early part):
  add -A: only show pathless 'add -A' warning when changes exist outside cwd
  add -u: only show pathless 'add -u' warning when changes exist outside cwd
  add: make warn_pathless_add() a no-op after first call
  add: add a blank line at the end of pathless 'add [-u|-A]' warning
  add: make pathless 'add [-u|-A]' warning a file-global function
2013-04-19 13:37:36 -07:00
Junio C Hamano 3d27b9b005 date.c: add parse_expiry_date()
"git reflog --expire=all" tries to expire reflog entries up to the
current second, because the approxidate() parser gives the current
timestamp for anything it does not understand (and it does not know
what time "all" means).  When the user tells us to expire "all" (or
set the expiration time to "now"), the user wants to remove all the
reflog entries (no reflog entry should record future time).

Just set it to ULONG_MAX and to let everything that is older that
timestamp expire.

While at it, allow "now" to be treated the same way for callers that
parse expiry date timestamp with this function.  Also use an error
reporting version of approxidate() to report misspelled date.  When
the user says e.g. "--expire=mnoday" to delete entries two days or
older on Wednesday, we wouldn't want the "unknown, default to now"
logic to kick in.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-17 16:03:56 -07:00
Lukas Fleischer ff36682505 read_blob_data_from_index(): optionally return the size of blob data
This allows for optionally getting the size of the returned data and
will be used in a follow-up patch.

Signed-off-by: Lukas Fleischer <git@cryptocrack.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-17 09:51:47 -07:00
Lukas Fleischer 29fb37b272 attr.c: extract read_index_data() as read_blob_data_from_index()
Extract the read_index_data() function from attr.c and move it to
read-cache.c; rename it to read_blob_data_from_index() and update
the function signature of it to align better with index/cache API
functions.

This allows for reusing the function in convert.c later.

Signed-off-by: Lukas Fleischer <git@cryptocrack.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-17 09:49:11 -07:00
Junio C Hamano e65cdde454 Merge branch 'tb/shared-perm'
Simplifies adjust_shared_perm() implementation.

* tb/shared-perm:
  path.c: optimize adjust_shared_perm()
  path.c: simplify adjust_shared_perm()
2013-04-07 14:33:11 -07:00
Torsten Bögershausen 3a429d3b8d path.c: simplify adjust_shared_perm()
All calls to set_shared_perm() use mode == 0, so simplify the
function.

Because all callers use the macro adjust_shared_perm(path) from
cache.h to call this function, convert it to a proper function,
losing set_shared_perm().

Since path.c has much more functions than just mkpath() these days,
drop the stale comment about it.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-05 12:37:55 -07:00
Jonathan Nieder 71c7b0538f add -u: only show pathless 'add -u' warning when changes exist outside cwd
A common workflow in large projects is to chdir into a subdirectory of
interest and only do work there:

	cd src
	vi foo.c
	make test
	git add -u
	git commit

The upcoming change to 'git add -u' behavior would not affect such a
workflow: when the only changes present are in the current directory,
'git add -u' will add all changes, and whether that happens via an
implicit "." or implicit ":/" parameter is an unimportant
implementation detail.

The warning about use of 'git add -u' with no pathspec is annoying
because it seemingly serves no purpose in this case.  So suppress the
warning unless there are changes outside the cwd that are not being
added.

A previous version of this patch ran two I/O-intensive diff-files
passes: one to find changes outside the cwd, and another to find
changes to add to the index within the cwd.  This version runs one
full-tree diff and decides for each change whether to add it or warn
and suppress it in update_callback.  As a result, even on very large
repositories "git add -u" will not be significantly slower than the
future default behavior ("git add -u :/"), and the slowdown relative
to "git add -u ." should be a useful clue to users of such
repositories to get into the habit of explicitly passing '.'.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-03 11:34:22 -07:00
Junio C Hamano 97fefaf6d3 Merge branch 'nd/checkout-paths-reduce-match-pathspec-calls'
Consolidate repeated pathspec matches on the same paths, while
fixing a bug in "git checkout dir/" code started from an unmerged
index.

* nd/checkout-paths-reduce-match-pathspec-calls:
  checkout: avoid unnecessary match_pathspec calls
2013-04-03 09:34:00 -07:00
Junio C Hamano c81e2c61b3 Merge branch 'kb/name-hash' into maint-1.8.1
* kb/name-hash:
  name-hash.c: fix endless loop with core.ignorecase=true
2013-04-03 08:44:54 -07:00
Junio C Hamano c044bed8f0 Merge branch 'kb/name-hash'
The code to keep track of what directory names are known to Git on
platforms with case insensitive filesystems can get confused upon
a hash collision between these pathnames and looped forever.

* kb/name-hash:
  name-hash.c: fix endless loop with core.ignorecase=true
2013-04-01 08:59:53 -07:00
Junio C Hamano e013bdab0f Merge branch 'jk/pkt-line-cleanup'
Clean up pkt-line API, implementation and its callers to make them
more robust.

* jk/pkt-line-cleanup:
  do not use GIT_TRACE_PACKET=3 in tests
  remote-curl: always parse incoming refs
  remote-curl: move ref-parsing code up in file
  remote-curl: pass buffer straight to get_remote_heads
  teach get_remote_heads to read from a memory buffer
  pkt-line: share buffer/descriptor reading implementation
  pkt-line: provide a LARGE_PACKET_MAX static buffer
  pkt-line: move LARGE_PACKET_MAX definition from sideband
  pkt-line: teach packet_read_line to chomp newlines
  pkt-line: provide a generic reading function with options
  pkt-line: drop safe_write function
  pkt-line: move a misplaced comment
  write_or_die: raise SIGPIPE when we get EPIPE
  upload-archive: use argv_array to store client arguments
  upload-archive: do not copy repo name
  send-pack: prefer prefixcmp over memcmp in receive_status
  fetch-pack: fix out-of-bounds buffer offset in get_ack
  upload-pack: remove packet debugging harness
  upload-pack: do not add duplicate objects to shallow list
  upload-pack: use get_sha1_hex to parse "shallow" lines
2013-04-01 08:59:37 -07:00
Junio C Hamano e96a3b3649 Merge branch 'rs/archive-zip-raw-compression'
* rs/archive-zip-raw-compression:
  archive-zip: use deflateInit2() to ask for raw compressed data
2013-03-27 09:28:53 -07:00
Nguyễn Thái Ngọc Duy e721c1544f checkout: avoid unnecessary match_pathspec calls
In checkout_paths() we do this

 - for all updated items, call match_pathspec
 - for all items, call match_pathspec (inside unmerge_cache)
 - for all items, call match_pathspec (for showing "path .. is unmerged)
 - for updated items, call match_pathspec and update paths

That's a lot of duplicate match_pathspec(s) and the function is not
exactly cheap to be called so many times, especially on large indexes.
This patch makes it call match_pathspec once per updated index entry,
save the result in ce_flags and reuse the results in the following
loops.

The changes in 0a1283b (checkout $tree $path: do not clobber local
changes in $path not in $tree - 2011-09-30) limit the affected paths
to ones we read from $tree. We do not do anything to other modified
entries in this case, so the "for all items" above could be modified
to "for all updated items". But..

The command's behavior now is modified slightly: unmerged entries that
match $path, but not updated by $tree, are now NOT touched.  Although
this should be considered a bug fix, not a regression. A new test is
added for this change.

And while at there, free ps_matched after use.

The following command is tested on webkit, 215k entries. The pattern
is chosen mainly to make match_pathspec sweat:

git checkout -- "*[a-zA-Z]*[a-zA-Z]*[a-zA-Z]*"

        before      after
real    0m3.493s    0m2.737s
user    0m2.239s    0m1.586s
sys     0m1.252s    0m1.151s

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-27 08:53:15 -07:00
Junio C Hamano fb3b7b1f95 Merge branch 'jk/alias-in-bare'
An aliased command spawned from a bare repository that does not say
it is bare with "core.bare = yes" is treated as non-bare by mistake.

* jk/alias-in-bare:
  setup: suppress implicit "." work-tree for bare repos
  environment: add GIT_PREFIX to local_repo_env
  cache.h: drop LOCAL_REPO_ENV_SIZE
2013-03-25 14:00:44 -07:00
Junio C Hamano f5715de54a Merge branch 'nd/count-garbage'
"git count-objects -v" did not count leftover temporary packfiles
and other kinds of garbage.

* nd/count-garbage:
  count-objects: report how much disk space taken by garbage files
  count-objects: report garbage files in pack directory too
  sha1_file: reorder code in prepare_packed_git_one()
  git-count-objects.txt: describe each line in -v output
2013-03-21 14:02:34 -07:00
Junio C Hamano e4e1c54990 Merge branch 'jc/fetch-raw-sha1'
Allows requests to fetch objects at any tip of refs (including
hidden ones).  It seems that there may be use cases even outside
Gerrit (e.g. $gmane/215701).

* jc/fetch-raw-sha1:
  fetch: fetch objects by their exact SHA-1 object names
  upload-pack: optionally allow fetching from the tips of hidden refs
  fetch: use struct ref to represent refs to be fetched
  parse_fetch_refspec(): clarify the codeflow a bit
2013-03-21 14:02:27 -07:00
René Scharfe c3c2e1a09b archive-zip: use deflateInit2() to ask for raw compressed data
We use the function git_deflate_init() -- which wraps the zlib function
deflateInit() -- to initialize compression of ZIP file entries.  This
results in compressed data prefixed with a two-bytes long header and
followed by a four-bytes trailer.  ZIP file entries consist of ZIP
headers and raw compressed data instead, so we remove the zlib wrapper
before writing the result.

We can ask zlib for the the raw compressed data without the unwanted
parts in the first place by using deflateInit2() and specifying a
negative number of bits to size the window.  For that purpose, factor
out the function do_git_deflate_init() and add git_deflate_init_raw(),
which wraps it.  Then use the latter in archive-zip.c and get rid of
the code that stripped the zlib header and trailer.

Also rename the helper function zlib_deflate() to zlib_deflate_raw()
to reflect the change.

Thus we avoid generating data that we throw away anyway, the code
becomes shorter and some magic constants are removed.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-16 22:07:02 -07:00
Jeff King 2cd83d10bb setup: suppress implicit "." work-tree for bare repos
If an explicit GIT_DIR is given without a working tree, we
implicitly assume that the current working directory should
be used as the working tree. E.g.,:

  GIT_DIR=/some/repo.git git status

would compare against the cwd.

Unfortunately, we fool this rule for sub-invocations of git
by setting GIT_DIR internally ourselves. For example:

  git init foo
  cd foo/.git
  git status ;# fails, as we expect
  git config alias.st status
  git status ;# does not fail, but should

What happens is that we run setup_git_directory when doing
alias lookup (since we need to see the config), set GIT_DIR
as a result, and then leave GIT_WORK_TREE blank (because we
do not have one). Then when we actually run the status
command, we do setup_git_directory again, which sees our
explicit GIT_DIR and uses the cwd as an implicit worktree.

It's tempting to argue that we should be suppressing that
second invocation of setup_git_directory, as it could use
the values we already found in memory. However, the problem
still exists for sub-processes (e.g., if "git status" were
an external command).

You can see another example with the "--bare" option, which
sets GIT_DIR explicitly. For example:

  git init foo
  cd foo/.git
  git status ;# fails
  git --bare status ;# does NOT fail

We need some way of telling sub-processes "even though
GIT_DIR is set, do not use cwd as an implicit working tree".
We could do it by putting a special token into
GIT_WORK_TREE, but the obvious choice (an empty string) has
some portability problems.

Instead, we add a new boolean variable, GIT_IMPLICIT_WORK_TREE,
which suppresses the use of cwd as a working tree when
GIT_DIR is set. We trigger the new variable when we know we
are in a bare setting.

The variable is left intentionally undocumented, as this is
an internal detail (for now, anyway). If somebody comes up
with a good alternate use for it, and once we are confident
we have shaken any bugs out of it, we can consider promoting
it further.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-08 14:02:40 -08:00
Jeff King a6f7f9a325 environment: add GIT_PREFIX to local_repo_env
The GIT_PREFIX variable is set based on our location within
the working tree. It should therefore be cleared whenever
GIT_WORK_TREE is cleared.

In practice, this doesn't cause any bugs, because none of
the sub-programs we invoke with local_repo_env cleared
actually care about GIT_PREFIX. But this is the right thing
to do, and future proofs us against that assumption changing.

While we're at it, let's define a GIT_PREFIX_ENVIRONMENT
macro; this avoids repetition of the string literal, which
can help catch any spelling mistakes in the code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-08 14:02:31 -08:00
Jeff King 2163e5dbb4 cache.h: drop LOCAL_REPO_ENV_SIZE
We keep a static array of variables that should be cleared
when invoking a sub-process on another repo. We statically
size the array with the LOCAL_REPO_ENV_SIZE macro so that
any readers do not have to count it themselves.

As it turns out, no readers actually use the macro, and it
creates a maintenance headache, as modifications to the
array need to happen in two places (one to add the new
element, and another to bump the size).

Since it's NULL-terminated, we can just drop the size macro
entirely. While we're at it, we'll clean up some comments
around it, and add a new mention of it at the top of the
list of environment variable macros. Even though
local_repo_env is right below that list, it's easy to miss,
and additions to that list should consider local_repo_env.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-08 07:55:54 -08:00
Karsten Blees 2092678cd5 name-hash.c: fix endless loop with core.ignorecase=true
With core.ignorecase=true, name-hash.c builds a case insensitive index of
all tracked directories. Currently, the existing cache entry structures are
added multiple times to the same hashtable (with different name lengths and
hash codes). However, there's only one dir_next pointer, which gets
completely messed up in case of hash collisions. In the worst case, this
causes an endless loop if ce == ce->dir_next (see t7062).

Use a separate hashtable and separate structures for the directory index
so that each directory entry has its own next pointer. Use reference
counting to track which directory entry contains files.

There are only slight changes to the name-hash.c API:
- new free_name_hash() used by read_cache.c::discard_index()
- remove_name_hash() takes an additional index_state parameter
- index_name_exists() for a directory (trailing '/') may return a cache
  entry that has been removed (CE_UNHASHED). This is not a problem as the
  return value is only used to check if the directory exists (dir.c) or to
  normalize casing of directory names (read-cache.c).

Getting rid of cache_entry.dir_next reduces memory consumption, especially
with core.ignorecase=false (which doesn't use that member at all).

With core.ignorecase=true, building the directory index is slightly faster
as we add / check the parent directory first (instead of going through all
directory levels for each file in the index). E.g. with WebKit (~200k
files, ~7k dirs), time spent in lazy_init_name_hash is reduced from 176ms
to 130ms.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-27 23:29:04 -08:00
Jeff King 85edf4f58b teach get_remote_heads to read from a memory buffer
Now that we can read packet data from memory as easily as a
descriptor, get_remote_heads can take either one as a
source. This will allow further refactoring in remote-curl.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-24 00:17:38 -08:00
Nguyễn Thái Ngọc Duy 543c5caa6c count-objects: report garbage files in pack directory too
prepare_packed_git_one() is modified to allow count-objects to hook a
report function to so we don't need to duplicate the pack searching
logic in count-objects.c. When report_pack_garbage is NULL, the
overhead is insignificant.

The garbage is reported with warning() instead of error() in packed
garbage case because it's not an error to have garbage. Loose garbage
is still reported as errors and will be converted to warnings later.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-15 08:13:13 -08:00
Junio C Hamano f2db854d24 fetch: use struct ref to represent refs to be fetched
Even though "git fetch" has full infrastructure to parse refspecs to
be fetched and match them against the list of refs to come up with
the final list of refs to be fetched, the list of refs that are
requested to be fetched were internally converted to a plain list of
strings at the transport layer and then passed to the underlying
fetch-pack driver.

Stop this conversion and instead pass around an array of refs.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-07 13:53:59 -08:00
Junio C Hamano 370855e967 Merge branch 'jc/push-reject-reasons'
Improve error and advice messages given locally when "git push"
refuses when it cannot compute fast-forwardness by separating these
cases from the normal "not a fast-forward; merge first and push
again" case.

* jc/push-reject-reasons:
  push: finishing touches to explain REJECT_ALREADY_EXISTS better
  push: introduce REJECT_FETCH_FIRST and REJECT_NEEDS_FORCE
  push: further simplify the logic to assign rejection reason
  push: further clean up fields of "struct ref"
2013-02-04 10:25:04 -08:00
Junio C Hamano 099ba556d0 Merge branch 'jk/config-parsing-cleanup'
Configuration parsing for tar.* configuration variables were
broken. Introduce a new config-keyname parser API to make the
callers much less error prone.

* jk/config-parsing-cleanup:
  reflog: use parse_config_key in config callback
  help: use parse_config_key for man config
  submodule: simplify memory handling in config parsing
  submodule: use parse_config_key when parsing config
  userdiff: drop parse_driver function
  convert some config callbacks to parse_config_key
  archive-tar: use parse_config_key when parsing config
  config: add helper function for parsing key names
2013-02-04 10:24:50 -08:00
Junio C Hamano 149a4211a4 Merge branch 'jc/custom-comment-char'
Allow a configuration variable core.commentchar to customize the
character used to comment out the hint lines in the edited text from
the default '#'.

* jc/custom-comment-char:
  Allow custom "comment char"
2013-02-04 10:23:49 -08:00
Junio C Hamano 070c57df42 Merge branch 'rr/minimal-stat'
Some reimplementations of Git does not write all the stat info back
to the index due to their implementation limitations (e.g. jgit
running on Java).  A configuration option can tell Git to ignore
changes to most of the stat fields and only pay attention to mtime
and size, which these implementations can reliably update.  This
avoids excessive revalidation of contents.

* rr/minimal-stat:
  Enable minimal stat checking
2013-01-30 08:53:02 -08:00