Commit Graph

39 Commits (3bf25c23cd68dd9e99e54bf5443aebbb4a40b7d6)

Author SHA1 Message Date
Junio C Hamano f0c103b49c Merge branch 'rs/leave-base-name-in-name-field-of-tar' into maint
A tar archive created by "git archive" recorded a directory in a
way that made NetBSD's implementation of "tar" sometimes unhappy.

* rs/leave-base-name-in-name-field-of-tar:
  archive-tar: split long paths more carefully
2013-01-14 07:34:12 -08:00
René Scharfe 22f0dcd963 archive-tar: split long paths more carefully
The name field of a tar header has a size of 100 characters.  This limit
was extended long ago in a backward compatible way by providing the
additional prefix field, which can hold 155 additional characters.  The
actual path is constructed at extraction time by concatenating the prefix
field, a slash and the name field.

get_path_prefix() is used to determine which slash in the path is used as
the cutting point and thus which part of it is placed into the field
prefix and which into the field name.  It tries to cram as much into the
prefix field as possible.  (And only if we can't fit a path into the
provided 255 characters we use a pax extended header to store it.)

If a path is longer than 100 but shorter than 156 characters and ends
with a slash (i.e. is for a directory) then get_path_prefix() puts the
whole path in the prefix field and leaves the name field empty.  GNU tar
reconstructs the path without complaint, but the tar included with
NetBSD 6 does not: It reports the header to be invalid.

For compatibility with this version of tar, make sure to never leave the
name field empty.  In order to do that, trim the trailing slash from the
part considered as possible prefix, if it exists -- that way the last
path component (or more, but not less) will end up in the name field.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-05 22:56:36 -08:00
Junio C Hamano a5a46eb90f archive: ustar header checksum is computed unsigned
POSIX.1 (pax) is pretty clear on this:

  The chksum field shall be the ISO/IEC 646:1991 standard IRV
  representation of the octal value of the simple sum of all octets
  in the header logical record. Each octet in the header shall be
  treated as an unsigned value. These values shall be added to an
  unsigned integer, initialized to zero, the precision of which is
  not less than 17 bits. When calculating the checksum, the chksum
  field is treated as if it were all <space> characters.

so is GNU:

  http://www.gnu.org/software/tar/manual/html_node/Checksumming.html

Found by 7zip folks and reported by Rafał Mużyło.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-13 10:47:21 -07:00
Junio C Hamano b83cfa5949 Merge branch 'rs/archive-tree-in-tip-simplify'
By René Scharfe
* rs/archive-tree-in-tip-simplify:
  archive-tar: keep const in checksum calculation
  archive: simplify refname handling
2012-05-23 13:35:22 -07:00
René Scharfe bf38245be8 archive-tar: keep const in checksum calculation
For correctness, don't needlessly drop the const qualifier when casting.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-18 11:26:18 -07:00
Nguyễn Thái Ngọc Duy 5544049def archive-tar: stream large blobs to tar file
t5000 verifies output while t1050 makes sure the command always
respects core.bigfilethreshold

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03 10:22:56 -07:00
Nguyễn Thái Ngọc Duy 9cb513b798 archive: delegate blob reading to backend
archive-tar.c and archive-zip.c now perform conversion check, with
help of sha1_file_to_archive() from archive.c

This gives backends more freedom in dealing with (streaming) large
blobs.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03 10:22:56 -07:00
Nguyễn Thái Ngọc Duy 853907097a archive-tar: unindent write_tar_entry by one level
It's used to be

if (!sha1) {
  ...
} else if (!path) {
  ...
} else {
  ...
}

Now that the first two blocks are no-op. We can remove the if/else
skeleton and put the else block back by one indent level.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03 10:22:56 -07:00
Nguyễn Thái Ngọc Duy d240d41021 archive-tar: turn write_tar_entry into blob-writing only
Before this patch write_tar_entry() can:

 - write global header
   by write_global_extended_header() calling write_tar_entry with
   with both sha1 and path == NULL

 - write extended header for symlinks, by write_tar_entry() calling
   itself with sha1 != NULL and path == NULL

 - write a normal blob. In this case both sha1 and path are valid.

After this patch, the first two call sites are modified to write the
header without calling write_tar_entry(). The function is now for
writing blobs only. This simplifies handling when write_tar_entry()
learns about large blobs.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03 10:22:56 -07:00
Jeff King 7b97730b76 upload-archive: allow user to turn off filters
Some tar filters may be very expensive to run, so sites do
not want to expose them via upload-archive. This patch lets
users configure tar.<filter>.remote to turn them off.

By default, gzip filters are left on, as they are about as
expensive as creating zip archives.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:35 -07:00
Jeff King 0e804e0993 archive: provide builtin .tar.gz filter
This works exactly as if the user had configured it via:

  [tar "tgz"]
	command = gzip -cn
  [tar "tar.gz"]
	command = gzip -cn

but since it is so common, it's convenient to have it
builtin without the user needing to do anything.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:35 -07:00
Jeff King 767cf4579f archive: implement configurable tar filters
It's common to pipe the tar output produce by "git archive"
through gzip or some other compressor. Locally, this can
easily be done by using a shell pipe. When requesting a
remote archive, though, it cannot be done through the
upload-archive interface.

This patch allows configurable tar filters, so that one
could define a "tar.gz" format that automatically pipes tar
output through gzip.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:35 -07:00
Jeff King 4d7c989863 archive: pass archiver struct to write_archive callback
The current archivers are very static; when you are in the
write_tar_archive function, you know you are writing a tar.
However, to facilitate runtime-configurable archivers
that will share a common write function we need to tell the
function which archiver was used.

As a convenience, we also provide an opaque data pointer in
the archiver struct so that individual archivers can put
something useful there when they register themselves.
Technically they could just use the "name" field to look in
an internal map of names to data, but this is much simpler.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:35 -07:00
Jeff King 13e0f88d4a archive: refactor list of archive formats
Most of the tar and zip code was nicely split out into two
abstracted files which knew only about their specific
formats. The entry point to this code was a single "write
archive" function.

However, as these basic formats grow more complex (e.g., by
handling multiple file extensions and format names), a
static list of the entry point functions won't be enough.
Instead, let's provide a way for the tar and zip code to
tell the main archive code what they support by registering
archiver names and functions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:35 -07:00
Jeff King 40e7629194 archive-tar: don't reload default config options
We load our own tar-specific config, and then chain to
git_default_config. This is pointless, as our caller should
already have loaded the default config. It also introduces a
needless inconsistency with the zip archiver, which does not
look at the config files at all (and therefore relies on the
caller to have loaded config).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:34 -07:00
Junio C Hamano 3e1629f5a6 archive-tar.c: squelch a type mismatch warning
On some systems, giving a value of type time_t to printf "%lo" that
expects an unsigned long would give a type mismatch warning.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-08 23:57:29 -07:00
Brandon Casey f285a2d7ed Replace calls to strbuf_init(&foo, 0) with STRBUF_INIT initializer
Many call sites use strbuf_init(&foo, 0) to initialize local
strbuf variable "foo" which has not been accessed since its
declaration. These can be replaced with a static initialization
using the STRBUF_INIT macro which is just as readable, saves a
function call, and takes up fewer lines.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-12 12:36:19 -07:00
René Scharfe 8575ea559e archive: remove unused headers
Remove obsolete #includes.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-19 11:17:43 -07:00
René Scharfe 562e25abea archive: centralize archive entry writing
Add the exported function write_archive_entries() to archive.c, which uses
the new ability of read_tree_recursive() to pass a context pointer to its
callback in order to centralize previously duplicated code.

The new callback function write_archive_entry() does the work that every
archiver backend needs to do: loading file contents, entering subdirectories,
handling file attributes, constructing the full path of the entry.  All that
done, it calls the backend specific write_archive_entry_fn_t function.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-15 07:18:04 -07:00
René Scharfe d53fe8187c archive: add baselen member to struct archiver_args
Calculate the length of base and save it in a new member of struct
archiver_args.  This way we don't have to compute it in each of the
format backends.

Note: parse_archive_args() guarantees that ->base won't ever be NULL.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-15 07:18:04 -07:00
René Scharfe 671f070721 add context pointer to read_tree_recursive()
Add a pointer parameter to read_tree_recursive(), which is passed to the
callback function.  This allows callers of read_tree_recursive() to
share data with the callback without resorting to global variables.  All
current callers pass NULL.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-15 07:17:59 -07:00
René Scharfe 008d896df5 Teach new attribute 'export-ignore' to git-archive
Paths marked with this attribute are not output to git-archive
output.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-09 14:53:46 -07:00
Johannes Schindelin ef90d6d420 Provide git_config with a callback-data parameter
git_config() only had a function parameter, but no callback data
parameter.  This assumes that all callback functions only modify
global variables.

With this patch, every callback gets a void * parameter, and it is hoped
that this will help the libification effort.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-14 12:34:44 -07:00
René Scharfe ac7fa2776c git-archive: ignore prefix when checking file attribute
Ulrik Sverdrup noticed that git-archive doesn't correctly apply the attribute
export-subst when the option --prefix is given, too.

When it checked if a file has the attribute turned on, git-archive would try
to look up the full path -- including the prefix -- in .gitattributes.  That's
wrong, as the prefix doesn't need to have any relation to any existing
directories, tracked or not.

This patch makes git-archive ignore the prefix when looking up if value of the
attribute export-subst for a file.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-10 00:20:38 -07:00
Junio C Hamano cc1816b0d6 archive-tar.c: guard config parser from value=NULL
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-11 13:11:36 -08:00
Pierre Habouzit e03e05ff73 Fix the expansion pattern of the pseudo-static path buffer.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
2007-09-20 22:19:17 -07:00
Pierre Habouzit ba3ed09728 Now that cache.h needs strbuf.h, remove useless includes.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-16 17:30:03 -07:00
Pierre Habouzit f1696ee398 Strbuf API extensions and fixes.
* Add strbuf_rtrim to remove trailing spaces.
  * Add strbuf_insert to insert data at a given position.
  * Off-by one fix in strbuf_addf: strbuf_avail() does not counts the final
    \0 so the overflow test for snprintf is the strict comparison. This is
    not critical as the growth mechanism chosen will always allocate _more_
    memory than asked, so the second test will not fail. It's some kind of
    miracle though.
  * Add size extension hints for strbuf_init and strbuf_read. If 0, default
    applies, else:
      + initial buffer has the given size for strbuf_init.
      + first growth checks it has at least this size rather than the
        default 8192.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-10 12:48:24 -07:00
Junio C Hamano ddb95de33e Merge branch 'master' into ph/strbuf
* master:
  archive - leakfix for format_subst()
  Make --no-thin the default in git-push to save server resources
  fix doc for --compression argument to pack-objects
  git-tag -s must fail if gpg cannot sign the tag.
  git-svn: understand grafts when doing dcommit
  git-diff: don't squelch the new SHA1 in submodule diffs
  Define NO_MEMMEM on Darwin as it lacks the function
  git-svn: fix "Malformed network data" with svn:// servers
  (cvs|svn)import: Ask git-tag to overwrite old tags.
  git-rebase: fix -C option
  git-rebase: support --whitespace=<option>
  Documentation / grammer nit
  archive: rename attribute specfile to export-subst
  archive: specfile syntax change: "$Format:%PLCHLDR$" instead of just "%PLCHLDR" (take 2)
  add memmem()
  Remove unused function convert_sha1_file()
  archive: specfile support (--pretty=format: in archive files)
  Export format_commit_message()
2007-09-10 11:32:58 -07:00
Pierre Habouzit 7a604f16b7 Simplify strbuf uses in archive-tar.c using strbuf API
This is just cleaner way to deal with strbufs, using its API rather than
reinventing it in the module (e.g. strbuf_append_string is just the plain
strbuf_addstr function, and it was used to perform what strbuf_addch does
anyways).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-06 23:57:44 -07:00
Pierre Habouzit b449f4cfc9 Rework strbuf API and semantics.
The gory details are explained in strbuf.h. The change of semantics this
patch enforces is that the embeded buffer has always a '\0' character after
its last byte, to always make it a C-string. The offs-by-one changes are all
related to that very change.

  A strbuf can be used to store byte arrays, or as an extended string
library. The `buf' member can be passed to any C legacy string function,
because strbuf operations always ensure there is a terminating \0 at the end
of the buffer, not accounted in the `len' field of the structure.

  A strbuf can be used to generate a string/buffer whose final size is not
really known, and then "strbuf_detach" can be used to get the built buffer,
and keep the wrapping "strbuf" structure usable for further work again.

  Other interesting feature: strbuf_grow(sb, size) ensure that there is
enough allocated space in `sb' to put `size' new octets of data in the
buffer. It helps avoiding reallocating data for nothing when the problem the
strbuf helps to solve has a known typical size.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-06 23:57:44 -07:00
René Scharfe 8460b2fcd4 archive: specfile support (--pretty=format: in archive files)
Add support for a new attribute, specfile.  Files marked as being
specfiles are expanded by git-archive when they are written to an
archive.  It has no effect on worktree files.  The same placeholders
as those for the option --pretty=format: of git-log et al. can be
used.

The attribute is useful for creating auto-updating specfiles.  It is
limited by the underlying function format_commit_message(), though.
E.g. currently there is no placeholder for git-describe like output,
and expanded specfiles can't contain NUL bytes.  That can be fixed
in format_commit_message() later and will then benefit users of
git-log, too.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-03 16:46:16 -07:00
Martin Waitz 302b9282c9 rename dirlink to gitlink.
Unify naming of plumbing dirlink/gitlink concept:

git ls-files -z '*.[ch]' |
xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;'

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-21 23:34:54 -07:00
René Scharfe 5e6cfc80e2 git-archive: convert archive entries like checkouts do
As noted by Johan Herland, git-archive is a kind of checkout and needs
to apply any checkout filters that might be configured.

This patch adds the convenience function convert_sha1_file which returns
a buffer containing the object's contents, after converting, if necessary
(i.e. it's a combination of read_sha1_file and convert_to_working_tree).
Direct calls to read_sha1_file in git-archive are then replaced by calls
to convert_sha1_file.

Since convert_sha1_file expects its path argument to be NUL-terminated --
a convention it inherits from convert_to_working_tree -- the patch also
changes the path handling in archive-tar.c to always NUL-terminate the
string.  It used to solely rely on the len field of struct strbuf before.

archive-zip.c already NUL-terminates the path and thus needs no such
change.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-18 16:36:45 -07:00
Lars Hjemli 02851e0b9e git-archive: don't die when repository uses subprojects
Both archive-tar and archive-zip needed to be taught about subprojects.
The tar function died when trying to read the subproject commit object,
while the zip function reported "unsupported file mode".

This fixes both by representing the subproject as an empty directory.

Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-12 09:35:07 -07:00
Nicolas Pitre 21666f1aae convert object type handling from a string to a number
We currently have two parallel notation for dealing with object types
in the code: a string and a numerical value.  One of them is obviously
redundent, and the most used one requires more stack space and a bunch
of strcmp() all over the place.

This is an initial step for the removal of the version using a char array
found in object reading code paths.  The patch is unfortunately large but
there is no sane way to split it in smaller parts without breaking the
system.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27 01:34:21 -08:00
René Scharfe f08b3b0e2e Set default "tar" umask to 002 and owner.group to root.root
In order to make the generated tar files more friendly to users who
extract them as root using GNU tar and its implied -p option, change
the default umask to 002 and change the owner name and group name to
root.  This ensures that a) the extracted files and directories are
not world-writable and b) that they belong to user and group root.

Before they would have been assigned to a user and/or group named
git if it existed.  This also answers the question in the removed
comment: uid=0, gid=0, uname=root, gname=root is exactly what we
want.

Normal users who let tar apply their umask while extracting are
only affected if their umask allowed the world to change their
files (e.g. a umask of zero).  This case is so unlikely and strange
that we don't need to support it.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-06 10:37:14 -08:00
Junio C Hamano 85023577a8 simplify inclusion of system header files.
This is a mechanical clean-up of the way *.c files include
system header files.

 (1) sources under compat/, platform sha-1 implementations, and
     xdelta code are exempt from the following rules;

 (2) the first #include must be "git-compat-util.h" or one of
     our own header file that includes it first (e.g. config.h,
     builtin.h, pkt-line.h);

 (3) system headers that are included in "git-compat-util.h"
     need not be included in individual C source files.

 (4) "git-compat-util.h" does not have to include subsystem
     specific header files (e.g. expat.h).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-20 09:51:35 -08:00
Rene Scharfe 3d74982f0b git-tar-tree: Move code for git-archive --format=tar to archive-tar.c
This patch doesn't change any functionality, it only moves code around.  It
makes seeing the few remaining lines of git-tar-tree code easier. ;-)

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-24 19:55:08 -07:00