
builtin/clone.c: disallow `--local` clones with symlinks

When cloning a repository with `--local`, Git relies on making either a
hardlink to or a copy of every file in the "objects" directory of the source
repository. This is done through the callpath `cmd_clone()` ->
`clone_local()` -> `copy_or_link_directory()`.

The way this optimization works is by enumerating every file and
directory recursively in the source repository's `$GIT_DIR/objects`
directory, and then either making a copy or hardlink of each file. The
only exception to this rule is when copying the "alternates" file, in
which case paths are rewritten to be absolute before writing a new
"alternates" file in the destination repo.
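
As a rough illustration of the per-file step only, here is a minimal
sketch of a link-then-copy fallback. `link_or_copy()` is a hypothetical
helper written for this note, not Git's code, and it omits the directory
walk and the "alternates" rewriting described above.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not Git's implementation): try to hardlink `src`
 * to `dst`; if link() fails (for example across filesystems), fall back
 * to a byte-for-byte copy. Returns 0 on success, -1 on error.
 */
int link_or_copy(const char *src, const char *dst)
{
	char buf[8192];
	ssize_t n = 0;
	int in, out, ret = 0;

	assert(src != NULL && dst != NULL);

	if (link(src, dst) == 0)
		return 0;

	/* open() follows symlinks, so a copy reads the link target. */
	in = open(src, O_RDONLY);
	if (in < 0)
		return -1;
	/* O_EXCL: refuse to overwrite an existing destination. */
	out = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0666);
	if (out < 0) {
		close(in);
		return -1;
	}

	while (ret == 0 && (n = read(in, buf, sizeof(buf))) > 0) {
		char *p = buf;

		/* write() may be partial, so loop until the chunk is out. */
		while (n > 0) {
			ssize_t w = write(out, p, (size_t)n);
			if (w < 0) {
				ret = -1;
				break;
			}
			p += w;
			n -= w;
		}
	}
	if (n < 0)
		ret = -1;

	close(in);
	if (close(out) < 0)
		ret = -1;
	return ret;
}
```

The fallback path opens the source by name, which is where a symlink in
the source directory would be silently followed.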

One quirk of this implementation is that it dereferences symlinks when
cloning. This behavior was most recently modified in 36596fd2df (clone:
better handle symlinked files at .git/objects/, 2019-07-10), which
attempted to support `--local` clones of repositories with symlinks in
their objects directory in a platform-independent way.

Unfortunately, this behavior of dereferencing symlinks (that is,
creating a hardlink or copy of the source's link target in the
destination repository) can be used as a component in attacking a
victim by inadvertently exposing the contents of files stored outside of
the repository.

Take, for example, a repository that stores a Dockerfile and is used to
build Docker images. When building an image, Docker copies the directory
contents into the VM, and then instructs the VM to execute the
Dockerfile at the root of the copied directory. This protects against
directory traversal attacks by copying symbolic links as-is without
dereferencing them.

That is, if a user has a symlink pointing at their private key material
(where the symlink is present in the same directory as the Dockerfile,
but the key itself is present outside of that directory), the key is
unreadable to a Docker image, since the link will appear broken from the
container's point of view.

Git's dereferencing of symlinks undermines this protection: it enables an
attack whereby a victim is convinced to clone a repository containing an
embedded submodule (with a URL like
"file:///proc/self/cwd/path/to/submodule") which has a symlink pointing
at a path containing sensitive information on the victim's machine. If a
user is tricked into doing this, the contents at the destination of
those symbolic links are exposed to the Docker image at runtime.

One approach to preventing this behavior is to recreate symlinks in the
destination repository. But this is problematic, since symlinking the
objects directory is not well-supported. (One potential problem is that
when sharing, e.g. a "pack" directory via symlinks, different writers
performing garbage collection may consider different sets of objects to
be reachable, enabling a situation whereby garbage collecting one
repository may remove reachable objects in another repository).

Instead, prohibit the local clone optimization when any symlinks are
present in the `$GIT_DIR/objects` directory of the source repository.
Users may clone the repository again by prepending the "file://" scheme
to their clone URL, or by adding the `--no-local` option to their `git
clone` invocation.

The directory iterator used by `copy_or_link_directory()` must no longer
dereference symlinks (i.e., it *must* call `lstat()` instead of `stat()`
in order to discover whether or not there are symlinks present). This has
no bearing on the overall behavior, since we will immediately `die()` on
encountering a symlink.

Note that t5604.33 suggests that we do support local clones with
symbolic links in the source repository's objects directory, but this
was likely unintentional, or at least did not take into consideration
the problem with sharing parts of the objects directory with symbolic
links at the time. Update this test to reflect which options are and
aren't supported.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
maint
Taylor Blau 3 years ago
parent
commit
6f054f9fb3
 builtin/clone.c            |  8
 t/t5604-clone-reference.sh | 50

8
builtin/clone.c

@@ -420,13 +420,11 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
 	int src_len, dest_len;
 	struct dir_iterator *iter;
 	int iter_status;
-	unsigned int flags;
 	struct strbuf realpath = STRBUF_INIT;
 
 	mkdir_if_missing(dest->buf, 0777);
 
-	flags = DIR_ITERATOR_PEDANTIC | DIR_ITERATOR_FOLLOW_SYMLINKS;
-	iter = dir_iterator_begin(src->buf, flags);
+	iter = dir_iterator_begin(src->buf, DIR_ITERATOR_PEDANTIC);
 
 	if (!iter)
 		die_errno(_("failed to start iterator over '%s'"), src->buf);
@@ -442,6 +440,10 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
 		strbuf_setlen(dest, dest_len);
 		strbuf_addstr(dest, iter->relative_path);
 
+		if (S_ISLNK(iter->st.st_mode))
+			die(_("symlink '%s' exists, refusing to clone with --local"),
+			    iter->relative_path);
+
 		if (S_ISDIR(iter->st.st_mode)) {
 			mkdir_if_missing(dest->buf, 0777);
 			continue;

50
t/t5604-clone-reference.sh

@@ -300,8 +300,6 @@ test_expect_success SYMLINKS 'setup repo with manually symlinked or unknown file
 		ln -s ../an-object $obj &&
 
 		cd ../ &&
-		find . -type f | sort >../../../T.objects-files.raw &&
-		find . -type l | sort >../../../T.objects-symlinks.raw &&
 		echo unknown_content >unknown_file
 	) &&
 	git -C T fsck &&
@@ -310,19 +308,27 @@ test_expect_success SYMLINKS 'setup repo with manually symlinked or unknown file
 
 
 test_expect_success SYMLINKS 'clone repo with symlinked or unknown files at objects/' '
-	for option in --local --no-hardlinks --shared --dissociate
+	# None of these options work when cloning locally, since T has
+	# symlinks in its `$GIT_DIR/objects` directory
+	for option in --local --no-hardlinks --dissociate
 	do
-		git clone $option T T$option || return 1 &&
-		git -C T$option fsck || return 1 &&
-		git -C T$option rev-list --all --objects >T$option.objects &&
-		test_cmp T.objects T$option.objects &&
-		(
-			cd T$option/.git/objects &&
-			find . -type f | sort >../../../T$option.objects-files.raw &&
-			find . -type l | sort >../../../T$option.objects-symlinks.raw
-		)
+		test_must_fail git clone $option T T$option 2>err || return 1 &&
+		test_i18ngrep "symlink.*exists" err || return 1
 	done &&
 
+	# But `--shared` clones should still work, even when specifying
+	# a local path *and* that repository has symlinks present in its
+	# `$GIT_DIR/objects` directory.
+	git clone --shared T T--shared &&
+	git -C T--shared fsck &&
+	git -C T--shared rev-list --all --objects >T--shared.objects &&
+	test_cmp T.objects T--shared.objects &&
+	(
+		cd T--shared/.git/objects &&
+		find . -type f | sort >../../../T--shared.objects-files.raw &&
+		find . -type l | sort >../../../T--shared.objects-symlinks.raw
+	) &&
+
 	for raw in $(ls T*.raw)
 	do
 		sed -e "s!/../!/Y/!; s![0-9a-f]\{38,\}!Z!" -e "/commit-graph/d" \
@@ -330,26 +336,6 @@ test_expect_success SYMLINKS 'clone repo with symlinked or unknown files at obje
 		sort $raw.de-sha-1 >$raw.de-sha || return 1
 	done &&
 
-	cat >expected-files <<-EOF &&
-	./Y/Z
-	./Y/Z
-	./Y/Z
-	./a-loose-dir/Z
-	./an-object
-	./info/packs
-	./pack/pack-Z.idx
-	./pack/pack-Z.pack
-	./packs/pack-Z.idx
-	./packs/pack-Z.pack
-	./unknown_file
-	EOF
-
-	for option in --local --no-hardlinks --dissociate
-	do
-		test_cmp expected-files T$option.objects-files.raw.de-sha || return 1 &&
-		test_must_be_empty T$option.objects-symlinks.raw.de-sha || return 1
-	done &&
-
 	echo ./info/alternates >expected-files &&
 	test_cmp expected-files T--shared.objects-files.raw &&
 	test_must_be_empty T--shared.objects-symlinks.raw
