As the design of signature handling is still being discussed, it is
likely that the data stream produced by the code in Git 2.50 would
have to be changed in such a way that is not backward compatible.
Mark the feature as experimental and discourge its use for now.
Also flip the default on the generation side to "strip"; users of
existing versions would not have passed --signed-commits=strip and
will be broken by this change if the default is made to abort, and
will be encouraged by the error message to produce data stream with
future breakage guarantees by passing --signed-commits option.
As we tone down the default behaviour, we no longer need the
FAST_EXPORT_SIGNED_COMMITS_NOABORT environment variable, which was
not discoverable enough.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"make test" used to have a hard dependency on (basic) Perl; tests
have been rewritten help environment with NO_PERL test the build as
much as possible.
* ps/test-wo-perl-prereq:
t5703: refactor test to not depend on Perl
t5316: refactor `max_chain()` to not depend on Perl
t0210: refactor trace2 scrubbing to not use Perl
t0021: refactor `generate_random_characters()` to not depend on Perl
t/lib-httpd: refactor "one-time-perl" CGI script to not depend on Perl
t/lib-t6000: refactor `name_from_description()` to not depend on Perl
t/lib-gpg: refactor `sanitize_pgp()` to not depend on Perl
t: refactor tests depending on Perl for textconv scripts
t: refactor tests depending on Perl to print data
t: refactor tests depending on Perl substitution operator
t: refactor tests depending on Perl transliteration operator
Makefile: stop requiring Perl when running tests
meson: stop requiring Perl when tests are enabled
t: adapt existing PERL prerequisites
t: introduce PERL_TEST_HELPERS prerequisite
t: adapt `test_readlink()` to not use Perl
t: adapt `test_copy_bytes()` to not use Perl
t: adapt character translation helpers to not use Perl
t: refactor environment sanitization to not use Perl
t: skip chain lint when PERL_PATH is unset
In the early days of Git, Perl was used quite prominently throughout the
project. This has changed significantly as almost all of the executables
we ship nowadays have eventually been rewritten in C. Only a handful of
subsystems remain that require Perl:
- gitweb, a read-only web interface.
- A couple of scripts that allow importing repositories from GNU Arch,
CVS and Subversion.
- git-send-email(1), which can be used to send mails.
- git-request-pull(1), which is used to request somebody to pull from
a URL by sending an email.
- git-filter-branch(1), which uses Perl with the `--state-branch`
option. This command is typically recommended against nowadays in
favor of git-filter-repo(1).
- Our Perl bindings for Git.
- The netrc Git credential helper.
None of these subsystems can really be considered to be part of the
"core" of Git, and an installation without them is fully functional.
It is more likely than not that an end user wouldn't even notice that
any features are missing if those tools weren't installed. But while
Perl nowadays very much is an optional dependency of Git, there is a
significant limitation when Perl isn't available: developers cannot run
our test suite.
Preceding commits have started to lift this restriction by removing the
strict dependency on Perl in many central parts of the test library. But
there are still many tests that rely on small Perl helpers to do various
different things.
Introduce a new PERL_TEST_HELPERS prerequisite that guards all tests
that require Perl. This prerequisite is explicitly different than the
preexisting PERL prerequisite:
- PERL records whether or not features depending on the Perl
interpreter are built.
- PERL_TEST_HELPERS records whether or not a Perl interpreter is
available for our tests.
By having these two separate prerequisites we can thus distinguish
between tests that inherently depend on Perl because the underlying
feature does, and those tests that depend on Perl because the test
itself is using Perl.
Adapt all tests to set the PERL_TEST_HELPERS prerequisite as needed.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-export has a --signed-tags= option that controls how to handle tag
signatures. However, there is no equivalent for commit signatures; it
just silently strips the signature out of the commit (analogously to
--signed-tags=strip).
While signatures are generally problematic for fast-export/fast-import
(because hashes are likely to change), if they're going to support tag
signatures, there's no reason to not also support commit signatures.
So, implement a --signed-commits= option that mirrors the --signed-tags=
option.
On the fast-export side, try to be as much like signed-tags as possible,
in both implementation and in user-interface. This will change the
default behavior to '--signed-commits=abort' from what is now
'--signed-commits=strip'. In order to provide an escape hatch for users
of third-party tools that call fast-export and do not yet know of the
--signed-commits= option, add an environment variable
'FAST_EXPORT_SIGNED_COMMITS_NOABORT=1' that changes the default to
'--signed-commits=warn-strip'.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --signed-tags= option takes one of five arguments specifying how to
handle signed tags during export. Among these arguments, 'strip' is to
'warn-strip' as 'verbatim' is to 'warn' (the unmentioned argument is
'abort', which stops the fast-export process entirely). That is,
signatures are either stripped or copied verbatim while exporting, with
or without a warning.
Match the pattern and rename 'warn' to 'warn-verbatim' to make it clear
that it instructs fast-export to copy signatures verbatim.
To maintain backwards compatibility, 'warn' is still recognized as
deprecated synonym of 'warn-verbatim'.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fast-import" learned to reject paths with ".." and "." as
their components to avoid creating invalid tree objects.
* en/fast-import-verify-path:
t9300: test verification of renamed paths
fast-import: disallow more path components
fast-import: disallow "." and ".." path components
Instead of just disallowing '.' and '..', make use of verify_path() to
ensure that fast-import will disallow anything we wouldn't allow into
the index, such as anything under .git/, .gitmodules as a symlink, or
a dos drive prefix on Windows.
Since a few fast-export and fast-import tests that tried to stress-test
the correct handling of quoting relied on filenames that fail
is_valid_win32_path(), such as spaces or periods at the end of filenames
or backslashes within the filename, turn off core.protectNTFS for those
tests to ensure they keep passing.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the default value for TEST_PASSES_SANITIZE_LEAK is `true` there
is no longer a need to have that variable declared in all of our tests.
Drop it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The iconv library is used by Git to reencode files, commit messages and
other things. As such it is a rather integral part, but given that many
platforms nowadays use UTF-8 everywhere you can live without support for
reencoding in many situations. It is thus optional to build Git with
iconv, and some of our platforms wired up in "config.mak.uname" disable
it. But while we support building without it, running our test suite
with "NO_ICONV=Yes" causes many test failures.
Wire up a new test prerequisite ICONV that gets populated via our
GIT-BUILD-OPTIONS. Annotate failing tests accordingly.
Note that this commit does not do a deep dive into every single test to
assess whether the failure is expected or not. Most of the tests do
smell like the expected kind of failure though.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Both `rewrite_parents()` and `remove_duplicate_parents()` may end up
dropping some parents from a commit without freeing the respective
`struct commit_list` items. This causes a bunch of memory leaks. Plug
these.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we added generic end-of-options support in 51b4594b40
(parse-options: allow --end-of-options as a synonym for "--",
2019-08-06), we made them true synonyms. They both stop option parsing,
and they are both returned in the resulting argv if the KEEP_DASHDASH
flag is used.
The hope was that this would work for all callers:
- most generic callers would not pass KEEP_DASHDASH, and so would just
do the right thing (stop parsing there) without needing to know
anything more.
- callers with KEEP_DASHDASH were generally going to rely on
setup_revisions(), which knew to handle --end-of-options specially
But that turned out miss quite a few cases that pass KEEP_DASHDASH but
do their own manual parsing. For example, "git reset", "git checkout",
and so on want pass KEEP_DASHDASH so they can support:
git reset $revs -- $paths
but of course aren't going to actually do a traversal, so they don't
call setup_revisions(). And those cases currently get confused by
--end-of-options being left in place, like:
$ git reset --end-of-options HEAD
fatal: option '--end-of-options' must come before non-option arguments
We could teach each of these callers to handle the leftover option
explicitly. But let's try to be a bit more clever and see if we can
solve it centrally in parse-options.c.
The bogus assumption here is that KEEP_DASHDASH tells us the caller
wants to see --end-of-options in the result. But really, the callers
which need to know that --end-of-options was reached are those that may
potentially parse more options from argv. In other words, those that
pass the KEEP_UNKNOWN_OPT flag.
If such a caller is aware of --end-of-options (e.g., because they call
setup_revisions() with the result), then this will continue to do the
right thing, treating anything after --end-of-options as a non-option.
And if the caller is not aware of --end-of-options, they are better off
keeping it intact, because either:
1. They are just passing the options along to somebody else anyway, in
which case that somebody would need to know about the
--end-of-options marker.
2. They are going to parse the remainder themselves, at which point
choking on --end-of-options is much better than having it silently
removed. The point is to avoid option injection from untrusted
command line arguments, and bailing is better than quietly treating
the untrusted argument as an option.
This fixes bugs with --end-of-options across several commands, but I've
focused on two in particular here:
- t7102 confirms that "git reset --end-of-options --foo" now works.
This checks two things. One, that we no longer barf on
"--end-of-options" itself (which previously we did, even if the rev
was something vanilla like "HEAD" instead of "--foo"). And two, that
we correctly treat "--foo" as a revision rather than an option.
This fix applies to any other cases which pass KEEP_DASHDASH but not
KEEP_UNKNOWN_OPT, like "git checkout", "git check-attr", "git grep",
etc, which would previously choke on "--end-of-options".
- t9350 shows the opposite case: fast-export passed KEEP_UNKNOWN_OPT
but not KEEP_DASHDASH, but then passed the result on to
setup_revisions(). So it never saw --end-of-options, and would
erroneously parse "fast-export --end-of-options --foo" as having a
"--foo" option. This is now fixed.
Note that this does shut the door for callers which want to know if we
hit end-of-options, but don't otherwise need to keep unknown opts. The
obvious thing here is feeding it to the DWIM verify_filename()
machinery. And indeed, this is a problem even for commands which do
understand --end-of-options already. For example, without this patch,
you get:
$ git log --end-of-options --foo
fatal: option '--foo' must come before non-option arguments
because we refuse to accept "--foo" as a filename (because it starts
with a dash) even though we could know that we saw end-of-options. The
verify_filename() function simply doesn't accept this extra information.
So that is the status quo, and this patch doubles down further on that.
Commands like "git reset" have the same problem, but they won't even
know that parse-options saw --end-of-options! So even if we fixed
verify_filename(), they wouldn't have anything to pass to it.
But in practice I don't think this is a big deal. If you are being
careful enough to use --end-of-options, then you should also be using
"--" to disambiguate and avoid the DWIM behavior in the first place. In
other words, doing:
git log --end-of-options --this-is-a-rev -- --this-is-a-path
works correctly, and will continue to do so. And likewise, with this
patch now:
git reset --end-of-options --this-is-a-rev -- --this-is-a-path
will work, as well.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many test scripts use hash-object to create malformed objects to see how
we handle the results in various commands. In some cases we already have
to use "hash-object --literally", because it does some rudimentary
quality checks. But let's use "--literally" more consistently to
future-proof these tests against hash-object learning to be more
careful.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To prepare for the default value of `protocol.file.allow` to change to
"user", ensure tests that rely on local submodules can initialize them
over the file protocol.
Tests that interact with submodules a handful of times use
`test_config_global`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
e900d494dc (diff: add an API for deferred freeing, 2021-02-11) added a
way to allow reusing diffopts: the no_free bit. 244c27242f (diff.[ch]:
have diff_free() call clear_pathspec(opts.pathspec), 2022-02-16) made
that mechanism mandatory.
git fast-export doesn't set no_free, so path limiting stopped working
after the first commit. Set the flag and add a basic test to make sure
only changes to the specified files are exported.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The revision traversal machinery typically processes and returns all
children before any parent. fast-export needs to operate in the
reverse fashion, handling parents before any of their children in
order to build up the history starting from the root commit(s). This
would be a clear case where we could just use the revision traversal
machinery's "reverse" option to achieve this desired affect.
However, this wasn't what the code did. It added its own array for
queuing. The obvious hand-rolled solution would be to just push all
the commits into the array and then traverse afterwards, but it didn't
quite do that either. It instead attempted to process anything it
could as soon as it could, and once it could, check whether it could
process anything that had been queued. As far as I can tell, this was
an effort to save a little memory in the case of multiple root commits
since it could process some commits before queueing all of them. This
involved some helper functions named has_unshown_parent() and
handle_tail(). For typical invocations of fast-export, this
alternative essentially amounted to a hand-rolled method of reversing
the commits -- it was a bunch of work to duplicate the revision
traversal machinery's "reverse" option.
This hand-rolled reversing mechanism is actually somewhat difficult to
reason about. It takes some time to figure out how it ensures in
normal cases that it will actually process all traversed commits
(rather than just dropping some and not printing anything for them).
And it turns out there are some cases where the code does drop commits
without handling them, and not even printing an error or warning for
the user. Due to the has_unshown_parent() checks, some commits could
be left in the array at the end of the "while...get_revision()" loop
which would be unprocessed. This could be triggered for example with
git fast-export main -- --first-parent
or non-sensical traversal rules such as
git fast-export main -- --grep=Merge --invert-grep
While most traversals that don't include all parents should likely
trigger errors in fast-export (or at least require being used in
combination with --reference-excluded-parents), the --first-parent
traversal is at least reasonable and it'd be nice if it didn't just drop
commits. It'd also be nice for future readers of the code to have a
simpler "reverse traversal" mechanism. Use the "reverse" option of the
revision traversal machinery to achieve both.
Even for the non-sensical traversal flags like the --grep one above,
this would be an improvement. For example, in that case, the code
previously would have silently truncated history to only those commits
that do not have an ancestor containing "Merge" in their commit message.
After this code change, that case would include all commits without
"Merge" in their commit message -- but any commit that previously had a
"Merge"-mentioning parent would lose that parent
(likely resulting in many new root commits). While the new behavior is
still odd, it is at least understandable given that
--reference-excluded-parents is not the default.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: William Sprent <williams@unity3d.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This trick was performed via
$ (cd t &&
sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
-e 's/Master/Main/g' -- t9[0-4]*.sh)
This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In addition to the manual adjustment to let the `linux-gcc` CI job run
the test suite with `master` and then with `main`, this patch makes sure
that GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME is set in all test scripts
that currently rely on the initial branch name being `master by default.
To determine which test scripts to mark up, the first step was to
force-set the default branch name to `master` in
- all test scripts that contain the keyword `master`,
- t4211, which expects `t/t4211/history.export` with a hard-coded ref to
initialize the default branch,
- t5560 because it sources `t/t556x_common` which uses `master`,
- t8002 and t8012 because both source `t/annotate-tests.sh` which also
uses `master`)
This trick was performed by this command:
$ sed -i '/^ *\. \.\/\(test-lib\|lib-\(bash\|cvs\|git-svn\)\|gitweb-lib\)\.sh$/i\
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\
' $(git grep -l master t/t[0-9]*.sh) \
t/t4211*.sh t/t5560*.sh t/t8002*.sh t/t8012*.sh
After that, careful, manual inspection revealed that some of the test
scripts containing the needle `master` do not actually rely on a
specific default branch name: either they mention `master` only in a
comment, or they initialize that branch specificially, or they do not
actually refer to the current default branch. Therefore, the
aforementioned modification was undone in those test scripts thusly:
$ git checkout HEAD -- \
t/t0027-auto-crlf.sh t/t0060-path-utils.sh \
t/t1011-read-tree-sparse-checkout.sh \
t/t1305-config-include.sh t/t1309-early-config.sh \
t/t1402-check-ref-format.sh t/t1450-fsck.sh \
t/t2024-checkout-dwim.sh \
t/t2106-update-index-assume-unchanged.sh \
t/t3040-subprojects-basic.sh t/t3301-notes.sh \
t/t3308-notes-merge.sh t/t3423-rebase-reword.sh \
t/t3436-rebase-more-options.sh \
t/t4015-diff-whitespace.sh t/t4257-am-interactive.sh \
t/t5323-pack-redundant.sh t/t5401-update-hooks.sh \
t/t5511-refspec.sh t/t5526-fetch-submodules.sh \
t/t5529-push-errors.sh t/t5530-upload-pack-error.sh \
t/t5548-push-porcelain.sh \
t/t5552-skipping-fetch-negotiator.sh \
t/t5572-pull-submodule.sh t/t5608-clone-2gb.sh \
t/t5614-clone-submodules-shallow.sh \
t/t7508-status.sh t/t7606-merge-custom.sh \
t/t9302-fast-import-unpack-limit.sh
We excluded one set of test scripts in these commands, though: the range
of `git p4` tests. The reason? `git p4` stores the (foreign) remote
branch in the branch called `p4/master`, which is obviously not the
default branch. Manual analysis revealed that only five of these tests
actually require a specific default branch name to pass; They were
modified thusly:
$ sed -i '/^ *\. \.\/lib-git-p4\.sh$/i\
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\
' t/t980[0167]*.sh t/t9811*.sh
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test checks for several commit object sizes to verify that objects
are encoded as expected. However, the size of a commit object differs
between SHA-1 and SHA-256, since each contains a hex representation of
the tree's object ID. Since these are root commits, compute the size of
each commit by using a constant plus the size of a single hex object ID.
In addition, use $ZERO_OID instead of a hard-coded object ID.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-2.23: (44 commits)
Git 2.23.1
Git 2.22.2
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
...
* maint-2.22: (43 commits)
Git 2.22.2
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
...
* maint-2.20: (36 commits)
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
...
* maint-2.17: (32 commits)
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
...
On Windows, file names cannot contain asterisks nor newline characters.
In an upcoming commit, we will make this limitation explicit,
disallowing even the creation of commits that introduce such file names.
However, in the test scripts touched by this patch, we _know_ that those
paths won't be checked out, so we _want_ to allow such file names.
Happily, the stringent path validation will be guarded via the
`core.protectNTFS` flag, so all we need to do is to force that flag off
temporarily.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The backslash character is not a valid part of a file name on Windows.
Hence it is dangerous to allow writing files that were unpacked from
tree objects, when the stored file name contains a backslash character:
it will be misinterpreted as directory separator.
This not only causes ambiguity when a tree contains a blob `a\b` and a
tree `a` that contains a blob `b`, but it also can be used as part of an
attack vector to side-step the careful protections against writing into
the `.git/` directory during a clone of a maliciously-crafted
repository.
Let's prevent that, addressing CVE-2019-1354.
Note: we guard against backslash characters in tree objects' file names
_only_ on Windows (because on other platforms, even on those where NTFS
volumes can be mounted, the backslash character is _not_ a directory
separator), and _only_ when `core.protectNTFS = true` (because users
might need to generate tree objects for other platforms, of course
without touching the worktree, e.g. using `git update-index
--cacheinfo`).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Multiple changes here:
* add a test for a tag of a blob
* add a test for a tag of a tag of a commit
* add a comment to the tests for (possibly nested) tags of trees,
making it clear that these tests are doing much less than you might
expect
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new option, --mark-tags, which will output mark identifiers with
each tag object. This improves the incremental export story with
--export-marks since it will allow us to record that annotated tags have
been exported, and it is also needed as a step towards supporting nested
tags.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import has support for both an --import-marks flag and an
--import-marks-if-exists flag; the latter of which will not die() if the
file does not exist. fast-export only had support for an --import-marks
flag; add an --import-marks-if-exists flag for consistency.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-export allows specifying revision ranges, which can be used to
export a tag without exporting the commit it tags. fast-export handled
this rather poorly: it would emit a "from :0" directive. Since marks
start at 1 and increase, this means it refers to an unknown commit and
fast-import will choke on the input.
When we are unable to look up a mark for the object being tagged, use a
"from $HASH" directive instead to fix this problem.
Note that this is quite similar to the behavior fast-export exhibits
with commits and parents when --reference-excluded-parents is passed
along with an excluded commit range. For tags of excluded commits we do
not require the --reference-excluded-parents flag because we always have
to tag something. By contrast, when dealing with commits, pruning a
parent is always a viable option, so we need the flag to specify that
parent pruning is not wanted. (It is slightly weird that
--reference-excluded-parents isn't the default with a separate
--prune-excluded-parents flag, but backward compatibility concerns
resulted in the current defaults.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Automatic re-encoding of commit messages (and dropping of the encoding
header) hurts attempts to do reversible history rewrites (e.g. sha1sum
<-> sha256sum transitions, some subtree rewrites), and seems
inconsistent with the general principle followed elsewhere in
fast-export of requiring explicit user requests to modify the output
(e.g. --signed-tags=strip, --tag-of-filtered-object=rewrite). Add a
--reencode flag that the user can use to specify, and like other
fast-export flags, default it to 'abort'.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fast-export encounters a commit with an 'encoding' header, it tries
to reencode in UTF-8 and then drops the encoding header. However, if it
fails to reencode in UTF-8 because e.g. one of the characters in the
commit message was invalid in the old encoding, then we need to retain
the original encoding or otherwise we lose information needed to
understand all the other (valid) characters in the original commit
message.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test used an author with non-ascii characters in the name, but no
special commit message. It then grep'ed for those non-ascii characters,
but those are guaranteed to exist regardless of the reencoding process
since the reencoding only affects the commit message, not the author or
committer names. As such, the test would work even if the re-encoding
process simply stripped the commit message entirely. Modify the test to
actually check that the reencoding into UTF-8 worked.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Knowing the original names (hashes) of commits can sometimes enable
post-filtering that would otherwise be difficult or impossible. In
particular, the desire to rewrite commit messages which refer to other
prior commits (on top of whatever other filtering is being done) is
very difficult without knowing the original names of each commit.
In addition, knowing the original names (hashes) of blobs can allow
filtering by blob-id without requiring re-hashing the content of the
blob, and is thus useful as a small optimization.
Once we add original ids for both commits and blobs, we may as well
add them for tags too for completeness. Perhaps someone will have a
use for them.
This commit teaches a new --show-original-ids option to fast-export
which will make it add a 'original-oid <hash>' line to blob, commits,
and tags. It also teaches fast-import to parse (and ignore) such
lines.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git filter-branch has a nifty feature allowing you to rewrite, e.g. just
the last 8 commits of a linear history
git filter-branch $OPTIONS HEAD~8..HEAD
If you try the same with git fast-export, you instead get a history of
only 8 commits, with HEAD~7 being rewritten into a root commit. There
are two alternatives:
1) Don't use the negative revision specification, and when you're
filtering the output to make modifications to the last 8 commits,
just be careful to not modify any earlier commits somehow.
2) First run 'git fast-export --export-marks=somefile HEAD~8', then
run 'git fast-export --import-marks=somefile HEAD~8..HEAD'.
Both are more error prone than I'd like (the first for obvious reasons;
with the second option I have sometimes accidentally included too many
revisions in the first command and then found that the corresponding
extra revisions were not exported by the second command and thus were
not modified as I expected). Also, both are poor from a performance
perspective.
Add a new --reference-excluded-parents option which will cause
fast-export to refer to commits outside the specified rev-list-args
range by their sha1sum. Such a stream will only be useful in a
repository which already contains the necessary commits (much like the
restriction imposed when using --no-data).
Note from Peff:
I think we might be able to do a little more optimization here. If
we're exporting HEAD^..HEAD and there's an object in HEAD^ which is
unchanged in HEAD, I think we'd still print it (because it would not
be marked SHOWN), but we could omit it (by walking the tree of the
boundary commits and marking them shown). I don't think it's a
blocker for what you're doing here, but just a possible future
optimization.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If file paths are specified to fast-export and a ref points to a commit
that does not touch any of the relevant paths, then that ref would
sometimes fail to be exported. (This depends on whether any ancestors
of the commit which do touch the relevant paths would be exported with
that same ref name or a different ref name.) To avoid this problem,
put *all* specified refs into extra_refs to start, and then as we export
each commit, remove the refname used in the 'commit $REFNAME' directive
from extra_refs. Then, in handle_tags_and_duplicates() we know which
refs actually do need a manual reset directive in order to be included.
This means that we do need some special handling for excluded refs; e.g.
if someone runs
git fast-export ^master master
then they've asked for master to be exported, but they have also asked
for the commit which master points to and all of its history to be
excluded. That logically means ref deletion. Previously, such refs
were just silently omitted from being exported despite having been
explicitly requested for export.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If file paths are specified to fast-export and multiple refs point to a
commit that does not touch any of the relevant file paths, then
fast-export can hit problems. fast-export has a list of additional refs
that it needs to explicitly set after exporting all blobs and commits,
and when it tries to get_object_mark() on the relevant commit, it can
get a mark of 0, i.e. "not found", because the commit in question did
not touch the relevant paths and thus was not exported. Trying to
import a stream with a mark corresponding to an unexported object will
cause fast-import to crash.
Avoid this problem by taking the commit the ref points to and finding an
ancestor of it that was exported, and make the ref point to that commit
instead.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If --tag-of-filtered-object=rewrite is specified along with a set of
paths to limit what is exported, then any tags pointing to old commits
that do not contain any of those specified paths cause problems. Since
the old tagged commit is not exported, fast-export attempts to rewrite
such tags to an ancestor commit which was exported. If no such commit
exists, then fast-export currently die()s. Five years after the tag
rewriting logic was added to fast-export (see commit 2d8ad46919,
"fast-export: Add a --tag-of-filtered-object option for newly dangling
tags", 2009-06-25), fast-import gained the ability to delete refs (see
commit 4ee1b225b9, "fast-import: add support to delete refs",
2014-04-20), so now we do have a valid option to rewrite the tag to.
Delete these tags instead of dying.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fast-export" had a regression in v2.15.0 era where it skipped
some merge commits in certain cases, which has been corrected.
* ma/fast-export-skip-merge-fix:
fast-export: fix regression skipping some merge-commits
7199203937 (object_array: add and use `object_array_pop()`, 2017-09-23)
noted that the pattern `object = array.objects[--array.nr].item` could
be abstracted as `object = object_array_pop(&array)`.
Unfortunately, one of the conversions was horribly wrong. Between
grabbing the last object (i.e., peeking at it) and decreasing the object
count, the original code would sometimes return early. The updated code
on the other hand, will always pop the last element, then maybe do the
early return without doing anything with the object.
The end result is that merge commits where all the parents have still
not been exported will simply be dropped, meaning that they will be
completely missing from the exported data.
Re-add a commit when it is not yet time to handle it. An alternative
that was considered was to peek-then-pop. That carries some risk with it
since the peeking and popping need to act on the same object, in a
concerted fashion.
Add a test that would have caught this.
Reported-by: Isaac Chou <Isaac.Chou@microfocus.com>
Analyzed-by: Isaac Chou <Isaac.Chou@microfocus.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid using pipes downstream of Git commands since the exit codes
of commands upstream of pipes get swallowed, thus potentially
hiding failure of those commands. Instead, capture Git command
output to a file and apply the downstream command(s) to that file.
Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fast-export" with -M/-C option issued "copy" instruction on a
path that is simultaneously modified, which was incorrect.
* jt/fast-export-copy-modify-fix:
fast-export: do not copy from modified file
"git fast-export" with -M/-C option issued "copy" instruction on a
path that is simultaneously modified, which was incorrect.
* jt/fast-export-copy-modify-fix:
fast-export: do not copy from modified file
When run with the "-C" option, fast-export writes 'C' commands in its
output whenever the internal diff mechanism detects a file copy,
indicating that fast-import should copy the given existing file to the
given new filename. However, the diff mechanism works against the
prior version of the file, whereas fast-import uses whatever is current.
This causes issues when a commit both modifies a file and uses it as the
source for a copy.
Therefore, teach fast-export to refrain from writing 'C' when it has
already written a modification command for a file.
An existing test in t9350-fast-export is also fixed in this patch. The
existing line "C file6 file7" copies the wrong version of file6, but it
has coincidentally worked because file7 was subsequently overridden.
Reported-by: Juraj Oršulić <juraj.orsulic@fer.hr>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>