Because Git tracks symbolic links as symbolic links, a path that
has a symbolic link in its leading part (e.g. path/to/dir/file,
where path/to/dir is a symbolic link to somewhere else, be it
inside or outside the working tree) can never appear in a patch
that validly applies, unless the same patch first removes the
symbolic link to allow a directory to be created there.
Detect and reject such a patch.
Things to note:
- Unfortunately, we cannot reuse the has_symlink_leading_path()
from dir.c, as that is only about the working tree, but "git
apply" can be told to apply the patch only to the index or to
both the index and to the working tree.
- We cannot directly use has_symlink_leading_path() even when we
are applying only to the working tree, as an early patch of a
valid input may remove a symbolic link path/to/dir and then a
later patch of the input may create a path path/to/dir/file, but
"git apply" first checks the input without touching either the
index or the working tree. The leading symbolic link check must
be done on the interim result we compute in-core (i.e. after the
first patch, there is no path/to/dir symbolic link and it is
perfectly valid to create path/to/dir/file).
Similarly, when an input creates a symbolic link path/to/dir and
then creates a file path/to/dir/file, we need to flag it as an
error without actually creating path/to/dir symbolic link in the
filesystem.
Instead, for any patch in the input that leaves a path (i.e. a non
deletion) in the result, we check all leading paths against the
resulting tree that the patch would create by inspecting all the
patches in the input and then the target of patch application
(either the index or the working tree).
This way, we catch a mischief or a mistake to add a symbolic link
path/to/dir and a file path/to/dir/file at the same time, while
allowing a valid patch that removes a symbolic link path/to/dir and
then adds a file path/to/dir/file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We should reject a patch, whether it renames/copies dir/file to
elsewhere with or without modificiation, or updates dir/file in
place, if "dir/" part is actually a symbolic link to elsewhere,
by making sure that the code to read the preimage does not read
from a path that is beyond a symbolic link.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently read the preimage to apply a patch from the index only
when the --cached option is given. Do so also when the command is
running under the --index option. With --index, the index entry and
the working tree file for a path that is involved in a patch must be
identical, so this should not affect the result, but by reading from
the index, we will get the protection to avoid reading an unintended
path beyond a symbolic link automatically.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, a patch that affects outside the working area (either a
Git controlled working tree, or the current working directory when
"git apply" is used as a replacement of GNU patch) is rejected as a
mistake (or a mischief). Git itself does not create such a patch,
unless the user bends over backwards and specifies a non-standard
prefix to "git diff" and friends.
When `git apply` is used as a "better GNU patch", the user can pass
the `--unsafe-paths` option to override this safety check. This
option has no effect when `--index` or `--cached` is in use.
The new test was stolen from Jeff King with slight enhancements.
Note that a few new tests for touching outside the working area by
following a symbolic link are still expected to fail at this step,
but will be fixed in later steps.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the incoming patch has whitespace errors in a common context
line (i.e. a line that is expected to be found and is not modified
by the patch), "apply --whitespace=fix" corrects the whitespace
errors the line has, in addition to the whitespace error on a line
that is updated by the patch. However, we did not count and report
that we fixed whitespace errors on such lines.
[jc: This is iffy. What if the whitespace error has been fixed in
the target since the patch was written? A common context line we
see in the patch has errors, and it matches a line in the target
that has the errors already corrected, resulting in no change, which
we may not want to count after all. On the other hand, we are
reporting whitespace errors _in_ the incoming patch, so...]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Under --whitespace=fix option, match_fragment() function examines
the preimage (the common context and the removed lines in the patch)
and the file being patched and checks if they match after correcting
all whitespace errors. When they are found to match, the common
context lines in the preimage is replaced with the fixed copy,
because these lines will then be copied to the corresponding place
in the postimage by a later call to update_pre_post_images(). Lines
that are added in the postimage, under --whitespace=fix, have their
whitespace errors already fixed when apply_one_fragment() prepares
the preimage and the postimage, so in the end, application of the
patch can be done by replacing the block of text in the file being
patched that matched the preimage with what is in the postimage that
was updated by update_pre_post_images().
In the earlier days, fixing whitespace errors always resulted in
reduction of size, either collapsing runs of spaces in the indent to
a tab or removing the trailing whitespaces. These days, however,
some whitespace error fix results in extending the size.
250b3c6c (apply --whitespace=fix: avoid running over the postimage
buffer, 2013-03-22) tried to compute the final postimage size but
its math was flawed. It counted the size of the block of text in
the original being patched after fixing the whitespace errors on its
lines that correspond to the preimage. That number does not have
much to do with how big the final postimage would be.
Instead count (1) the added lines in the postimage, whose size is
the same as in the final patch result because their whitespace
errors have already been corrected, and (2) the fixed size of the
lines that are common.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git apply --whitespace=fix" used to be able to assume that fixing
errors will always reduce the size by e.g. stripping whitespaces at
the end of lines or collapsing runs of spaces into tabs at the
beginning of lines. An update to accomodate fixes that lengthens
the result by e.g. expanding leading tabs into spaces were made long
time ago but the logic miscounted the necessary space after such
whitespace fixes, leading to either under-allocation or over-usage
of already allocated space.
Illustrate this with a runtime sanity-check to protect us from
future breakage. The test was stolen from Kyle McKay who helped
to identify the problem.
Helped-by: "Kyle J. McKay" <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch puts the usage info strings that were not already in docopt-
like format into docopt-like format, which will be a litle easier for
end users and a lot easier for translators. Changes include:
- Placing angle brackets around fill-in-the-blank parameters
- Putting dashes in multiword parameter names
- Adding spaces to [-f|--foobar] to make [-f | --foobar]
- Replacing <foobar>* with [<foobar>...]
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This file had its own identical helper that predates
xstrdup_or_null. Let's use the global one to avoid
repetition.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new name is more consistent with the names of other
string_list-related functions.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue where ae021d87 (use skip_prefix to avoid magic numbers) left off
and use skip_prefix() in more places for determining the lengths of prefix
strings to avoid using dependent constants and other indirect methods.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the interface declaration for the functions in lockfile.c from
cache.h to a new file, lockfile.h. Add #includes where necessary (and
remove some redundant includes of cache.h by files that already
include builtin.h).
Move the documentation of the lock_file state diagram from lockfile.c
to the new header file.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use `git_config_get_string_const()` instead of `git_config()` to take
advantage of the config-set API which provides a cleaner control flow.
Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whitespace breakages are checked while the patch is being parsed.
Disable them at the beginning of parse_chunk(), where each
individual patch is parsed, immediately after we learn the name of
the file the patch applies to and before we start parsing the diff
contained in the patch.
One may naively think that we should be able to not just skip the
whitespace checks but simply fast-forward to the next patch without
doing anything once use_patch() tells us that this patch is not
going to be used. But in reality we cannot really skip much of the
parsing in order to do such a "fast-forward", primarily because
parsing "@@ -k,l +m,n @@" lines and counting the input lines is how
we determine the boundaries of individual patches.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We parse each patchfile and find the name of the path the patch
applies to, and then use that name to consult the attribute system
to find the whitespace rules to be used, and also the target file
(either in the working tree or in the index) to replay the changes
against.
Unlike a Git-generated patch, a non-Git patch is taken to have the
pathnames relative to the current working directory. The names
found in such a patch are modified by prepending the prefix by the
prefix_patches() helper function introduced in 56185f49 (git-apply:
require -p<n> when working in a subdirectory., 2007-02-19).
However, this prefixing is done after the patch is fully parsed and
affects only what target files are patched. Because the attributes
are checked against the names found in the patch during the parsing,
not against the final pathname, the whitespace check that is done
during parsing ends up using attributes for a wrong path for non-Git
patches.
Fix this by doing the prefix much earlier, immediately after the
header part of each patch is parsed and we learn the name of the
path the patch affects.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing "index" lines from a git-diff, we look for a
space followed by the mode. If we don't have a space, then
we set our pointer to the end-of-line. However, we don't
double-check that our end-of-line pointer is valid (e.g., if
we got a truncated diff input), which could lead to some
wrap-around pointer arithmetic.
In most cases this would probably get caught by our "40 <
len" check later in the function, but to be on the safe
side, let's just use strchrnul to treat end-of-string the
same as end-of-line.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use xmemdupz() to allocate the memory, copy the data and make sure to
NUL-terminate the result, all in one step. The resulting code is
shorter, doesn't contain the constants 1 and '\0', and avoids
duplicating function parameters.
For blame, the last copied byte (o->file.ptr[o->file.size]) is always
set to NUL by fake_working_tree_commit() or read_sha1_file(), so no
information is lost by the conversion to using xmemdupz().
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A submodule diff generally has content like:
-Subproject commit [0-9a-f]{40}
+Subproject commit [0-9a-f]{40}
When we are using "git apply --index" with a submodule, we
first apply the textual diff, and then parse that result to
figure out the new sha1.
If the diff has bogus input like:
-Subproject commit 1234567890123456789012345678901234567890
+bogus
we will parse the "bogus" portion. Our parser assumes that
the buffer starts with "Subproject commit", and blindly
skips past it using strlen(). This can cause us to read
random memory after the buffer.
This problem was unlikely to have come up in practice (since
it requires a malformed diff), and even when it did, we
likely noticed the problem anyway as the next operation was
to call get_sha1_hex on the random memory.
However, we can easily fix it by using skip_prefix to notice
the parsing error.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's easy to get manual allocation calculations wrong, and
the use of strcpy/strcat raise red flags for people looking
for buffer overflows (though in this case each site was
fine).
It's also shorter to use xstrfmt, and the printf-format
tends to be easier for a reader to see what the final string
will look like.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Other fill_stat_cache_info() is on new entries, which should set
CE_ENTRY_ADDED in cache_changed, so we're safe.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fuzzy_matchlines() function is used when attempting to resurrect
a patch that is whitespace-damaged, or when applying a patch that
was produced against an old codebase to the codebase after
indentation change.
The patch may want to change a line "a_bc" ("_" is used throught
this description for a whitespace to make it stand out) in the
original into something else, and we may not find "a_bc" in the
current source, but there may be "a__bc" (two spaces instead of one
the whitespace-damaged patch claims to expect). By ignoring the
amount of whitespaces, it forces "git apply" to consider that "a_bc"
in the broken patch meant to refer to "a__bc" in reality.
However, the implementation special cases a run of whitespaces at
the beginning of a line and makes "abc" match "_abc", even though a
whitespace in the middle of string never matches a 0-width gap,
e.g. "a_bc" does not match "abc". A run of whitespace at the end of
one string does not match a 0-width end of line on the other line,
either, e.g. "abc_" does not match "abc".
Fix this inconsistency by making the code skip leading whitespaces
only when both strings begin with a whitespace. This makes the
option mean the same as the option of the same name in "diff" and
"git diff".
Note that I am not sure if anybody sane should use this option in
the first place. The fuzzy match logic may be able to find the
original line that the patch author may have meant to touch because
it does not fully trust what the original lines say (i.e. context
lines prefixed by " " and old lines prefixed by "-" does not have to
exactly match the contents the patch is applied to). There is no
reason for us to trust what the replacement lines (i.e. new lines
prefixed by "+") say, either, but with this option enabled, we end
up copying these new lines with suspicious whitespace distributions
literally into the patched result. But as long as we keep it, we
should make it do its insane thing consistently.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it clear that we don't use fnmatch() anymore.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN,
2011-09-27). All occurrences of the respective variables have
been reviewed and none of them relied on the counting up mechanism,
but all of them were using the variable as a true boolean.
This patch does not change semantics of any command intentionally.
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name marc.theaimsgroup.com is no longer active, and has
migrated to marc.info.
Signed-off-by: Ondřej Bílka <neleai@seznam.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The variable len is set to
len = strchrnul(line, '\n') - line;
unconditionally 9 lines later, hence we can remove the call to strlen.
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are only four (with some generous rounding) instances in the
current source code where we speak of "subproject" instead of
"submodule". They are as follows:
* one error message in git-apply and two in entry.c
* the patch format for submodule changes
The latter was introduced in 0478675 (Expose subprojects as special
files to "git diff" machinery, 2007-04-15), apparently before the
terminology was settled. We can of course not change the patch
format.
Let's at least change the error messages to consistently call them
"submodule".
Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is
- diff-lib.c: setting CE_UPTODATE
- name-hash.c: setting CE_HASHED
- preload-index.c, read-cache.c, unpack-trees.c and
builtin/update-index: obvious
- entry.c: write_entry() may refresh the checked out entry via
fill_stat_cache_info(). This causes "non-const struct cache_entry
*" in builtin/apply.c, builtin/checkout-index.c and
builtin/checkout.c
- builtin/ls-files.c: --with-tree changes stagemask and may set
CE_UPDATE
Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.
So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.
The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:
diff --git a/cache.h b/cache.h
index 430d021..1692891 100644
--- a/cache.h
+++ b/cache.h
@@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
#define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
struct index_state {
- struct cache_entry **cache;
+ const struct cache_entry **cache;
unsigned int version;
unsigned int cache_nr, cache_alloc, cache_changed;
struct string_list *resolve_undo;
will help quickly identify them without bogus warnings.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2901bbe (apply: free patch->{def,old,new}_name fields, 2012-03-21)
cleaned up the memory management of filenames in the patches, but
forgot that find_name_traditional() can return NULL as a way of saying
"I couldn't find a name".
That NULL unfortunately gets passed into xstrdup() next, resulting in
a segfault. Use null_strdup() so as to safely propagate the null,
which will let us emit the correct error message.
Reported-by: DevHC on #git
Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The compiler can short-circuit the evaluation of conditions strung
together with logical OR operators instead of computing the resulting
bitmask with binary ORs. More importantly, this patch makes the
intent of the changed code clearer, because the logical context (as
opposed to binary context) becomes immediately obvious.
While we're at it, simplify the check for patch->is_rename in
builtin/apply.c a bit; it can only be 0 or 1, so we don't need a
comparison operator.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of these were found using Lucas De Marchi's codespell tool.
Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally update-pre-post-images could assume that any whitespace
fixing will make the result only shorter by unexpanding runs of
leading SPs into HTs and removing trailing whitespaces at the end of
lines. Updating the post-image we read from the patch to match the
actual result can be performed in-place under this assumption.
These days, however, we have tab-in-indent (aka Python) rule whose
result can be longer than the original, and we do need to allocate
a larger buffer than the input and replace the result.
Fortunately the support for lengthening rewrite was already added
when we began supporting "match while ignoring whitespace
differences" mode in 86c91f9179 (git apply: option to ignore
whitespace differences, 2009-08-04). We only need to correctly
count the number of bytes necessary to hold the updated result and
tell the function to allocate a new buffer.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A textual patch also records the submodule commit object name in
full. Make the parsing more robust by reading from there and
verifying the (possibly abbreviated) name on the index line matches.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was prompted by an incorrect warning issued by clang [1], and a
suggestion by Linus to restrict the range to check for values greater
than INT_MAX since these will give bogus output after casting to int.
In fact the (dis)similarity index is a percentage, so reject values
greater than 100.
[1] http://article.gmane.org/gmane.comp.version-control.git/213857
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git am -3" uses this function to build a tree that records how the
preimage the patch was created from would have looked like. An
abbreviated object name on the index line is ordinarily sufficient
for us to figure out the object name the preimage tree would have
contained, but a change to a submodule by definition shows an object
name of a submodule commit which our repository should not have, and
get_sha1_blob() is not an appropriate way to read it (or get_sha1()
for that matter).
Use get_sha1_hex() and complain if we do not find a full object name
there.
We could read from the payload part of the patch to learn the full
object name of the commit, but the primary user "git rebase" has
been fixed to give us a full object name, so this should suffice
for now.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The local variable sha1_ptr in the build_fake_ancestor() function
used to either point at the null_sha1[] (if the ancestor did not
have the path) or at sha1[] (if we read the object name into the
local array), but 7a98869 (apply: get rid of --index-info in favor
of --build-fake-ancestor, 2007-09-17) made the "missing in the
ancestor" case unnecessary, hence sha1_ptr, when used, always points
at the local array.
Get rid of the unneeded variable, and restructure the if/else
cascade a bit to make it easier to read. There should be no
behaviour change.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
5166714 (apply: Allow blank context lines to match beyond EOF,
2010-03-06) and then later 0c3ef98 (apply: Allow blank *trailing*
context lines to match beyond EOF, 2010-04-08) taught "git apply"
to trim new blank lines at the end in the patch text when matching
the contents being patched and the preimage recorded in the patch,
under --whitespace=fix mode.
When a preimage is modified to match the current contents in
preparation for such a "fixed" patch application, the context lines
in the postimage must be updated to match (otherwise, it would
reintroduce whitespace breakages), and update_pre_post_images()
function is responsible for doing this. However, this function was
not updated to take into account a case where the removal of
trailing blank lines reduces the number of lines in the preimage,
and triggered an assertion error.
The logic to fix the postimage by copying the corrected context
lines from the preimage was not prepared to handle this case,
either, but it was protected by the assert() and only got exposed
when the assertion is corrected.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back when "git apply" was written, we made sure that the user can
skip more than the default number of path components (i.e. 1) by
giving "-p<n>", but the logic for doing so was built around the
notion of "we skip N slashes and stop". This obviously does not
work well when running under -p0 where we do not want to skip any,
but still want to skip SP/HT that separates the pathnames of
preimage and postimage and want to reject absolute pathnames.
Stop using "stop_at_slash()", and instead introduce a new helper
"skip_tree_prefix()" with similar logic but works correctly even for
the -p0 case.
This is an ancient bug, but has been masked for a long time because
most of the patches are text and have other clues to tell us the
name of the preimage and the postimage.
Noticed by Colin McCabe.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It hasn't been used since 2006, as of commit 3cd4f5e8
"git-apply --binary: clean up and prepare for --reverse"
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Strip the name length from the ce_flags field and move it
into its own ce_namelen field in struct cache_entry. This
will both give us a tiny bit of a performance enhancement
when working with long pathnames and is a refactoring for
more readability of the code.
It enhances readability, by making it more clear what
is a flag, and where the length is stored and make it clear
which functions use stages in comparisions and which only
use the length.
It also makes CE_NAMEMASK private, so that users don't
mistakenly write the name length in the flags.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "index" line read from the patch to reconstruct a partial
preimage tree records the object names of blob objects.
Signed-off-by: Junio C Hamano <gitster@pobox.com>