Convert this function and the git merge-recursive subcommand to use
struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert all but two of the static functions in this file to use struct
object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct merge_file_info to use struct object_id. The following
Coccinelle semantic patch was used to implement this, followed by the
transformations in object_id.cocci:
@@
struct merge_file_info o;
@@
- o.sha
+ o.oid.hash
@@
struct merge_file_info *p;
@@
- p->sha
+ p->oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the anonymous struct within struct stage_data to use struct
object_id. The following Coccinelle semantic patch was used to
implement this, followed by the transformations in object_id.cocci:
@@
struct stage_data o;
expression E1;
@@
- o.stages[E1].sha
+ o.stages[E1].oid.hash
@@
struct stage_data *p;
expression E1;
@@
- p->stages[E1].sha
+ p->stages[E1].oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct diff_filespec's sha1 member to use a struct object_id
called "oid" instead. The following Coccinelle semantic patch was used
to implement this, followed by the transformations in object_id.cocci:
@@
struct diff_filespec o;
@@
- o.sha1
+ o.oid.hash
@@
struct diff_filespec *p;
@@
- p->sha1
+ p->oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were a few cases in merge-recursive that could result in a
check for the presence of files in the working copy while trying to
create a virtual merge base. These were rare and innocuous, but
somewhat illogical. The two cases were:
* When there was naming conflicts (e.g. a D/F conflict) and we had to
pick a new unique name for a file. Since the new name is somewhat
arbitrary, it didn't matter that we consulted the working copy to
avoid picking a filename it has, but since the virtual merge base is
never checked out, it's a waste of time and slightly odd to do so.
* When two different files get renamed to the same name (on opposite
sides of the merge), we needed to delete the original filenames from
the cache and possibly also the working directory. The caller's check
for determining whether to delete from the working directory was a
call to would_lose_untracked(). It turns out this didn't matter
because remove_file() had logic to avoid modifying the working
directory when creating a virtual merge base, but there is no reason
for the caller to check the working directory in such circumstances.
It's a waste of time, if not also a bit weird.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 51931bf (merge-recursive: Improve handling of rename
target vs. directory addition, 2011-08-11), I apparently added two
lines of code that were immediately duplicated a few lines later.
No idea why, other than it seems pretty clear this was a mistake:
there is no need to remove the same file twice; removing it once is
sufficient...especially since the intervening line was working with
a different file entirely.
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the find-renames option follow the behaviour in git-diff, where it
resets the threshold when none is given. So, for instance,
"--find-renames=25 --find-renames" should result in the default
threshold (50%) instead of 25%.
Signed-off-by: Felipe Gonçalves Assis <felipegassis@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add strategy option find-renames, following git-diff interface. This
makes the option rename-threshold redundant.
Signed-off-by: Felipe Gonçalves Assis <felipegassis@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The recursive strategy turns on rename detection by default. Add a
strategy option to disable rename detection even for exact renames.
Signed-off-by: Felipe Gonçalves Assis <felipegassis@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert all instances of get_object_hash to use an appropriate reference
to the hash member of the oid member of struct object. This provides no
functional change, as it is essentially a macro substitution.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
struct object is one of the major data structures dealing with object
IDs. Convert it to use struct object_id instead of an unsigned char
array. Convert get_object_hash to refer to the new member as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Convert most instances where the sha1 member of struct object is
dereferenced to use get_object_hash. Most instances that are passed to
functions that have versions taking struct object_id, such as
get_sha1_hex/get_oid_hex, or instances that can be trivially converted
to use struct object_id instead, are not converted.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
If one side deletes a file and the other changes its
content, we notice and report a conflict. However, if
instead of changing the content, we change only the mode,
the merge does not notice (and the mode change is silently
dropped).
The trivial index merge notices the problem and correctly
leaves the conflict in the index, but both merge-recursive
and merge-one-file will silently resolve this in favor of
the deletion. In many cases that is a sane resolution, but
we should be punting to the user whenever there is any
question. So let's detect and treat this as a conflict (in
both strategies).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This would be a fairly routine use of xstrfmt, except that
we need to remember the length of the result to pass to
cache_name_pos. So just use a strbuf, which makes this
simple.
As a bonus, this gets rid of confusing references to
"pathlen+1". The "1" is for the trailing slash we added, but
that is automatically accounted for in the strbuf's len
parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Call file_exists() instead of open-coding it. That's shorter, simpler
and the intent becomes clearer.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These string_list instances were allocated by get_renames() and
get_unmerged for the sole use of this caller, and the function is
responsible for freeing them, not just their contents.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows the callback to use 'base' as a temporary buffer to
quickly assemble full path "without" extra allocation. The callback
has to restore it afterwards of course.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The callers of get_merge_bases() can choose to leave object flags
used during the merge-base traversal by passing cleanup=0 as a
parameter, but in practice a very few callers can afford to do so
(namely, "git merge-base"), as they need to compute merge base in
preparation for other processing of their own and they need to see
the object without contaminate flags.
Change the function signature of get_merge_bases_many() and
get_merge_bases() to drop the cleanup parameter, so that the
majority of the callers do not have to say ", 1" at the end.
Give a new get_merge_bases_many_dirty() API to support only a few
callers that know they do not need to spend cycles cleaning up the
object flags.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the interface declaration for the functions in lockfile.c from
cache.h to a new file, lockfile.h. Add #includes where necessary (and
remove some redundant includes of cache.h by files that already
include builtin.h).
Move the documentation of the lock_file state diagram from lockfile.c
to the new header file.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following issue was found by scan.coverity.com (ID: 1049510),
and claimed to be likely a copy-paste mistake.
Introduced in 331a1838b (2010-07-02, Try normalizing files
to avoid delete/modify conflicts when merging), which is
quite a long time ago, so I'm rather unsure if it's of any impact
or just went unnoticed.
The line after the changed line has a comparison of 'o.len' to 'a.len',
so we should assume the lengths may be different.
I'd be happy to have a test for this bug(?) attached to
t6031-merge-recursive.sh, but I did not manage to
come up with a test in a reasonable amount of time.
Signed-off-by: Stefan Beller <stefanbeller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CodingGuidelines states that the first #include in C files should be
git-compat-util.h or another header file that includes it, such as
cache.h or builtin.h.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use `git_config_get_int()` instead of `git_config()` to take advantage
of the config-set API which provides a cleaner control flow.
Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using memset and then manually setting values of the string-list
members is not future proof as the internal representation of
string-list may change any time.
Use `string_list_init()` or STRING_LIST_INIT_* macros instead of
memset.
Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's a common idiom to match a prefix and then skip past it
with strlen, like:
if (starts_with(foo, "bar"))
foo += strlen("bar");
This avoids magic numbers, but means we have to repeat the
string (and there is no compiler check that we didn't make a
typo in one of the strings).
We can use skip_prefix to handle this case without repeating
ourselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When merge-recursive creates a unique filename, it uses a
template like:
path~branch_%d
where the final "_%d" is filled by an incrementing counter
until we find a unique name. We allocate 8 characters for
the counter, but there is no logic to limit the size of the
integer.
Of course, this is extremely unlikely, as you would need a
hundred million collisions to trigger the problem. Even if
an attacker constructed a specialized repo, it is unlikely
that the victim would have the patience to run the merge.
However, we can make it trivially correct (and hopefully
more readable) by using a strbuf.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is one line shorter, and makes sure the length in the
malloc and sprintf steps match.
These conversions are very straightforward; we can drop the
malloc entirely, and replace the sprintf with xstrfmt.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most callsites which use the commit buffer try to use the
cached version attached to the commit, rather than
re-reading from disk. Unfortunately, that interface provides
only a pointer to the NUL-terminated buffer, with no
indication of the original length.
For the most part, this doesn't matter. People do not put
NULs in their commit messages, and the log code is happy to
treat it all as a NUL-terminated string. However, some code
paths do care. For example, when checking signatures, we
want to be very careful that we verify all the bytes to
avoid malicious trickery.
This patch just adds an optional "size" out-pointer to
get_commit_buffer and friends. The existing callers all pass
NULL (there did not seem to be any obvious sites where we
could avoid an immediate strlen() call, though perhaps with
some further refactoring we could).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these sites assumes that commit->buffer is valid.
Since they would segfault if this was not the case, they are
likely to be correct in practice. However, we can
future-proof them by using get_commit_buffer.
And as a side effect, we abstract away the final bare uses
of commit->buffer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In both blame and merge-recursive, we sometimes create a
"fake" commit struct for convenience (e.g., to represent the
HEAD state as if we would commit it). By allocating
ourselves rather than using alloc_commit_node, we do not
properly set the "index" field of the commit. This can
produce subtle bugs if we then use commit-slab on the
resulting commit, as we will share the "0" index with
another commit.
We can fix this by using alloc_commit_node() to allocate.
Note that we cannot free the result, as it is part of our
commit allocator. However, both cases were already leaking
the allocated commit anyway, so there's nothing to fix up.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On a case-insensitive filesystem, when merging, a file would be
wrongly deleted from the working tree if an incoming commit had
renamed it changing only its case. When merging a rename, the file
with the old name would be deleted -- but since the filesystem
considers the old name to be the same as the new name, the new
file would in fact be deleted.
We avoid this by not deleting files that have a case-clone in the
index at stage 0.
Signed-off-by: David Turner <dturner@twitter.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach add_cacheinfo to tell make_cache_entry to skip refreshing stat
information when a file is missing from the work tree. We do not want
the index to be stat-dirty after the merge but also do not want to fail
when a file happens to be missing.
This fixes the 'merge-recursive w/ empty work tree - ours has rename'
case in t3030-merge-recursive.
Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the make_cache_entry boolean 'refresh' argument to a more
general 'refresh_options' argument. Pass the value through to the
underlying refresh_cache_ent call. Add option CE_MATCH_REFRESH to
enable stat refresh. Update call sites to use the new signature.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of returning magic integer values (which a couple of callers
go to the trouble of distinguishing), return values from an enum. Add
a docstring.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "diff-algorithm" option to the recursive merge strategy takes the
name of the algorithm as an option, but it uses strcmp on the option
string to check if it starts with "diff-algorithm=", meaning that this
options cannot actually be used.
Fix this by switching to prefixcmp. At the same time, clarify the
following line by using strlen instead of a hard-coded length, which
also makes it consistent with nearby code.
Reported-by: Luke Noel-Storr <luke.noel-storr@integrate.co.uk>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
While at there, move free_pathspec() to pathspec.c
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is
- diff-lib.c: setting CE_UPTODATE
- name-hash.c: setting CE_HASHED
- preload-index.c, read-cache.c, unpack-trees.c and
builtin/update-index: obvious
- entry.c: write_entry() may refresh the checked out entry via
fill_stat_cache_info(). This causes "non-const struct cache_entry
*" in builtin/apply.c, builtin/checkout-index.c and
builtin/checkout.c
- builtin/ls-files.c: --with-tree changes stagemask and may set
CE_UPDATE
Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.
So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.
The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:
diff --git a/cache.h b/cache.h
index 430d021..1692891 100644
--- a/cache.h
+++ b/cache.h
@@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
#define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
struct index_state {
- struct cache_entry **cache;
+ const struct cache_entry **cache;
unsigned int version;
unsigned int cache_nr, cache_alloc, cache_changed;
struct string_list *resolve_undo;
will help quickly identify them without bogus warnings.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since command line options have higher priority than config file
variables and taking previous commit into account, we need a way
how to specify myers algorithm on command line. However,
inventing `--myers` is not the right answer. We need far more
general option, and that is `--diff-algorithm`.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two different static functions and one global function,
all of them called "merge_file()", with different signatures and
purposes. Rename them all to reduce confusion in "git grep" output:
* Rename the static one in merge-index to "merge_one_path(const char
*path)" as that function is about asking an external command to
resolve conflicts in one path.
* Rename the global one in merge-file.c that is only used by
merge-tree to "merge_blobs()", as the function takes three blobs and
returns the merged result only in-core, without doing anything to
the filesystem.
* Rename the one in merge-recursive to "merge_one_file()", just to be
fair.
Also rename merge-file.[ch] to merge-blobs.[ch].
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In addition to updating the xstrdup(mkpath(...)) call sites with
mkpathdup(), we also fix a memory leak (in merge_3way()) caused by
neglecting to free the memory allocated to the 'base_name' variable.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function "merge_recursive" prints the count of common ancestors
as "found %u common ancestor(s):". We should use a singular and a
plural form of this message to help translators.
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
flush_buffer() is a thin wrapper around write_in_full() with two very
confusing properties:
* It runs a loop to handle short reads, ensuring that we write
everything. But that is precisely what write_in_full() does!
* It checks for a return value of 0 from write_in_full(), which cannot
happen: it returns this value only if count=0, but flush_buffer()
will never call write_in_full() in this case.
Remove it.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff_setup_done() has historically returned an error code, but lost
the last nonzero return in 943d5b7 (allow diff.renamelimit to be set
regardless of -M/-C, 2006-08-09). The callers were in a pretty
confused state: some actually checked for the return code, and some
did not.
Let it return void, and patch all callers to take this into account.
This conveniently also gets rid of a handful of different(!) error
messages that could never be triggered anyway.
Note that the function can still die().
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>