The `null_oid()` function returns the object ID that only consists of
zeroes. Naturally, this ID also depends on the hash algorithm used, as
the number of zeroes is different between SHA1 and SHA256. Consequently,
the function returns the hash-algorithm-specific null object ID.
This is currently done by depending on `the_hash_algo`, which implicitly
makes us depend on `the_repository`. Refactor the function to instead
pass in the hash algorithm for which we want to retrieve the null object
ID. Adapt callsites accordingly by passing in `the_repository`, thus
bubbling up the dependency on that global variable by one layer.
There are a couple of trivial exceptions for subsystems that already got
rid of `the_repository`. These subsystems instead use the repository
that is available via the calling context:
- "builtin/grep.c"
- "grep.c"
- "refs/debug.c"
There are also two non-trivial exceptions:
- "diff-no-index.c": Here we know that we may not have a repository
initialized at all, so we cannot rely on `the_repository`. Instead,
we adapt `diff_no_index()` to get a `struct git_hash_algo` as
parameter. The only caller is located in "builtin/diff.c", where we
know to call `repo_set_hash_algo()` in case we're running outside of
a Git repository. Consequently, it is fine to continue passing
`the_repository->hash_algo` even in this case.
- "builtin/ls-files.c": There is an in-flight patch series that drops
`USE_THE_REPOSITORY_VARIABLE` in this file, which causes a semantic
conflict because we use `null_oid()` in `show_submodule()`. The
value is passed to `repo_submodule_init()`, which may use the object
ID to resolve a tree-ish in the superproject from which we want to
read the submodule config. As such, the object ID should refer to an
object in the superproject, and consequently we need to use its hash
algorithm.
This means that we could in theory just not bother about this edge
case at all and just use `the_repository` in "diff-no-index.c". But
doing so would feel misdesigned.
Remove the `USE_THE_REPOSITORY_VARIABLE` preprocessor define in
"hash.c".
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The internals of the path diffing code, including ll_diff_tree_paths(),
all take an extra combine_diff_path parameter which they use as the tail
of a list of results, appending any new entries to it.
The public-facing diff_tree_paths() takes the same argument, but it just
makes the callers more awkward. They always start with a clean list, and
have to set up a fake head struct to pass in.
Let's keep the public API clean by always returning a new list. That
keeps the fake struct as an implementation detail of tree-diff.c.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want callers to use combine_diff_path_new() to allocate structs,
rather than using combine_diff_path_size() and xmalloc(). That gives us
more consistency over the initialization of the fields.
Now that the final external user of combine_diff_path_size() is gone, we
can stop declaring it publicly. And since our constructor is the only
caller, we can just inline it there.
Breaking the size computation into two parts also lets us reuse the
intermediate multiplication result of the parent length, since we need
to know it to perform our memset(). The result is a little easier to
read.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit d76ce4f734 (log,diff-tree: add --combined-all-paths option,
2019-02-07) added a "path" field to each combine_diff_parent struct.
It's defined as a strbuf, but this is overkill. We never manipulate the
buffer beyond inserting a single string into it.
And in fact there's a small bug: we zero the parent structs, including
the path strbufs. For the 0th parent, we strbuf_init() the strbuf before
adding to it. But for subsequent parents, we never do the init. This is
technically violating the strbuf API, though the code there is resilient
enough to handle this zero'd state.
This patch switches us to just store an allocated string pointer.
Zeroing it is enough to properly initialize it there (modulo the usual
assumption we make that a NULL pointer is all-zeroes).
And as a bonus, we can just check for a non-NULL value to see if it is
present, rather than repeating the combined_all_paths logic at each
site.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The combine_diff_path struct has variable size, since it embeds both the
memory allocation for the path field as well as a variable-sized parent
array. This makes allocating one a bit tricky.
We have a helper to compute the required size, but it's up to individual
sites to actually initialize all of the fields. Let's provide a
constructor function to make that a little nicer. Besides being shorter,
it also hides away tricky bits like the computation of the "path"
pointer (which is right after the "parent" flex array).
As a bonus, using the same constructor everywhere means that we'll
consistently initialize all parts of the struct. A few code paths left
the parent array unitialized. This didn't cause any bugs, but we'll be
able to simplify some code in the next few patches knowing that the
parent fields have all been zero'd.
This also gets rid of some questionable uses of "int" to store buffer
lengths. Though we do use them to allocate, I don't think there are any
integer overflow vulnerabilities here (the allocation helper promotes
them to size_t and checks arithmetic for overflow, and the actual memcpy
of the bytes is done using the possibly-truncated "int" value).
Sadly we can't use the FLEX_* macros to simplify the allocation here,
because there are two variable-sized parts to the struct (and those
macros only handle one).
Nor can we get stop publicly declaring combine_diff_path_size(). This
patch does not touch the code in path_appendnew() at all, which is not
ready to be moved to our new constructor for a few reasons:
- path_appendnew() has a memory-reuse optimization where it tries to
reuse combine_diff_path structs rather than freeing and
reallocating.
- path_appendnew() does not create the struct from a single path
string, but rather allocates and copies into the buffer from
multiple sources.
These can be addressed by some refactoring, but let's leave it as-is for
now.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `cnt` variable tracks the number of lines in a patch diff. It can
happen though that there are no newlines, in which case we'd still end
up allocating our array of `sline`s. In fact, we always allocate it with
`cnt + 2` entries: one extra entry for the deletion hunk at the end, and
another entry that we don't seem to ever populate at all but acts as a
kind of sentinel value.
When we loop through the array to clear it at the end of this function
we only loop until `lno < cnt`, and thus we may not end up releasing
whatever the two extra `sline`s contain. While that shouldn't matter for
the sentinel value, it does matter for the extra deletion hunk sline.
Regardless of that, plug this memory leak by releasing both extra
entries, which makes the logic a bit easier to reason about.
While at it, fix the formatting of a local comment, which incidentally
also provides the necessary context for why we overallocate the `sline`
array.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `orderfile` diff option is being assigned via `OPT_FILENAME()`,
which assigns an allocated string to the variable. We never free it
though, causing a memory leak.
Change the type of the string to `char *` and free it to plug the leak.
This also requires us to use `xstrdup()` to assign the global config to
it in case it is set.
This leak is being hit in t7621, but plugging it alone does not make the
test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.
It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.
Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit
For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).
Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `resolve_gitlink_ref()` we implicitly rely on `the_repository` to
look up the submodule ref store. Now that we can look up submodule ref
stores for arbitrary repositories we can improve this function to
instead accept a repository as parameter for which we want to resolve
the gitlink.
Do so and adjust callers accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code paths that call repo_read_object_file() have been
tightened to react to errors.
* js/check-null-from-read-object-file:
Always check the return value of `repo_read_object_file()`
There are a couple of places in Git's source code where the return value
is not checked. As a consequence, they are susceptible to segmentation
faults.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these were checked with
gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).
...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file. These cases were:
* builtin/credential-cache.c
* builtin/pull.c
* builtin/send-pack.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The vast majority of files including object-store.h did not need dir.h
nor khash.h. Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.
After this patch:
$ git grep -h include..object-store | sort | uniq -c
2 #include "object-store.h"
129 #include "object-store-ll.h"
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cache.h's nature of a dumping ground of includes prevented it from
being included in some compat/ files, forcing us into a workaround
of having a double forward declaration of the read_in_full() function
(see commit 14086b0a13 ("compat/pread.c: Add a forward declaration to
fix a warning", 2007-11-17)). Now that we have moved functions like
read_in_full() from cache.h to wrapper.h, and wrapper.h isn't littered
with unrelated and scary #defines, get rid of the extra forward
declaration and just have compat/pread.c include wrapper.h.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ab/remove-implicit-use-of-the-repository:
libs: use "struct repository *" argument, not "the_repository"
post-cocci: adjust comments for recent repo_* migration
cocci: apply the "revision.h" part of "the_repository.pending"
cocci: apply the "rerere.h" part of "the_repository.pending"
cocci: apply the "refs.h" part of "the_repository.pending"
cocci: apply the "promisor-remote.h" part of "the_repository.pending"
cocci: apply the "packfile.h" part of "the_repository.pending"
cocci: apply the "pretty.h" part of "the_repository.pending"
cocci: apply the "object-store.h" part of "the_repository.pending"
cocci: apply the "diff.h" part of "the_repository.pending"
cocci: apply the "commit.h" part of "the_repository.pending"
cocci: apply the "commit-reach.h" part of "the_repository.pending"
cocci: apply the "cache.h" part of "the_repository.pending"
cocci: add missing "the_repository" macros to "pending"
cocci: sort "the_repository" rules by header
cocci: fix incorrect & verbose "the_repository" rules
cocci: remove dead rule from "the_repository.pending.cocci"
As can easily be seen from grepping in our sources, we had these uses
of "the_repository" in various library code in cases where the
function in question was already getting a "struct repository *"
argument. Let's use that argument instead.
Out of these changes only the changes to "cache-tree.c",
"commit-reach.c", "shallow.c" and "upload-pack.c" would have cleanly
applied before the migration away from the "repo_*()" wrapper macros
in the preceding commits.
The rest aren't new, as we'd previously implicitly refer to
"the_repository", but it's now more obvious that we were doing the
wrong thing all along, and should have used the parameter instead.
The change to change "get_index_format_default(the_repository)" in
"read-cache.c" to use the "r" variable instead should arguably have
been part of [1], or in the subsequent cleanup in [2]. Let's do it
here, as can be seen from the initial code in [3] it's not important
that we use "the_repository" there, but would prefer to always use the
current repository.
This change excludes the "the_repository" use in "upload-pack.c"'s
upload_pack_advertise(), as the in-flight [4] makes that change.
1. ee1f0c242e (read-cache: add index.skipHash config option,
2023-01-06)
2. 6269f8eaad (treewide: always have a valid "index_state.repo"
member, 2023-01-17)
3. 7211b9e753 (repo-settings: consolidate some config settings,
2019-08-13)
4. <Y/hbUsGPVNAxTdmS@coredump.intra.peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the part of "the_repository.pending.cocci" pertaining to
"object-store.h".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the part of "the_repository.pending.cocci" pertaining to
"cache.h".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The diff code provides a format_callback interface, but not every
callback needs each parameter (e.g., the "opt" and "data" parameters are
frequently left unused). Likewise for the output_prefix callback, the
low-level change/add_remove interfaces, the callbacks used by
xdi_diff(), etc.
Mark unused arguments in the callback implementations to quiet
-Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Certain diff options are currently ignored when combined-diff is
shown; mark them as incompatible with the feature.
* rs/combine-diff-with-incompatible-options:
combine-diff: abort if --output is given
combine-diff: abort if --ignore-matching-lines is given
The code for combined diffs currently only writes to stdout. Abort and
report that fact instead of silently ignoring the --output option. The
(empty) output file has already been created at that point, though.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code for combined diffs doesn't currently support ignoring changes
that match a regex. Abort and report that fact instead of running into
a segfault.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite the backend for "diff -G/-S" to use pcre2 engine when
available.
* ab/pickaxe-pcre2: (22 commits)
xdiff-interface: replace discard_hunk_line() with a flag
xdiff users: use designated initializers for out_line
pickaxe -G: don't special-case create/delete
pickaxe -G: terminate early on matching lines
xdiff-interface: allow early return from xdiff_emit_line_fn
xdiff-interface: prepare for allowing early return
pickaxe -S: slightly optimize contains()
pickaxe: rename variables in has_changes() for brevity
pickaxe -S: support content with NULs under --pickaxe-regex
pickaxe: assert that we must have a needle under -G or -S
pickaxe: refactor function selection in diffcore-pickaxe()
perf: add performance test for pickaxe
pickaxe/style: consolidate declarations and assignments
diff.h: move pickaxe fields together again
pickaxe: die when --find-object and --pickaxe-all are combined
pickaxe: die when -G and --pickaxe-regex are combined
pickaxe tests: add missing test for --no-pickaxe-regex being an error
pickaxe tests: test for -G, -S and --find-object incompatibility
pickaxe tests: add test for "log -S" not being a regex
pickaxe tests: add test for diffgrep_consume() internals
...
Change the function prototype of xdiff_emit_line_fn to return an "int"
instead of "void". Change all of those functions to "return 0",
nothing checks those return values yet, and no behavior is being
changed.
In subsequent commits the interface will be changed to allow early
return via this new return value.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Up until recently, object IDs did not have an algorithm member, only a
hash. Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms. Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.
Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead. It shortens the code and infers the
element size automatically.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"log -c --find-object=X" did not work well to find a merge that
involves a change to an object X from only one parent.
* jk/diff-cc-oidfind-fix:
combine-diff: handle --find-object in multitree code path
When doing combined diffs, we have two possible code paths:
- a slower one which independently diffs against each parent, applies
any filters, and then intersects the resulting paths
- a faster one which walks all trees simultaneously
When the diff options specify that we must do certain filters, like
pickaxe, then we always use the slow path, since the pickaxe code only
knows how to handle filepairs, not the n-parent entries generated for
combined diffs.
But there are two problems with the slow path:
1. It's slow. Running:
git rev-list HEAD | git diff-tree --stdin -r -c
in git.git takes ~3s on my machine. But adding "--find-object" to
that increases it to ~6s, even though find-object itself should
incur only a few extra oid comparisons. On linux.git, it's even
worse: 35s versus 215s.
2. It doesn't catch all cases where a particular path is interesting.
Consider a merge with parent blobs X and Y for a particular path,
and end result Z. That should be interesting according to "-c",
because the result doesn't match either parent. And it should be
interesting even with "--find-object=X", because "X" went away in
the merge.
But because we perform each pairwise diff independently, this
confuses the intersection code. The change from X to Z is still
interesting according to --find-object. But in the other parent we
went from Y to Z, so the diff appears empty! That causes the
intersection code to think that parent didn't change the path, and
thus it's not interesting for "-c".
This patch fixes both by implementing --find-object for the multitree
code. It's a bit unfortunate that we have to duplicate some logic from
diffcore-pickaxe, but this is the best we can do for now. In an ideal
world, all of the diffcore code would stop thinking about filepairs and
start thinking about n-parent sets, and we could use the multitree walk
with all of it.
Until then, there are some leftover warts:
- other pickaxe operations, like -S or -G, still suffer from both
problems. These would be hard to adapt because they rely on having
a diff_filespec() for each path to look at content. And we'd need to
define what an n-way "change" means in each case (probably easy for
"-S", which can compare counts, but not so clear for -G, which is
about grepping diffs).
- other options besides --find-object may cause us to use the slow
pairwise path, in which case we'll go back to producing a different
(wrong) answer for the X/Y/Z case above.
We may be able to hack around these, but I think the ultimate solution
will be a larger rewrite of the diffcore code. For now, this patch
improves one specific case but leaves the rest.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of 'dense' argument that is redundant for every function that has
'struct rev_info *rev' argument as well, as the value of 'dense' passed is
always taken from 'rev->dense_combined_merges' field.
The only place where this was not the case is in 'submodule.c' where
'diff_tree_combined_merge()' was called with '1' for 'dense' argument. However,
at that call the 'revs' instance used is local to the function, and we now just
set 'revs->dense_combined_merges' to 1 in this local instance.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We renamed the actual data structure in 910650d2f8 (Rename sha1_array to
oid_array, 2017-03-31), but the file is still called sha1-array. Besides
being slightly confusing, it makes it more annoying to grep for leftover
occurrences of "sha1" in various files, because the header is included
in so many places.
Let's complete the transition by renaming the source and header files
(and fixing up a few comment references).
I kept the "-" in the name, as that seems to be our style; cf.
fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10).
We also have oidmap.h and oidset.h without any punctuation, but those
are "struct oidmap" and "struct oidset" in the code. We _could_ make
this "oidarray" to match, but somehow it looks uglier to me because of
the length of "array" (plus it would be a very invasive patch for little
gain).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Output from "diff --cc" did not show the original paths when the
merge involved renames. A new option adds the paths in the
original trees to the output.
* en/combined-all-paths:
log,diff-tree: add --combined-all-paths option
The combined diff format for merges will only list one filename, even if
rename or copy detection is active. For example, with raw format one
might see:
::100644 100644 100644 fabadb8cc95eb04866510 MM describe.c
::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM bar.sh
::100644 100644 100644 e07d6c5 9042e82 ee91881 RR phooey.c
This doesn't let us know what the original name of bar.sh was in the
first parent, and doesn't let us know what either of the original names
of phooey.c were in either of the parents. In contrast, for non-merge
commits, raw format does provide original filenames (and a rename score
to boot). In order to also provide original filenames for merge
commits, add a --combined-all-paths option (which must be used with
either -c or --cc, and is likely only useful with rename or copy
detection active) so that we can print tab-separated filenames when
renames are involved. This transforms the above output to:
::100644 100644 100644 fabadb8cc95eb04866510 MM desc.c desc.c desc.c
::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM foo.sh bar.sh bar.sh
::100644 100644 100644 e07d6c5 9042e82 ee91881 RR fooey.c fuey.c phooey.c
Further, in patch format, this changes the from/to headers so that
instead of just having one "from" header, we get one for each parent.
For example, instead of having
--- a/phooey.c
+++ b/phooey.c
we would see
--- a/fooey.c
--- a/fuey.c
+++ b/phooey.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff --color-moved --cc --stat -p" did not work well due to
funny interaction between a bug in color-moved and the rest, which
has been fixed.
* jk/diff-cc-stat-fixes:
combine-diff: treat --dirstat like --stat
combine-diff: treat --summary like --stat
combine-diff: treat --shortstat like --stat
combine-diff: factor out stat-format mask
diff: clear emitted_symbols flag after use
t4006: resurrect commented-out tests
Currently "--cc --dirstat" will show nothing for a merge. Like
--shortstat and --summary in the previous two patches, it probably makes
sense to treat it like we do --stat, and show a stat against the
first-parent.
This case is less obviously correct than for --shortstat and --summary,
as those are basically variants of --stat themselves. It's possible we
could develop a multi-parent combined dirstat format, in which case we
might regret defining this first-parent behavior. But the same could be
said for --stat, and in the 12+ years of it showing first-parent stats,
nobody has complained.
So showing the first-parent dirstat is at least _useful_, and if we
later develop a clever multi-parent stat format, we'd probably have to
deal with --stat anyway.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently "--cc --summary" on a merge shows nothing. Since we show "--cc
--stat" as a stat against the first parent, and because --summary is
typically used in combination with --stat, it makes sense to treat them
both the same way.
Note that we have to tweak t4013's setup a bit to test this case, as the
existing merges do not have any --summary results against their first
parent. But since the merge at the tip of 'master' does add and remove
files with respect to the second parent, we can just make a reversed
doppelganger merge where the parents are swapped.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --stat of a combined diff is defined as the first-parent stat,
going all the way back to 965f803c32 (combine-diff: show diffstat with
the first parent., 2006-04-17).
Naturally, we gave --numstat the same treatment in 74e2abe5b7 (diff
--numstat, 2006-10-12).
But --shortstat, which is really just the final line of --stat, does
nothing, which produces confusing results:
$ git show --oneline --stat eab7584e37eab7584e37 Merge branch 'en/show-ref-doc-fix'
Documentation/git-show-ref.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
$ git show --oneline --shortstat eab7584e37eab7584e37 Merge branch 'en/show-ref-doc-fix'
[nothing! We'd expect to see the "1 file changed..." line]
This patch teaches combine-diff to treats the two formats identically.
Reported-by: David Turner <novalis@novalis.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are several conditionals in the combine diff code that check if
we're doing --stat or --numstat output. Since these must all remain in
sync, let's pull them out into a separate bit-mask.
Arguably this could go into diff.h along with the other DIFF_FORMAT
macros, but it's not clear that the definition of "which formats are
stat" is universal (e.g., does --dirstat count? --summary?). So let's
keep this local to combine-diff.c, where the meaning is more clearly
"stat formats that combine-diff supports".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interface into "xdiff" library used to discover the offset and
size of a generated patch hunk by first formatting it into the
textual hunk header "@@ -n,m +k,l @@" and then parsing the numbers
out. A new interface has been introduced to allow callers a more
direct access to them.
* jk/xdiff-interface:
xdiff-interface: drop parse_hunk_header()
range-diff: use a hunk callback
diff: convert --check to use a hunk callback
combine-diff: use an xdiff hunk callback
diff: use hunk callback for word-diff
diff: discard hunk headers for patch-ids earlier
diff: avoid generating unused hunk header lines
xdiff-interface: provide a separate consume callback for hunks
xdiff: provide a separate emit callback for hunks