The ll_diff_tree_paths() function and its helpers all take a pointer to
a list tail, possibly add to it, and then return the new tail. This
works but has two downsides:
- The top-level caller (diff_tree_paths() in this case) has to make a
fake combine_diff_path struct to act as the list head. This is
especially weird here, as it's a flexible-sized struct which will
have an empty FLEX_ARRAY field. That used to be a portability
problem, though these days it is legal because our FLEX_ARRAY macro
over-allocates if necessary. It's still kind of ugly, though.
- Besides the name "tail", it's not immediately obvious that the entry
we pass around will not be examined by each function. Using a
pointer-to-pointer or similar makes it more obvious we only care
about the pointer itself, not its contents.
We can solve both by passing around a pointer to the tail instead. That
gets rid of the return value entirely, though note that because of the
recursion we actually need a three-star pointer for this to work.
The result is fairly readable, as we only need to dereference the tail
in one spot. If we wanted to make it simpler we could wrap the tail in a
struct, which we pass around.
Another option is to convert combine_diff to use our generic list_head
API. I tried that and found the result became much harder to read
overall. It means that _all_ code that looks at combine_diff_path
structs needs to be modified, since the "next" pointer is now inside a
list_head which has to be dereferenced with list_entry(). And we lose
some type safety, since we're just passing around a list_head struct
everywhere, and everybody who looks at it has to specify the type to
list_entry themselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In emit_path() we may append a new combine_diff_path entry to our list,
decide that we don't want it (because opt->pathchange() told us so) and
then roll it back.
Between the addition and the rollback, it doesn't matter if it's in the
list or not (no functions can even tell, since it's a singly-linked list
and we pass around just the tail entry).
So it's much simpler to just wait until opt->pathchange() tells us
whether to keep it, and either attach it (or free it) then. We do still
have to allocate it up front since it's that struct itself which is
passed to the pathchange callback.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ll_diff_tree_paths() function and its helpers all append to a
running list by taking in a pointer to the old tail and returning the
new tail. But they just call this argument "p", which is not very
descriptive.
It gets particularly confusing in emit_path(), where we actually add to
the list, because "p" does double-duty: it is the tail of the list, but
it is also the entry which we add. Except that in some cases we _don't_
add a new entry (or we might even add it and roll it back) if the path
isn't interesting. At first glance, this makes it look like a bug that
we pass "p" on to ll_diff_tree_paths() to recurse; sometimes it is
getting the new entry we made and sometimes not!
But it's not a bug, because ll_diff_tree_paths() does not care about the
entry itself at all. It is only using its "next" pointer as the tail of
the list.
Let's swap out "p" for "tail" to make this obvious. And then in
emit_path() we'll continue to use "p" for our newly allocated entry.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The internals of the path diffing code, including ll_diff_tree_paths(),
all take an extra combine_diff_path parameter which they use as the tail
of a list of results, appending any new entries to it.
The public-facing diff_tree_paths() takes the same argument, but it just
makes the callers more awkward. They always start with a clean list, and
have to set up a fake head struct to pass in.
Let's keep the public API clean by always returning a new list. That
keeps the fake struct as an implementation detail of tree-diff.c.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our path_appendnew() has been simplified to the point that it is mostly
just implementing combine_diff_path_new(), plus setting the "next"
pointer. Since there's only one caller, let's replace it completely with
a call to that helper function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When diffing trees, we'll have a strbuf "base" containing the
slash-separted names of our parent trees, and a "path" string
representing an entry name from the current tree. We pass these
separately to path_appendnew(), which combines them to form a single
path string in the combine_diff_path struct.
Instead, let's append the path string to our base strbuf ourselves, pass
in the result, and then roll it back with strbuf_setlen(). This lets us
simplify path_appendnew() a bit, enabling further refactoring.
And while it might seem like this causes extra wasted allocations, it
does not in practice. We reuse the same strbuf for each tree entry, so
we only have to allocate it to match the largest name. Plus, in a
recursive diff we'll end up doing this same operation to extend the base
for the next level of recursion. So we're really just incurring a small
memcpy().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're diffing trees, we create a list of combine_diff_path structs
that represent changed paths. We allocate each struct and add it to the
list with path_appendnew(), which we then feed to opt->pathchange().
That function tells us whether the path is of interest or not; if not,
then we can throw away the struct we allocated.
So there's an optimization to avoid extra allocations: instead of
throwing away the new entry, we try to reuse it. If it was large enough
to store the next path we care about, we can do so. And if not, we fall
back to freeing and re-allocating a new struct.
This comes from 72441af7c4 (tree-diff: rework diff_tree() to generate
diffs for multiparent cases as well, 2014-04-07), where the goal was to
have even the 2-parent diff code use the combine-diff infrastructure,
but without taking a performance hit.
The implementation causes some complexities in the interface (as we
store the allocation length inside the "next" pointer), and prevents us
from using the regular combine_diff_path_new() constructor. The
complexity is mostly contained inside two functions, but it's worth
re-evaluating how much it's helping.
That commit claims it helps ~1% on generating two-parent diffs in
linux.git. Here are the timings I get on the same command today ("old"
is the current tip of master, and "new" has this patch applied):
Benchmark 1: ./git.old log --raw --no-abbrev --no-renames v3.10..v3.11
Time (mean ± σ): 532.9 ms ± 5.8 ms [User: 472.7 ms, System: 59.6 ms]
Range (min … max): 525.9 ms … 543.3 ms 10 runs
Benchmark 2: ./git.new log --raw --no-abbrev --no-renames v3.10..v3.11
Time (mean ± σ): 538.3 ms ± 5.7 ms [User: 478.0 ms, System: 59.7 ms]
Range (min … max): 528.5 ms … 545.3 ms 10 runs
Summary
./git.old log --raw --no-abbrev --no-renames v3.10..v3.11 ran
1.01 ± 0.02 times faster than ./git.new log --raw --no-abbrev --no-renames v3.10..v3.11
So we do end up on average 1% faster, but with 2% of noise. I tried to
focus more on diff performance by running the commit traversal
separately, like:
git rev-list v3.10..v3.11 >in
and then timing just the diffs:
Benchmark 1: ./git.old diff-tree --stdin -r <in
Time (mean ± σ): 415.7 ms ± 5.8 ms [User: 357.7 ms, System: 58.0 ms]
Range (min … max): 410.9 ms … 430.3 ms 10 runs
Benchmark 2: ./git.new diff-tree --stdin -r <in
Time (mean ± σ): 418.5 ms ± 2.1 ms [User: 361.7 ms, System: 56.6 ms]
Range (min … max): 414.9 ms … 421.3 ms 10 runs
Summary
./git.old diff-tree --stdin -r <in ran
1.01 ± 0.02 times faster than ./git.new diff-tree --stdin -r <in
That gets roughly the same result.
Adding in "-c" to do multi-parent diffs doesn't change much:
Benchmark 1: ./git.old diff-tree --stdin -r -c <in
Time (mean ± σ): 525.3 ms ± 6.6 ms [User: 470.0 ms, System: 55.1 ms]
Range (min … max): 508.4 ms … 531.0 ms 10 runs
Benchmark 2: ./git.new diff-tree --stdin -r -c <in
Time (mean ± σ): 532.3 ms ± 6.2 ms [User: 469.0 ms, System: 63.1 ms]
Range (min … max): 520.3 ms … 539.4 ms 10 runs
Summary
./git.old diff-tree --stdin -r -c <in ran
1.01 ± 0.02 times faster than ./git.new diff-tree --stdin -r -c <in
And of course if you add in a lot more work by doing actual
content-level diffs, any difference is lost entirely (here the newer
version is actually faster, but that's really just noise):
Benchmark 1: ./git.old diff-tree --stdin -r --cc <in
Time (mean ± σ): 11.571 s ± 0.064 s [User: 11.287 s, System: 0.283 s]
Range (min … max): 11.497 s … 11.615 s 3 runs
Benchmark 2: ./git.new diff-tree --stdin -r --cc <in
Time (mean ± σ): 11.466 s ± 0.109 s [User: 11.108 s, System: 0.357 s]
Range (min … max): 11.346 s … 11.560 s 3 runs
Summary
./git.new diff-tree --stdin -r --cc <in ran
1.01 ± 0.01 times faster than ./git.old diff-tree --stdin -r --cc <in
So my conclusion is that it probably does help a little, but it's mostly
lost in the noise. I could see an argument for keeping it, as the
complexity is hidden away in functions that do not often need to be
touched. But it does make them more confusing than necessary (despite
some detailed explanations from the author of that commit; it just took
me a while to wrap my head around what was going on) and prevents
further refactoring of the combine_diff_path struct. So let's drop it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the other functions which allocate a combine_diff_path struct
zero out the parent array, but this code path does not. There's no bug,
since our caller will fill in most of the fields. But leaving the unused
fields (like combine_diff_parent.path) uninitialized makes working with
the struct more error-prone than it needs to be.
Let's just zero the parent field to be consistent with the
combine_diff_path_new() allocator.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In "environment.h" we have quite a lot of functions and variables that
either explicitly or implicitly depend on `the_repository`.
The implicit set of stateful declarations includes for example variables
which get populated when parsing a repository's Git configuration. This
set of variables is broken by design, as their state often depends on
the last repository config that has been parsed. So they may or may not
represent the state of `the_repository`.
Fixing that is quite a big undertaking, and later patches in this series
will demonstrate a solution for a first small set of those variables. So
for now, let's guard these with `USE_THE_REPOSITORY_VARIABLE` so that
callers are aware of the implicit dependency.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "hash-ll.h" header was introduced via d1cbe1e6d8 (hash-ll.h: split
out of hash.h to remove dependency on repository.h, 2023-04-22) to make
explicit the split between hash-related functions that rely on the
global `the_repository`, and those that don't. This split is no longer
necessary now that we we have removed the reliance on `the_repository`.
Merge "hash-ll.h" back into "hash.h". This causes some code units to not
include "repository.h" anymore, which requires us to add some forward
declarations.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When diffing trees, we recurse to handle subtrees. That means we may run
out of stack space and segfault. Let's teach this code path about
core.maxTreeDepth in order to fail more gracefully.
As with the previous patch, we have no way to return an error (and other
tree-loading problems would just cause us to die()). So we'll likewise
call die() if we exceed the maximum depth.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code simplification.
* jc/tree-walk-drop-base-offset:
tree-walk: drop unused base_offset from do_match()
tree-walk: lose base_offset that is never used in tree_entry_interesting
The tree_entry_interesting() function takes base_offset, allowing
its callers to potentially pass a non-zero number to skip the early
part of the path string.
The feature is never exercised and we do not even know what bugs are
lurking there, as all callers pass 0 to the parameter.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This also made it clear that several .c files depended upon various
things that oidset included, but had omitted the direct #include for
those headers. Add those now.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
S_DIFFTREE_IFXMIN_NEQ is *only* used in tree-diff.c, so there is no
point exposing it in cache.h. Move it to tree-diff.c.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code that was added in 8f4f8f4579 (guard against new pathspec
magic in pathspec matching code, 2013-07-14) to use the BUG() macro
instead of emitting a "fatal" message with the "__FILE__"-name and
"__LINE__"-numbers.
The original code predated the existence of the BUG() function, which
was added in d8193743e0 (usage.c: add BUG() function, 2017-05-12).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
b8ba412bf7 (tree-diff: avoid alloca for large allocations, 2016-06-07)
adds a way to route some bigger allocations out of the stack and free
them through the addition of two conveniently named macros, but leaves
the calls to free the xalloca part, which could be also in the heap,
if the system doesn't HAVE_ALLOCA_H (ex: macOS and other BSD).
Add the missing free call, xalloca_free(), which is a noop if we
allocated memory in the stack frame, but a real free() if we
allocated in the heap instead, and while at it, change the expression
to match in both macros for ease of readability.
This avoids a leak reported by LSAN while running t0000 but that
wouldn't fail the test (which is fixed in the next patch):
SUMMARY: LeakSanitizer: 1034 byte(s) leaked in 15 allocation(s).
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Up until recently, object IDs did not have an algorithm member, only a
hash. Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms. Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.
Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit e3696980 (diff: halt tree-diff early after max_changes,
2020-03-30) intended to create a mechanism to short-circuit a diff
calculation after a certain number of paths were modified. By
incrementing a "num_changes" counter throughout the recursive
ll_diff_tree_paths(), this was supposed to match the number of changes
that would be written into the changed-path Bloom filters.
Unfortunately, this was not implemented correctly and instead misses
simple cases like file modifications. This then does not stop very
large changed-path filters from being written (unless they add or remove
many files).
To start, change the implementation in ll_diff_tree_paths() to instead
use the global diff_queue_diff struct's 'nr' member as the count. This
is a way to simplify the logic instead of making more mistakes in the
complicated diff code.
This has a drawback: the diff_queue_diff struct only lists the paths
corresponding to blob changes, not their leading directories. Thus,
get_or_compute_bloom_filter() needs an additional check to see if the
hashmap with the leading directories becomes too large.
One reason why this was not caught by test cases was that the test in
t4216-log-bloom.sh that was supposed to check this "too many changes"
condition only checked this on the initial commit of a repository. The
old logic counted these values correctly. Update this test in a few
ways:
1. Use GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS to reduce the limit,
allowing smaller commits to engage with this logic.
2. Create several interesting cases of edits, adds, removes, and mode
changes (in the second commit). By testing both sides of the
inequality with the *_MAX_CHANGED_PATHS variable, we can see that
the count is exactly correct, so none of these changes are missed
or over-counted.
3. Use the trace2 data value filter_found_large to verify that these
commits are on the correct side of the limit.
Another way to verify the behavior is correct is through performance
tests. By testing on my local copies of the Git repository and the Linux
kernel repository, I could measure the effect of these short-circuits
when computing a fresh commit-graph file with changed-path Bloom filters
using the command
GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS=N time \
git commit-graph write --reachable --changed-paths
and reporting the wall time and resulting commit-graph size.
For Git, the results are
| | N=1 | N=10 | N=512 |
|--------|----------------|----------------|----------------|
| HEAD~1 | 10.90s 9.18MB | 11.11s 9.34MB | 11.31s 9.35MB |
| HEAD | 9.21s 8.62MB | 11.11s 9.29MB | 11.29s 9.34MB |
For Linux, the results are
| | N=1 | N=20 | N=512 |
|--------|----------------|---------------|---------------|
| HEAD~1 | 61.28s 64.3MB | 76.9s 72.6MB | 77.6s 72.6MB |
| HEAD | 49.44s 56.3MB | 68.7s 65.9MB | 69.2s 65.9MB |
Naturally, the improvement becomes much less as the limit grows, as
fewer commits satisfy the short-circuit.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ll_diff_tree_oid() has only ever returned 0 [1], so it's return value
is basically useless. It's only caller diff_tree_oid() has only ever
returned the return value of ll_diff_tree_oid() as-is [2], so its
return value is just as useless. Most of diff_tree_oid()'s callers
simply ignore its return value, except:
- diff_root_tree_oid() is a thin wrapper around diff_tree_oid() and
returns with its return value, but all of diff_root_tree_oid()'s
callers ignore its return value.
- rev_compare_tree() and rev_same_tree_as_empty() do look at the
return value in a condition, but, since the return value is always
0, the former's < 0 condition is never fulfilled, while the
latter's >= 0 condition is always fulfilled.
So let's drop the return value of ll_diff_tree_oid(), diff_tree_oid()
and diff_root_tree_oid(), and drop those conditions from
rev_compare_tree() and rev_same_tree_as_empty() as well.
[1] ll_diff_tree_oid() and its ancestors have been returning only 0
ever since it was introduced as diff_tree() in 9174026cfe (Add
"diff-tree" program to show which files have changed between two
trees., 2005-04-09).
[2] diff_tree_oid() traces back to diff-tree.c:main() in 9174026cfe as
well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When computing the changed-paths bloom filters for the commit-graph,
we limit the size of the filter by restricting the number of paths
in the diff. Instead of computing a large diff and then ignoring the
result, it is better to halt the diff computation early.
Create a new "max_changes" option in struct diff_options. If non-zero,
then halt the diff computation after discovering strictly more changed
paths. This includes paths corresponding to trees that change.
Use this max_changes option in the bloom filter calculations. This
reduces the time taken to compute the filters for the Linux kernel
repo from 2m50s to 2m35s. On a large internal repository with ~500
commits that perform tree-wide changes, the time reduced from
6m15s to 3m48s.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While at there, clean up the_repo usage in builtin/merge-tree.c a tiny
bit.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
struct diff_filespec defines mode to be an 'unsigned short'. Several
other places in the API which we'd like to interact with using a
diff_filespec used a plain unsigned (or unsigned int). This caused
problems when taking addresses, so switch to unsigned short.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to walk tree objects has been taught that we may be
working with object names that are not computed with SHA-1.
* bc/tree-walk-oid:
cache: make oidcpy always copy GIT_MAX_RAWSZ bytes
tree-walk: store object_id in a separate member
match-trees: use hashcpy to splice trees
match-trees: compute buffer offset correctly when splicing
tree-walk: copy object ID before use
When parsing a tree, we read the object ID directly out of the tree
buffer. This is normally fine, but such an object ID cannot be used with
oidcpy, which copies GIT_MAX_RAWSZ bytes, because if we are using SHA-1,
there may not be that many bytes to copy.
Instead, store the object ID in a separate struct member. Since we can
no longer efficiently compute the path length, store that information as
well in struct name_entry. Ensure we only copy the object ID into the
new buffer if the path length is nonzero, as some callers will pass us
an empty path with no object ID following it, and we will not want to
read past the end of the buffer.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to support :(attr) when matching pathspec on a tree,
tree_entry_interesting() needs to take an index (because
git_check_attr() needs it). This is the preparation step for it. This
also makes it clearer what index we fall back to when looking up
attributes during an unpack-trees operation: the source index.
This also fixes revs->pruning.repo initialization that should have
been done in 2abf350385 (revision.c: remove implicit dependency on
the_index - 2018-09-21). Without it, skip_uninteresting() will
dereference a NULL pointer through this call chain
get_revision(revs)
get_revision_internal
get_revision_1
try_to_simplify_commit
rev_compare_tree
diff_tree_oid(..., &revs->pruning)
ll_diff_tree_oid
diff_tree_paths
ll_diff_tree
skip_uninteresting
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various codepaths in the core-ish part learn to work on an
arbitrary in-core index structure, not necessarily the default
instance "the_index".
* nd/the-index: (23 commits)
revision.c: reduce implicit dependency the_repository
revision.c: remove implicit dependency on the_index
ws.c: remove implicit dependency on the_index
tree-diff.c: remove implicit dependency on the_index
submodule.c: remove implicit dependency on the_index
line-range.c: remove implicit dependency on the_index
userdiff.c: remove implicit dependency on the_index
rerere.c: remove implicit dependency on the_index
sha1-file.c: remove implicit dependency on the_index
patch-ids.c: remove implicit dependency on the_index
merge.c: remove implicit dependency on the_index
merge-blobs.c: remove implicit dependency on the_index
ll-merge.c: remove implicit dependency on the_index
diff-lib.c: remove implicit dependency on the_index
read-cache.c: remove implicit dependency on the_index
diff.c: remove implicit dependency on the_index
grep.c: remove implicit dependency on the_index
diff.c: remove the_index dependency in textconv() functions
blame.c: rename "repo" argument to "r"
combine-diff.c: remove implicit dependency on the_index
...
A new variant repo_diff_setup() is added that takes 'struct repository *'
and diff_setup() becomes a thin macro around it that is protected by
NO_THE_REPOSITORY_COMPATIBILITY_MACROS, similar to NO_THE_INDEX_....
The plan is these macros will always be defined for all library files
and the macros are only accessible in builtin/
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the flip side of the previous two patches: checking
for a non-zero oidcmp() can be more strictly expressed as
inequality. Like those patches, we write "!= 0" in the
coccinelle transformation, which covers by isomorphism the
more common:
if (oidcmp(E1, E2))
As with the previous two patches, this patch can be achieved
almost entirely by running "make coccicheck"; the only
differences are manual line-wrap fixes to match the original
code.
There is one thing to note for anybody replicating this,
though: coccinelle 1.0.4 seems to miss the case in
builtin/tag.c, even though it's basically the same as all
the others. Running with 1.0.7 does catch this, so
presumably it's just a coccinelle bug that was fixed in the
interim.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the few conditional uses of FREE_AND_NULL(x) to be
unconditional. As noted in the standard[1] free(NULL) is perfectly
valid, so we might as well leave this check up to the C library.
1. http://pubs.opengroup.org/onlinepubs/9699919799/functions/free.html
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the `DIFF_OPT_SET` macro and instead set the flags directly.
This conversion is done using the following semantic patch:
@@
expression E;
identifier fld;
@@
- DIFF_OPT_SET(&E, fld)
+ E.flags.fld = 1
@@
type T;
T *ptr;
identifier fld;
@@
- DIFF_OPT_SET(ptr, fld)
+ ptr->flags.fld = 1
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the `DIFF_OPT_TST` macro and instead access the flags directly.
This conversion is done using the following semantic patch:
@@
expression E;
identifier fld;
@@
- DIFF_OPT_TST(&E, fld)
+ E.flags.fld
@@
type T;
T *ptr;
identifier fld;
@@
- DIFF_OPT_TST(ptr, fld)
+ ptr->flags.fld
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All callers of fill_tree_descriptor() have been converted to object_id
already, so convert that function as well. As a nice side-effect we get
rid of NULL checks in tree-diff.c, as fill_tree_descriptor() already
does them for us.
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The object_id pointers can be NULL for invalid entries. Don't try to
dereference them and pass NULL along to fill_tree_descriptor() instead,
which handles them just fine.
Found with Clang's UBSan.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.
* ab/free-and-null:
*.[ch] refactoring: make use of the FREE_AND_NULL() macro
coccinelle: make use of the "expression" FREE_AND_NULL() rule
coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
coccinelle: make use of the "type" FREE_AND_NULL() rule
coccinelle: add a rule to make "type" code use FREE_AND_NULL()
git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert diff_change to take a struct object_id. In addition convert the
function pointer type 'change_fn_t' to also take a struct object_id.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert diff_addremove to take a struct object_id. In addtion convert
the function pointer type 'add_remove_fn_t' to also take a struct
object_id.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>