Now that struct object_id has an algorithm field, we should populate it.
This will allow us to handle object IDs in any supported algorithm and
distinguish between them. Ensure that the field is written whenever we
write an object ID by storing it explicitly every time we write an
object. Set values for the empty blob and tree values as well.
In addition, use the algorithm field to compare object IDs. Note that
because we zero-initialize struct object_id in many places throughout
the codebase, we default to the default algorithm in cases where the
algorithm field is zero rather than explicitly initialize all of those
locations.
This leads to a branch on every comparison, but the alternative is to
compare the entire buffer each time and padding the buffer for SHA-1.
That alternative ranges up to 3.9% worse than this approach on the perf
t0001, t1450, and t1451.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we'll want oidread to automatically set the hash
algorithm member for an object ID we read into it, so ensure we use
oidread instead of hashcpy everywhere we're copying a hash value into a
struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead. It shortens the code and infers the
element size automatically.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Config parser fix for "git notes".
* na/notes-displayref-is-not-boolean:
t3301: test proper exit response to no-value notes.displayRef.
notes.c: fix a segfault in notes_display_config()
If notes.displayRef is configured with no value[1], control should be
returned to the caller when notes.c:notes_display_config() checks if 'v'
is NULL. Otherwise, both git log --notes and git diff-tree --notes will
subsequently segfault when refs.h:has_glob_specials() calls strpbrk()
with a NULL first argument.
[1] Examples:
.git/config:
[notes]
displayRef
$ git -c notes.displayRef [...]
Signed-off-by: Nate Avers <nate@roosteregg.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to automatically shrink the fan-out in the notes tree had
an off-by-one bug, which has been killed.
* jh/notes-fanout-fix:
notes.c: fix off-by-one error when decreasing notes fanout
t3305: check notes fanout more carefully and robustly
Add a function for inserting a C string into a strbuf. Use it
throughout the source to get rid of magic string length constants and
explicit strlen() calls.
Like strbuf_addstr(), implement it as an inline function to avoid the
implicit strlen() calls to cause runtime overhead.
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As noted in the previous commit, the nature of the fanout heuristic
in the notes code causes the exact point at which we increase or
decrease the notes fanout to vary with the objects being annotated.
Since the object ids generated by the test environment are
deterministic (by design), the notes generated and tested by t3305
are always the same, and we therefore happen to see the same fanout
behavior from one run to the next.
Coincidentally, if we were to change the test environment slightly
(say by making a test commit on an unrelated branch before we start
the t3305 test proper), we not only see the fanout switch happen at
different points, we also manage to trigger a _bug_ in the notes
code where the fanout 1 -> 0 switch is not applied uniformly across
the notes tree, but instead yields a notes tree like this:
...
bdeafb301e44b0e4db0f738a2d2a7beefdb70b70
bff2d39b4f7122bd4c5caee3de353a774d1e632a
d3/8ec8f851adf470131178085bfbaab4b12ad2a7
e0b173960431a3e692ae929736df3c9b73a11d5b
eb3c3aede523d729990ac25c62a93eb47c21e2e3
...
The bug occurs when we are writing out a notes tree with a newly
decreased fanout, and the notes tree contains unexpanded subtrees
that should be consolidated into the parent tree as a consequence of
the decreased fanout):
Subtrees that happen to sit at an _even_ level in the internal notes
16-tree structure (in other words: subtrees whose path - "d3" in the
example above - is unique in the first nibble - i.e. there are no
other note paths that start with "d") are _not_ unpacked as part of
the tree writeout. This error will repeat itself in subsequent note
trees until the subtree is forced to be unpacked. In t3305 this only
happens when the d38ec8f8 note is itself removed from the tree.
The error is not severe (no information is lost, and the notes code
is able to read/decode this tree and manipulate it correctly), but
this is nonetheless a bug in the current implementation that should
be fixed.
That said, fixing the off-by-one error is not without complications:
We must take into account that the load_subtree() call from
for_each_note_helper() (that is now done to correctly unpack the
subtree while we're writing out the notes tree) may end up inserting
unpacked non-notes into the linked list of non_note entries held by
the struct notes_tree. Since we are in the process of writing out the
notes tree, this linked list is currently in the process of being
traversed by write_each_non_note_until(). The unpacked non-notes are
necessarily inserted between the last non-note we wrote out, and the
next non-note to be written. Hence, we cannot simply hold the
next_non_note to write in struct write_each_note_data (as we would
then silently skip these newly inserted notes), but must instead
always follow the ->next pointer from the last non-note we wrote.
(This part was caught by an existing test in t3304.)
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Brian M. Carlson <sandals@crustytoothpaste.net>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We often skip an optional prefix in a string with a hardcoded
constant, e.g.
if (starts_with(string, "prefix"))
string += 6;
which is less error prone when written
skip_prefix(string, "prefix", &string);
Note that this changes a few error messages from "git reflog expire
--expire=nonsense.timestamp", which used to complain by saying
'--expire=nonsense.timestamp' is not a valid timestamp
but with this change, we say
'nonsense.timestamp' is not a valid timestamp
which is more technically correct (the string with --expire= as
a prefix obviously cannot be a valid timestamp, but the error is
about the part of the input without that prefix).
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git format-patch" can take a set of configured format.notes values
to specify which notes refs to use in the log message part of the
output. The behaviour of this was not consistent with multiple
--notes command line options, which has been corrected.
* dl/format-patch-notes-config-fixup:
notes.h: fix typos in comment
notes: break set_display_notes() into smaller functions
config/format.txt: clarify behavior of multiple format.notes
format-patch: move git_config() before repo_init_revisions()
format-patch: use --notes behavior for format.notes
notes: extract logic into set_display_notes()
notes: create init_display_notes() helper
notes: rename to load_display_notes()
In 8164c961e1 (format-patch: use --notes behavior for format.notes,
2019-12-09), we introduced set_display_notes() which was a monolithic
function with three mutually exclusive branches. Break the function up
into three small and simple functions that each are only responsible for
one task.
This family of functions accepts an `int *show_notes` instead of
returning a value suitable for assignment to `show_notes`. This is for
two reasons. First of all, this guarantees that the external
`show_notes` variable changes in lockstep with the
`struct display_notes_opt`. Second, this prompts future developers to be
careful about doing something meaningful with this value. In fact, a
NULL check is intentionally omitted because causing a segfault here
would tell the future developer that they are meant to use the value for
something meaningful.
One alternative was making the family of functions accept a
`struct rev_info *` instead of the `struct display_notes_opt *`, since
the former contains the `show_notes` field as well. This does not work
because we have to call git_config() before repo_init_revisions().
However, if we had a `struct rev_info`, we'd need to initialize it before
it gets assigned values from git_config(). As a result, we break the
circular dependency by having standalone `int show_notes` and
`struct display_notes_opt notes_opt` variables which temporarily hold
values from git_config() until the values are copied over to `rev`.
To implement this change, we need to get a pointer to
`rev_info::show_notes`. Unfortunately, this is not possible with
bitfields and only direct-assignment is possible. Change
`rev_info::show_notes` to a non-bitfield int so that we can get its
address.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of open coding the logic that tweaks the variables in
`struct display_notes_opt` within handle_revision_opt(), abstract away the
logic into set_display_notes() so that it can be reused.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently open code the initialization for revs->notes_opt. Abstract
this away into a helper function so that the logic can be reused in a
future commit.
This is slightly wasteful as we memset the struct twice but this is only
run once so it shouldn't have any major effect.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to the function comment, init_display_notes() was supposed to
"Load the notes machinery for displaying several notes trees." Rename
this function to load_display_notes() so that its use is more accurately
represented.
This is done because, in a future commit, we will reuse the name
init_display_notes().
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few implementation fixes in the notes API.
* mh/notes-duplicate-entries:
notes: avoid potential use-after-free during insertion
notes: avoid leaking duplicate entries
The note_tree_insert() function may free the leaf_node struct we pass in
(e.g., if it's a duplicate, or if it needs to be combined with an
existing note).
Most callers are happy with this, as they assume that ownership of the
struct is handed off. But in load_subtree(), if we see an error we'll
use the handed-off struct's key_oid to generate the die() message,
potentially accessing freed memory.
We can easily fix this by instead using the original oid that we copied
into the leaf_node struct.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When add_note is called multiple times with the same key/value pair, the
leaf_node it creates is leaked by notes_tree_insert.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While at there, clean up the_repo usage in builtin/merge-tree.c a tiny
bit.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git merge-recursive" backend recently learned a new heuristics to
infer file movement based on how other files in the same directory
moved. As this is inherently less robust heuristics than the one
based on the content similarity of the file itself (rather than
based on what its neighbours are doing), it sometimes gives an
outcome unexpected by the end users. This has been toned down to
leave the renamed paths in higher/conflicted stages in the index so
that the user can examine and confirm the result.
* en/merge-directory-renames:
merge-recursive: switch directory rename detection default
merge-recursive: give callers of handle_content_merge() access to contents
merge-recursive: track information associated with directory renames
t6043: fix copied test description to match its purpose
merge-recursive: switch from (oid,mode) pairs to a diff_filespec
merge-recursive: cleanup handle_rename_* function signatures
merge-recursive: track branch where rename occurred in rename struct
merge-recursive: remove ren[12]_other fields from rename_conflict_info
merge-recursive: shrink rename_conflict_info
merge-recursive: move some struct declarations together
merge-recursive: use 'ci' for rename_conflict_info variable name
merge-recursive: rename locals 'o' and 'a' to 'obuf' and 'abuf'
merge-recursive: rename diff_filespec 'one' to 'o'
merge-recursive: rename merge_options argument from 'o' to 'opt'
Use 'unsigned short' for mode, like diff_filespec does
struct diff_filespec defines mode to be an 'unsigned short'. Several
other places in the API which we'd like to interact with using a
diff_filespec used a plain unsigned (or unsigned int). This caused
problems when taking addresses, so switch to unsigned short.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the uses of sha1_to_hex in this function with hash_to_hex to
allow the use of SHA-256 as well. Rename some variables since this code
is no longer limited to SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch out various uses of the GIT_SHA1_* constants with GIT_MAX_*
constants for allocations and the_hash_algo for general parsing. Update
a comment to no longer be SHA-1 specific.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing a tree, we read the object ID directly out of the tree
buffer. This is normally fine, but such an object ID cannot be used with
oidcpy, which copies GIT_MAX_RAWSZ bytes, because if we are using SHA-1,
there may not be that many bytes to copy.
Instead, store the object ID in a separate struct member. Since we can
no longer efficiently compute the path length, store that information as
well in struct name_entry. Ensure we only copy the object ID into the
new buffer if the path length is nonzero, as some callers will pass us
an empty path with no object ID following it, and we will not want to
read past the end of the buffer.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the flip side of the previous two patches: checking
for a non-zero oidcmp() can be more strictly expressed as
inequality. Like those patches, we write "!= 0" in the
coccinelle transformation, which covers by isomorphism the
more common:
if (oidcmp(E1, E2))
As with the previous two patches, this patch can be achieved
almost entirely by running "make coccicheck"; the only
differences are manual line-wrap fixes to match the original
code.
There is one thing to note for anybody replicating this,
though: coccinelle 1.0.4 seems to miss the case in
builtin/tag.c, even though it's basically the same as all
the others. Running with 1.0.7 does catch this, so
presumably it's just a coccinelle bug that was fixed in the
interim.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the partner patch to the previous one, but covering
the "hash" variants instead of "oid". Note that our
coccinelle rule is slightly more complex to avoid triggering
the call in hasheq().
I didn't bother to add a new rule to convert:
- hasheq(E1->hash, E2->hash)
+ oideq(E1, E2)
Since these are new functions, there won't be any such
existing callers. And since most of the code is already
using oideq, we're not likely to introduce new ones.
We might still see "!hashcmp(E1->hash, E2->hash)" from topics
in flight. But because our new rule comes after the existing
ones, that should first get converted to "!oidcmp(E1, E2)"
and then to "oideq(E1, E2)".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the more restrictive oideq() should, in the long run,
give the compiler more opportunities to optimize these
callsites. For now, this conversion should be a complete
noop with respect to the generated code.
The result is also perhaps a little more readable, as it
avoids the "zero is equal" idiom. Since it's so prevalent in
C, I think seasoned programmers tend not to even notice it
anymore, but it can sometimes make for awkward double
negations (e.g., we can drop a few !!oidcmp() instances
here).
This patch was generated almost entirely by the included
coccinelle patch. This mechanical conversion should be
completely safe, because we check explicitly for cases where
oidcmp() is compared to 0, which is what oideq() is doing
under the hood. Note that we don't have to catch "!oidcmp()"
separately; coccinelle's standard isomorphisms make sure the
two are treated equivalently.
I say "almost" because I did hand-edit the coccinelle output
to fix up a few style violations (it mostly keeps the
original formatting, but sometimes unwraps long lines).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should make these functions easier to find and cache.h less
overwhelming to read.
In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file
As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise. It would be better to #include wherever
identifiers from the header are used. That can happen later
when we have better tooling for it.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert read_sha1_file to take a pointer to struct object_id and rename
it read_object_file. Do the same for read_sha1_file_extended.
Convert one use in grep.c to use the new function without any other code
change, since the pointer being passed is a void pointer that is already
initialized with a pointer to struct object_id. Update the declaration
and definitions of the modified functions, and apply the following
semantic patch to convert the remaining callers:
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1.hash, E2, E3)
+ read_object_file(&E1, E2, E3)
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1->hash, E2, E3)
+ read_object_file(E1, E2, E3)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1.hash, E2, E3, E4)
+ read_object_file_extended(&E1, E2, E3, E4)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1->hash, E2, E3, E4)
+ read_object_file_extended(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert get_tree_entry and find_tree_entry to take pointers to struct
object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the definition and declaration of write_sha1_file to
struct object_id and adjust usage of this function.
This commit also converts static function write_sha1_file_prepare, as it
is closely related.
Rename these functions to write_object_file and
write_object_file_prepare respectively.
Replace sha1_to_hex, hashcpy and hashclr with their oid equivalents
wherever possible.
Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the definition and declaration of write_notes_tree to
struct object_id and adjust usage of this function.
Additionally, improve style of small part of this function, as old
formatting made it hard to understand at glance what this part of
code is doing.
Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the definition and declarations of combine_notes_* functions
to struct object_id and adjust usage of these functions.
Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the function for converting pairs of hexadecimal digits to binary
available to other call sites.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All but two of the call sites already have parameters using the hash
parameter of struct object_id, so convert them to take a pointer to the
struct directly. Also convert refs_read_refs_full, the underlying
implementation.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* mh/notes-cleanup:
load_subtree(): check that `prefix_len` is in the expected range
load_subtree(): declare some variables to be `size_t`
hex_to_bytes(): simpler replacement for `get_oid_hex_segment()`
get_oid_hex_segment(): don't pad the rest of `oid`
load_subtree(): combine some common code
get_oid_hex_segment(): return 0 on success
load_subtree(): only consider blobs to be potential notes
load_subtree(): check earlier whether an internal node is a tree entry
load_subtree(): separate logic for internal vs. terminal entries
load_subtree(): fix incorrect comment
load_subtree(): reduce the scope of some local variables
load_subtree(): remove unnecessary conditional
notes: make GET_NIBBLE macro more robust
This value, which is stashed in the last byte of an object_id hash,
gets handed around a lot. So add a sanity check before using it in
`load_subtree()`.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* `prefix_len`
* `path_len`
* `i`
It's good hygiene.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that `get_oid_hex_segment()` does less, it makes sense to rename
it and simplify its semantics:
* Instead of a `hex_len` parameter, which was the number of hex
characters (and had to be even), use a `len` parameter, which is the
number of resulting bytes. This removes then need for the check that
`hex_len` is even and to divide it by two to determine the number of
bytes. For good hygiene, declare the `len` parameter to be `size_t`
instead of `unsigned int`.
* Change the order of the arguments to the more traditional (dst,
src, len).
* Rename the function to `hex_to_bytes()`.
* Remove a loop variable: just count `len` down instead.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the feature of `get_oid_hex_segment()` that it pads the rest of
the `oid` argument with zeros. Instead, do this at the caller who
needs it.
This makes the functionality of this function more coherent and
removes the need for its `oid_len` argument.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Write the length into `object_oid` (before copying) rather than
`l->key_oid` (after copying). Then combine some code from the two `if`
blocks.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Nobody cares about the return value of get_oid_hex_segment() except to
check whether it failed. So just return 0 on success.
And while we're updating its docstring, update it for some argument
renaming that happened a while ago.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The old code converted any entry whose path constituted a full SHA-1
as a leaf node, without regard for the type of the entry. But only
blobs can be notes. So treat entries whose paths *look like* notes
paths but that are not blobs as non-notes.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If an entry is not a tree entry, then it cannot possibly be an
internal node. But the old code checked this condition only after
allocating a leaf_node object and therefore leaked that memory.
Instead, check before even entering this branch of the code.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are only two legitimate notes path components:
* A hexadecimal string that fills the rest of the SHA-1
* A two-digit hexadecimal string that constitutes another internal
node.
So handle those two cases at the top level, and reject others as
non-notes without trying to parse them. The logic separation also
simplifies upcoming changes.
This prevents us from leaking memory for a leaf_node in the case of
wrong-sized paths. There are still memory leaks in this code; they will
be fixed in upcoming commits.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This comment was added in 851c2b3791 (Teach notes code to properly
preserve non-notes in the notes tree, 2010-02-13) when the
corresponding code was added. But I believe it was incorrect even
then. The condition `path_len != 2` a dozen lines up prevents a path
like "dead/beef" from being converted to "de/ad/beef", and indeed the
test added in commit 851c2b3 verifies that this case works correctly.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Declare the variables inside the loop, to make it more obvious that
their values are not carried across loop iterations.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At this point in the code, len is *always* <= 20.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Put parentheses around sha1. Otherwise it could fail for something
like
GET_NIBBLE(n, (unsigned char *)data);
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All callers of fill_tree_descriptor() have been converted to object_id
already, so convert that function as well. As a nice side-effect we get
rid of NULL checks in tree-diff.c, as fill_tree_descriptor() already
does them for us.
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.
* bw/config-h:
config: don't implicitly use gitdir or commondir
config: respect commondir
setup: teach discover_git_directory to respect the commondir
config: don't include config.h by default
config: remove git_config_iter
config: create config.h
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert add_note, get_note, and copy_note to take struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make get_note return a pointer to a const struct object_id. Add a
defensive check to ensure we don't accidentally dereference a NULL
pointer.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert for_each_note and each of the callbacks to use struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert several portions of the internals of the code to struct
object_id. Introduce two macros to denote the different constants in
the code: KEY_INDEX for the last byte of the object ID, and
FANOUT_PATH_SEPARATORS for the number of possible path separators (on
Unix, "/"). While these constants are both 19 (one less than the number
of bytes in the hash), distinguish them to make the code more
understandable, and define them logically based on their intended
purpose.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After a note is removed, note_tree_consolidate is called to eliminate
some useless nodes. The typical case is that if you had an int_node
with 2 PTR_TYPE_NOTEs in it, and remove one of them, then the
PTR_TYPE_INTERNAL pointer in the parent tree can be replaced with the
remaining PTR_TYPE_NOTE.
This works fine when PTR_TYPE_NOTEs are involved, but falls flat when
other types are involved.
To put things in more practical terms, let's say we start from an empty
notes tree, and add 3 notes:
- one for a sha1 that starts with 424
- one for a sha1 that starts with 428
- one for a sha1 that starts with 4c
To keep track of this, note_tree.root will have a PTR_TYPE_INTERNAL at
a[4], pointing to an int_node*.
In turn, that int_node* will have a PTR_TYPE_NOTE at a[0xc], pointing to
the leaf_node* with the key and value, and a PTR_TYPE_INTERNAL at a[2],
pointing to another int_node*.
That other int_node* will have 2 PTR_TYPE_NOTE, one at a[4] and the
other at a[8].
When looking for the note for the sha1 starting with 428, get_note() will
recurse through (simplified) root.a[4].a[2].a[8].
Now, if we remove the note for the sha1 that starts with 4c, we're left
with a int_node* with only one PTR_TYPE_INTERNAL entry in it. After
note_tree_consolidate runs, root.a[4] now points to what used to be
pointed at by root.a[4].a[2].
Which means looking up for the note for the sha1 starting with 428 now
fails because there is nothing at root.a[4].a[2] anymore: there is only
root.a[4].a[4] and root.a[4].a[8], which don't match the expected
structure for the lookup.
So if all there is left in an int_node* is a PTR_TYPE_INTERNAL pointer,
we can't safely remove it. I think the same applies for PTR_TYPE_SUBTREE
pointers. IOW, only PTR_TYPE_NOTEs are safe to be moved to the parent
int_node*.
This doesn't have a practical effect on git because all that happens
after a remove_note is a write_notes_tree, which just iterates the entire
note tree, but this affects anything using libgit.a that would try to do
lookups after removing notes.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two types of string_lists: those that own the
string memory, and those that don't. You can tell the
difference by the strdup_strings flag, and one should use
either STRING_LIST_INIT_DUP, or STRING_LIST_INIT_NODUP as an
initializer.
Historically, the normal all-zeros initialization has
corresponded to the NODUP case. Many sites use no
initializer at all, and that works as a shorthand for that
case. But for a reader of the code, it can be hard to
remember which is which. Let's be more explicit and actually
have each site declare which type it means to use.
This is a fairly mechanical conversion; I assumed each site
was correct as-is, and just switched them all to NODUP.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update various codepaths to avoid manually-counted malloc().
* jk/tighten-alloc: (22 commits)
ewah: convert to REALLOC_ARRAY, etc
convert ewah/bitmap code to use xmalloc
diff_populate_gitlink: use a strbuf
transport_anonymize_url: use xstrfmt
git-compat-util: drop mempcpy compat code
sequencer: simplify memory allocation of get_message
test-path-utils: fix normalize_path_copy output buffer size
fetch-pack: simplify add_sought_entry
fast-import: simplify allocation in start_packfile
write_untracked_extension: use FLEX_ALLOC helper
prepare_{git,shell}_cmd: use argv_array
use st_add and st_mult for allocation size computation
convert trivial cases to FLEX_ARRAY macros
use xmallocz to avoid size arithmetic
convert trivial cases to ALLOC_ARRAY
convert manual allocations to argv_array
argv-array: add detach function
add helpers for allocating flex-array structs
harden REALLOC_ARRAY and xcalloc against size_t overflow
tree-diff: catch integer overflow in combine_diff_path allocation
...
Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:
1. It automatically checks the array-size multiplication
for overflow.
2. It always uses sizeof(*array) for the element-size,
so that it can never go out of sync with the declared
type of the array.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git notes merge" used to limit the source of the merged notes tree
to somewhere under refs/notes/ hierarchy, which was too limiting
when inventing a workflow to exchange notes with remote
repositories using remote-tracking notes trees (located in e.g.
refs/remote-notes/ or somesuch).
* jk/notes-merge-from-anywhere:
notes: allow merging from arbitrary references
Create a new expansion function, expand_loose_notes_ref which will first
check whether the ref can be found using get_sha1. If it can't be found
then it will fallback to using expand_notes_ref. The content of the
strbuf will not be changed if the notes ref can be located using
get_sha1. Otherwise, it may be updated as done by expand_notes_ref.
Since we now support merging from non-notes refs, remove the test case
associated with that behavior. Add a test case for merging from a
non-notes ref.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Reviewed-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
init_notes() is the main point of entry to the notes API. It ensures
that the input can be used as ref, because it needs a ref to update to
store notes tree after modifying it.
There however are many use cases where notes tree is only read, e.g.
"git log --notes=...". Any notes-shaped treeish could be used for such
purpose, but it is not allowed due to existing restriction.
Allow treeish expressions to be used in the case the notes tree is going
to be used without write "permissions". Add a flag to distinguish
whether the notes tree is intended to be used read-only, or will be
updated.
With this change, operations that use notes read-only can be fed any
notes-shaped tree-ish can be used, e.g. git log --notes=notes@{1}.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We know that a fanned-out sha1 in a notes tree cannot be
more than "aa/bb/cc/...", and we have an assert() to confirm
that. But let's factor out that length into a constant so we
can be sure it is used consistently. And even though we
assert() earlier, let's replace a strcpy with xsnprintf, so
it is clear to a reader that all cases are covered.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are loading a notes tree into our internal hash
table, we also collect any files that are clearly non-notes.
We format the name of the file into a PATH_MAX buffer, but
unlike true notes (which cannot be larger than a fanned-out
sha1 hash), these tree entries can be arbitrarily long,
overflowing our buffer.
We can fix this by switching to a strbuf. It doesn't even
cost us an extra allocation, as we can simply hand ownership
of the buffer over to the non-note struct.
This is of moderate security interest, as you might fetch
notes trees from an untrusted remote. However, we do not do
so by default, so you would have to manually fetch into the
notes namespace.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change typedef each_ref_fn to take a "const struct object_id *oid"
parameter instead of "const unsigned char *sha1".
To aid this transition, implement an adapter that can be used to wrap
old-style functions matching the old typedef, which is now called
"each_ref_sha1_fn"), and make such functions callable via the new
interface. This requires the old function and its cb_data to be
wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be
used as the cb_data for an adapter function, each_ref_fn_adapter().
This is an enormous diff, but most of it consists of simple,
mechanical changes to the sites that call any of the "for_each_ref"
family of functions. Subsequent to this change, the call sites can be
rewritten one by one to use the new interface.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git blame HEAD -- missing" failed to correctly say "HEAD" when it
tried to say "No such path 'missing' in HEAD".
* jk/blame-commit-label:
blame.c: fix garbled error message
use xstrdup_or_null to replace ternary conditionals
builtin/commit.c: use xstrdup_or_null instead of envdup
builtin/apply.c: use xstrdup_or_null instead of null_strdup
git-compat-util: add xstrdup_or_null helper
"git blame HEAD -- missing" failed to correctly say "HEAD" when it
tried to say "No such path 'missing' in HEAD".
* jk/blame-commit-label:
blame.c: fix garbled error message
use xstrdup_or_null to replace ternary conditionals
builtin/commit.c: use xstrdup_or_null instead of envdup
builtin/apply.c: use xstrdup_or_null instead of null_strdup
git-compat-util: add xstrdup_or_null helper
This replaces "x ? xstrdup(x) : NULL" with xstrdup_or_null(x).
The change is fairly mechanical, with the exception of
resolve_refdup, which can eliminate a temporary variable.
There are still a few hits grepping for "?.*xstrdup", but
these are of slightly different forms and cannot be
converted (e.g., "x ? xstrdup(x->foo) : NULL").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git remote update --prune" to drop many refs has been optimized.
* mh/simplify-repack-without-refs:
sort_string_list(): rename to string_list_sort()
prune_remote(): iterate using for_each_string_list_item()
prune_remote(): rename local variable
repack_without_refs(): make the refnames argument a string_list
prune_remote(): sort delete_refs_list references en masse
prune_remote(): initialize both delete_refs lists in a single loop
prune_remote(): exit early if there are no stale references
The new name is more consistent with the names of other
string_list-related functions.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user has gone through the trouble of explicitly adding an empty
note, then "git log" should not silently skip it (as if it didn't exist).
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
xcalloc() takes two arguments: the number of elements and their size.
notes.c includes several calls to xcalloc() that pass the arguments in
reverse order.
Rearrange them so they are in the correct order.
Signed-off-by: Brian Gesiak <modocache@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since string_list_add_one_ref() adds refname to the string list, but
the lifetime of refname is limited, it is important that the
string_list passed to string_list_add_one_ref() has strdup_strings
set. Document this fact.
All current callers do the right thing.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Emit the notes attached to the commit in "format-patch --notes"
output after three-dashes.
* jc/prettier-pretty-note:
format-patch: add a blank line between notes and diffstat
Doc User-Manual: Patch cover letter, three dashes, and --notes
Doc format-patch: clarify --notes use case
Doc notes: Include the format-patch --notes option
Doc SubmittingPatches: Mention --notes option after "cover letter"
Documentation: decribe format-patch --notes
format-patch --notes: show notes after three-dashes
format-patch: append --signature after notes
pretty_print_commit(): do not append notes message
pretty: prepare notes message at a centralized place
format_note(): simplify API
pretty: remove reencode_commit_message()
Improve the asymptotic performance of the cat_sort_uniq notes merge
strategy.
* mh/notes-string-list:
string_list_add_refs_from_colon_sep(): use string_list_split()
notes: fix handling of colon-separated values
combine_notes_cat_sort_uniq(): sort and dedup lines all at once
Initialize sort_uniq_list using named constant
string_list: add a function string_list_remove_empty_items()
Various codepaths checked if two encoding names are the same using
ad-hoc code and some of them ended up asking iconv() to convert
between "utf8" and "UTF-8". The former is not a valid way to spell
the encoding name, but often people use it by mistake, and we
equated them in some but not all codepaths. Introduce a new helper
function to make these codepaths consistent.
* jc/same-encoding:
reencode_string(): introduce and use same_encoding()
Conflicts:
builtin/mailinfo.c
It makes for simpler code than strbuf_split().
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Acked-by: Johan Herland <johan@herland.net>
Signed-off-by: Jeff King <peff@peff.net>
The substrings output by strbuf_split() include the ':' delimiters.
When processing GIT_NOTES_DISPLAY_REF and GIT_NOTES_REWRITE_REF, strip
off the delimiter character *before* checking whether the substring is
empty rather than after, so that empty strings within the list are
also skipped.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Acked-by: Johan Herland <johan@herland.net>
Signed-off-by: Jeff King <peff@peff.net>
Instead of reading lines one by one and insertion-sorting them into a
string_list, read all of the lines, sort them, then remove duplicates.
Aside from being less code, this reduces the complexity from O(N^2) to
O(N lg N) in the total number of lines.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Acked-by: Johan Herland <johan@herland.net>
Signed-off-by: Jeff King <peff@peff.net>
Callers of reencode_string() that re-encodes a string from one
encoding to another all used ad-hoc way to bypass the case where the
input and the output encodings are the same. Some did strcmp(),
some did strcasecmp(), yet some others when converting to UTF-8 used
is_encoding_utf8().
Introduce same_encoding() helper function to make these callers use
the same logic. Notably, is_encoding_utf8() has a work-around for
common misconfiguration to use "utf8" to name UTF-8 encoding, which
does not match "UTF-8" hence strcasecmp() would not consider the
same. Make use of it in this helper function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We either stuff the notes message without modification for %N
userformat, or format it for human consumption. Using two bits
is an overkill that does not benefit anybody.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Detected by "gcc -std=iso9899:1990 ...". This patch applies against
"maint".
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is in preparation for more notes-related revision
command-line options.
The "suppress_default_notes" option is renamed to
"use_default_notes", and is now a tri-state with values less
than one indicating "not set". If the value is "not set",
then we show default refs if and only if no other refs were
given.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no need to use an extra pointer, which just ends up
leaking memory. The fact that the list is empty tells us the
same thing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is useful for other commands besides "git
notes" which want to let users refer to notes by their
shorthand name.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>