Rename C++ keyword in order to bring the codebase closer to being able
to be compiled with a C++ compiler.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf_check_branch_ref() is the central place where many codepaths
see if a proposed name is suitable for the name of a branch. It was
designed to allow us to get stricter than the check_refname_format()
check used for refnames in general, and we already use it to reject
a branch whose name begins with a '-'. The function gets a strbuf
and a string "name", and returns non-zero if the name is not
appropriate as the name for a branch. When the name is good, it
places the full refname for the branch with the proposed name in the
strbuf before it returns.
However, it turns out that one caller looks at what is in the strbuf
even when the function returns an error. Make the function populate
the strbuf even when it returns an error. That way, when "-dash" is
given as name, "refs/heads/-dash" is placed in the strbuf when
returning an error to copy_or_rename_branch(), which notices that
the user is trying to recover with "git branch -m -- -dash dash" to
rename "-dash" to "dash".
While at it, use the same mechanism to also reject "HEAD" as a
branch name.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running "git check-ref-format --branch @{-1}" from outside any
repository produces
$ git check-ref-format --branch @{-1}
BUG: environment.c:182: git environment hasn't been setup
This is because the expansion of @{-1} must come from the HEAD reflog,
which involves opening the repository. @{u} and @{push} (which are
more unusual because they typically would not expand to a local
branch) trigger the same assertion.
This has been broken since day one. Before v2.13.0-rc0~48^2
(setup_git_env: avoid blind fall-back to ".git", 2016-10-02), the
breakage was more subtle: Git would read reflogs from ".git" within
the current directory even if it was not a valid repository. Usually
that is harmless because Git is not being run from the root directory
of an invalid repository, but in edge cases such accesses can be
confusing or harmful. Since v2.13.0, the problem is easier to
diagnose because Git aborts with a BUG message.
Erroring out is the right behavior: when asked to interpret a branch
name like "@{-1}", there is no reasonable answer in this context.
But we should print a message saying so instead of an assertion failure.
We do not forbid "check-ref-format --branch" from outside a repository
altogether because it is ok for a script to pre-process branch
arguments without @{...} in such a context. For example, with
pre-2.13 Git, a script that does
branch='master'; # default value
parse_options
branch=$(git check-ref-format --branch "$branch")
to normalize an optional branch name provided by the user would work
both inside a repository (where the user could provide '@{-1}') and
outside (where '@{-1}' should not be accepted).
So disable the "expand @{...}" half of the feature when run outside a
repository, but keep the check of the syntax of a proposed branch
name. This way, when run from outside a repository, "git
check-ref-format --branch @{-1}" will gracefully fail:
$ git check-ref-format --branch @{-1}
fatal: '@{-1}' is not a valid branch name
and "git check-ref-format --branch master" will succeed as before:
$ git check-ref-format --branch master
master
restoring the usual pre-2.13 behavior.
[jn: split out from a larger patch; moved conditional to
strbuf_check_branch_ref instead of its caller; fleshed out commit
message; some style tweaks in tests]
Reported-by: Marko Kungla <marko.kungla@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the callers and internals, including struct read_ref_at_cb, of
read_ref_at to use struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the callers of these functions just pass the hash member of a
struct object_id, so convert them to use a pointer to struct object_id
directly. Insert a check for NULL in expand_ref on a temporary basis;
this check can be removed when resolve_ref_unsafe is converted as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Minimize OID comparisons during disambiguation of packfile OIDs.
Teach git to use binary search with the full OID to find the object's
position (or insertion position, if not present) in the pack-index.
The object before and immediately after (or the one at the insertion
position) give the maximum common prefix. No subsequent linear search
is required.
Take care of which two to inspect, in case the object id exists in the
packfile.
If the input to find_unique_abbrev_r() is a partial prefix, then the
OID used for the binary search is padded with zeroes so the object will
not exist in the repo (with high probability) and the same logic
applies.
This commit completes a series of three changes to OID abbreviation
code, and the overall change can be seen using standard commands for
large repos. Below we report performance statistics for perf test 4211.6
from p4211-line-log.sh using three copies of the Linux repo:
| Packs | Loose | HEAD~3 | HEAD | Rel% |
|-------|--------|----------|----------|-------|
| 1 | 0 | 41.27 s | 38.93 s | -4.8% |
| 24 | 0 | 98.04 s | 91.35 s | -5.7% |
| 23 | 323952 | 117.78 s | 112.18 s | -4.8% |
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create get_hex_char_from_oid() to parse oids one hex character at a
time. This prevents unnecessary copying of hex characters in
extend_abbrev_len() when finding the length of a common prefix.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unroll the while loop inside find_unique_abbrev_r to avoid iterating
through all loose objects and packfiles multiple times when the short
name is longer than the predicted length.
Instead, inspect each object that collides with the estimated
abbreviation to find the longest common prefix.
The focus of this change is to refactor the existing method in a way
that clearly does not change the current behavior. In some cases, the
new method is slower than the previous method. Later changes will
correct all performance loss.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A common mistake when writing binary search is to allow possible
integer overflow by using the simple average:
mid = (min + max) / 2;
Instead, use the overflow-safe version:
mid = min + (max - min) / 2;
This translation is safe since the operation occurs inside a loop
conditioned on "min < max". The included changes were found using
the following git grep:
git grep '/ *2;' '*.c'
Making this cleanup will prevent future review friction when a new
binary search is contructed based on existing code.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
alloc_packed_git() in packfile.c is duplicated from sha1_file.c. In a
subsequent commit, alloc_packed_git() will be removed from sha1_file.c.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are several uses of the constant 40 in find_unique_abbrev_r.
Convert them to GIT_SHA1_HEXSZ.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the flags for get_oid_with_context and friends to use "OID"
instead of "SHA1" in their names.
This transform was made by running the following one-liner on the
affected files:
perl -pi -e 's/GET_SHA1/GET_OID/g'
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert code that divides and rounds up to use DIV_ROUND_UP to make the
intent clearer and reduce the number of magic constants.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function for_each_file_in_obj_subdir() takes a object subdirectory
number and expects the name of the same subdirectory to be included in
the path strbuf. Avoid this redundancy by letting the function append
the hexadecimal subdirectory name itself. This makes it a bit easier
and safer to use the function -- it becomes impossible to specify
different subdirectories in subdir_nr and path.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Read each loose object subdirectory at most once when looking for unique
abbreviated hashes. This speeds up commands like "git log --pretty=%h"
considerably, which previously caused one readdir(3) call for each
candidate, even for subdirectories that were visited before.
The new cache is kept until the program ends and never invalidated. The
same is already true for pack indexes. The inherent racy nature of
finding unique short hashes makes it still fit for this purpose -- a
conflicting new object may be added at any time. Tasks with higher
consistency requirements should not use it, though.
The cached object names are stored in an oid_array, which is quite
compact. The bitmap for remembering which subdir was already read is
stored as a char array, with one char per directory -- that's not quite
as compact, but really simple and incurs only an overhead equivalent to
11 hashes after all.
Suggested-by: Jeff King <peff@peff.net>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the is_missing_file_error() helper introduced in the previous
step, update all hits from
$ git grep -e ENOENT --and -e ENOTDIR
There are codepaths that only check ENOENT, and it is possible that
some of them should be checking both. Updating them is kept out of
this step deliberately, as we do not want to change behaviour in this
step.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a sha1 lookup returns the tree path via "struct
object_context", it just copies it into a fixed-size buffer.
This means the result can be truncated, and it means our
"struct object_context" consumes a lot of stack space.
Instead, let's allocate a string on the heap. Because most
callers don't care about this information, we'll avoid doing
it by default (so they don't all have to start calling
free() on the result). There are basically two options for
the caller to signal to us that it's interested:
1. By setting a pointer to storage in the object_context.
2. By passing a flag in another parameter.
Doing (1) would match the way that sha1_object_info_extended()
works. But it would mean that every caller would have to
initialize the object_context, which they don't currently
have to do.
This patch does (2), and adds a new bit to the function's
flags field. All of the callers that look at the "path"
field are updated to pass the new flag.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_sha1_with_context() function zeroes out the
oc->symlink_path strbuf, but doesn't use strbuf_init() to
set up the usual invariants (like pointing to the slopbuf).
We don't actually write to the oc->symlink_path strbuf
unless we call get_tree_entry_follow_symlinks(), and that
function does initialize it. However, readers may still look
at the zero'd strbuf.
In practice this isn't a triggerable bug. The only caller
that looks at it only does so when the mode we found is 0.
This doesn't happen for non-tree-entries (where we return
S_IFINVALID). A broken tree entry could have a mode of 0,
but canon_mode() quietly rewrites that into S_IFGITLINK.
So the "0" mode should only come up when we did indeed find
a symlink.
This is mostly just an accident of how the code happens to
work, though. Let's future-proof ourselves to make sure the
strbuf is properly initialized for all calls (it's only a
few struct member assignments, not a heap allocation).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An early version of the patch to add object_context used the
name object_resolve_context. This was later shortened to
just object_context, but the "orc" variable name stuck in a
few places. Let's use "oc", which is used elsewhere in the
code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert lookup_tag to take a pointer to struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert lookup_commit, lookup_commit_or_die,
lookup_commit_reference, and lookup_commit_reference_gently to take
struct object_id arguments.
Introduce a temporary in parse_object buffer in order to convert this
function. This is required since in order to convert parse_object and
parse_object_buffer, lookup_commit_reference_gently and
lookup_commit_or_die would need to be converted. Not introducing a
temporary would therefore require that lookup_commit_or_die take a
struct object_id *, but lookup_commit would take unsigned char *,
leaving a confusing and hard-to-use interface.
parse_object_buffer will lose this temporary in a later patch.
This commit was created with manual changes to commit.c, commit.h, and
object.c, plus the following semantic patch:
@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1.hash, E2)
+ lookup_commit_reference_gently(&E1, E2)
@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1->hash, E2)
+ lookup_commit_reference_gently(E1, E2)
@@
expression E1;
@@
- lookup_commit_reference(E1.hash)
+ lookup_commit_reference(&E1)
@@
expression E1;
@@
- lookup_commit_reference(E1->hash)
+ lookup_commit_reference(E1)
@@
expression E1;
@@
- lookup_commit(E1.hash)
+ lookup_commit(&E1)
@@
expression E1;
@@
- lookup_commit(E1->hash)
+ lookup_commit(E1)
@@
expression E1, E2;
@@
- lookup_commit_or_die(E1.hash, E2)
+ lookup_commit_or_die(&E1, E2)
@@
expression E1, E2;
@@
- lookup_commit_or_die(E1->hash, E2)
+ lookup_commit_or_die(E1, E2)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a small number of remaining callers of lookup_commit_reference
and lookup_commit_reference_gently that still need to be converted to
struct object_id. Convert these.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).
So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.
By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.
As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since this structure handles an array of object IDs, rename it to struct
oid_array. Also rename the accessor functions and the initialization
constant.
This commit was produced mechanically by providing non-Documentation
files to the following Perl one-liners:
perl -pi -E 's/struct sha1_array/struct oid_array/g'
perl -pi -E 's/\bsha1_array_/oid_array_/g'
perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g'
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make sha1_array_for_each_unique take a callback using struct object_id.
Since one of these callbacks is an argument to for_each_abbrev, convert
those as well. Rename various functions, replacing "sha1" with "oid".
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the callers to pass struct object_id by changing the function
declaration and definition and applying the following semantic patch:
@@
expression E1, E2;
@@
- sha1_array_append(E1, E2.hash)
+ sha1_array_append(E1, &E2)
@@
expression E1, E2;
@@
- sha1_array_append(E1, E2->hash)
+ sha1_array_append(E1, E2)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert this function pointer type and the functions that implement it
to take a struct object_id. Introduce a temporary in
show_ambiguous_object to avoid having to convert for_each_abbrev at this
point.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct disambiguate_state to use struct object_id by changing
the structure definition and applying the following semantic patch:
@@
struct disambiguate_state E1;
@@
- E1.bin_pfx
+ E1.bin_pfx.hash
@@
struct disambiguate_state *E1;
@@
- E1->bin_pfx
+ E1->bin_pfx.hash
@@
struct disambiguate_state E1;
@@
- E1.candidate
+ E1.candidate.hash
@@
struct disambiguate_state *E1;
@@
- E1->candidate
+ E1->candidate.hash
This conversion is needed so we can convert disambiguate_hint_fn later.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the revision parsing logic to match @{upstream}, @{u} & @{push}
case-insensitively.
Before this change supplying anything except the lower-case forms
emits an "unknown revision or path not in the working tree"
error. This change makes upper-case & mixed-case versions equivalent
to the lower-case versions.
The use-case for this is being able to hold the shift key down while
typing @{u} on certain keyboard layouts, which makes the sequence
easier to type, and reduces cases where git throws an error at the
user where it could do what he means instead.
These suffixes now join various other suffixes & special syntax
documented in gitrevisions(7) that matches case-insensitively. A table
showing the status of the various forms documented there before &
after this patch is shown below. The key for the table is:
- CI = Case Insensitive
- CIP = Case Insensitive Possible (without ambiguities)
- AG = Accepts Garbage (.e.g. @{./.4.minutes./.})
Before this change:
|----------------+-----+------+-----|
| What? | CI? | CIP? | AG? |
|----------------+-----+------+-----|
| @{<date>} | Y | Y | Y |
| @{upstream} | N | Y | N |
| @{push} | N | Y | N |
|----------------+-----+------+-----|
After it:
|----------------+-----+------+-----|
| What? | CI? | CIP? | AG? |
|----------------+-----+------+-----|
| @{<date>} | Y | Y | Y |
| @{upstream} | Y | Y | N |
| @{push} | Y | Y | N |
|----------------+-----+------+-----|
The ^{<type>} suffix is not made case-insensitive, because other
places that take <type> like "cat-file -t <type>" do want them case
sensitively (after all we never declared that type names are case
insensitive). Allowing case-insensitive typename only with this syntax
will make the resulting Git as a whole inconsistent.
This change was independently authored to scratch a longtime itch, but
when I was about to submit it I discovered that a similar patch had
been submitted unsuccessfully before by Conrad Irwin in August 2011 as
"rev-parse: Allow @{U} as a synonym for
@{u}" (<1313287071-7851-1-git-send-email-conrad.irwin@gmail.com>).
The tests for this patch are more exhaustive than in the 2011
submission. The starting point for them was to first change the code
to only support upper-case versions of the existing words, seeing what
broke, and amending the breaking tests to check upper case & mixed
case as appropriate, and where not redundant to other similar
tests. The implementation itself is equivalent.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we will likely be introducing a new hash function at some point,
and that hash function might be longer than 20 bytes, use the constant
GIT_MAX_RAWSZ, which is designed to be suitable for allocations, instead
of GIT_SHA1_RAWSZ. This will ease the transition down the line by
distinguishing between places where we need to allocate memory suitable
for the largest hash from those where we need to handle the current
hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we will likely be introducing a new hash function at some point,
and that hash function might be longer than 40 hex characters, use the
constant GIT_MAX_HEXSZ, which is designed to be suitable for
allocations, instead of GIT_SHA1_HEXSZ. This will ease the transition
down the line by distinguishing between places where we need to allocate
memory suitable for the largest hash from those where we need to handle
the current hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function asks strbuf_branchname() to expand any @-marks
in the branchname, and then we blindly stick refs/heads/ in
front of the result. This is obviously nonsense if the
expansion is "HEAD" or a ref in refs/remotes/.
The most obvious end-user effect is that creating or
renaming a branch with an expansion may have confusing
results (e.g., creating refs/heads/origin/master from
"@{upstream}" when the operation should be disallowed).
We can fix this by telling strbuf_branchname() that we are
only interested in local expansions. Any unexpanded bits are
then fed to check_ref_format(), which either disallows them
(in the case of "@{upstream}") or lets them through
("refs/heads/@" is technically valid, if a bit silly).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interpret_branch_name() function converts names like
@{-1} and @{upstream} into branch names. The expanded ref
names are not fully qualified, and may be outside of the
refs/heads/ namespace (e.g., "@" expands to "HEAD", and
"@{upstream}" is likely to be in "refs/remotes/").
This is OK for callers like dwim_ref() which are primarily
interested in resolving the resulting name, no matter where
it is. But callers like "git branch" treat the result as a
branch name in refs/heads/. When we expand to a ref outside
that namespace, the results are very confusing (e.g., "git
branch @" tries to create refs/heads/HEAD, which is
nonsense).
Callers can't know from the returned string how the
expansion happened (e.g., did the user really ask for a
branch named "HEAD", or did we do a bogus expansion?). One
fix would be to return some out-parameters describing the
types of expansion that occurred. This has the benefit that
the caller can generate precise error messages ("I
understood @{upstream} to mean origin/master, but that is a
remote tracking branch, so you cannot create it as a local
name").
However, out-parameters make the function interface somewhat
cumbersome. Instead, let's do the opposite: let the caller
tell us which elements to expand. That's easier to pass in,
and none of the callers give more precise error messages
than "@{upstream} isn't a valid branch name" anyway (which
should be sufficient).
The strbuf_branchname() function needs a similar parameter,
as most of the callers access interpret_branch_name()
through it.
We can break the callers down into two groups:
1. Callers that are happy with any kind of ref in the
result. We pass "0" here, so they continue to work
without restrictions. This includes merge_name(),
the reflog handling in add_pending_object_with_path(),
and substitute_branch_name(). This last is what powers
dwim_ref().
2. Callers that have funny corner cases (mostly in
git-branch and git-checkout). These need to make use of
the new parameter, but I've left them as "0" in this
patch, and will address them individually in follow-on
patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return value from strbuf_branchname() is confusing and
useless: it's 0 if the whole name was consumed by an @-mark,
but otherwise is the length of the original name we fed.
No callers actually look at the return value, so let's just
get rid of it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We generally put docstrings with function declarations,
because it's the callers who need to know how the function
works. Let's do so for interpret_branch_name().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interpret_branch_name() function takes a ptr/len pair
for the name, but you can pass "0" for "namelen", which will
cause it to check the length with strlen().
However, before we do that auto-namelen magic, we call
interpret_nth_prior_checkout(), which gets fed the bogus
"0". This was broken by 8cd4249c4 (interpret_branch_name:
always respect "namelen" parameter, 2014-01-15). Though to
be fair to that commit, it was broken in the _opposite_
direction before, where we would always treat "name" as a
string even if a length was passed.
You can see the bug with "git log -g @{-1}". That code path
always passes "0", and without this patch it cannot figure
out which branch's reflog to show.
We can fix it by a small reordering of the code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make each_reflog_ent_fn take two struct object_id pointers instead of
two pointers to unsigned char. Convert the various callbacks to use
struct object_id as well. Also, rename fsck_handle_reflog_sha1 to
fsck_handle_reflog_oid.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Overflow is defined for unsigned integers, but not for signed ones.
Wrap around explicitly for the new ring-buffer in find_unique_abbrev()
as we did in bb84735c for the ones in sha1_to_hex() and get_pathname(),
thus avoiding signed overflows and getting rid of the magic number 3.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some code paths want to format multiple abbreviated sha1s in
the same output line. Because we use a single static buffer
for our return value, they have to either break their output
into several calls or allocate their own arrays and use
find_unique_abbrev_r().
Intead, let's mimic sha1_to_hex() and use a ring of several
buffers, so that the return value stays valid through
multiple calls. This shortens some of the callers, and makes
it harder to for them to make a silly mistake.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We pre-size the scratch buffer to hold a loose object
filename of the form "xx/yyyy...", which leads to allocation
code that is hard to verify. We have to use some magic
numbers during the initial allocation, and then writers must
blindly assume that the buffer is big enough. Using a strbuf
makes it more clear that we cannot overflow.
Unfortunately, we do still need some magic numbers to grow
our strbuf before calling fill_sha1_path(), but the strbuf
growth is much closer to the point of use. This makes it
easier to see that it's correct, and opens the possibility
of pushing it even further down if fill_sha1_path() learns
to work on strbufs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The alternate_object_database struct uses a single buffer
both for storing the path to the alternate, and as a scratch
buffer for forming object names. This is efficient (since
otherwise we'd end up storing the path twice), but it makes
life hard for callers who just want to know the path to the
alternate. They have to remember to stop reading after
"alt->name - alt->base" bytes, and to subtract one for the
trailing '/'.
It would be much simpler if they could simply access a
NUL-terminated path string. We could encapsulate this in a
function which puts a NUL in the scratch buffer and returns
the string, but that opens up questions about the lifetime
of the result. The first time another caller uses the
alternate, the scratch buffer may get other data tacked onto
it.
Let's instead just store the root path separately from the
scratch buffer. There aren't enough alternates being stored
for the duplicated data to matter for performance, and this
keeps things simple and safe for the callers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allocating a struct alternate_object_database is tricky, as
we must over-allocate the buffer to provide scratch space,
and then put in particular '/' and NUL markers.
Let's encapsulate this in a function so that the complexity
doesn't leak into callers (and so that we can modify it
later).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_short_sha1() is only about reading short sha1s; we
do call it in a loop to check "is this long enough" for each
object, but otherwise it should not need to know about
things like our default_abbrev setting.
So instead of asking it to set default_automatic_abbrev as a
side-effect, let's just have find_unique_abbrev() pick the
right place to start its loop. This requires a separate
approximate_object_count() function, but that naturally
belongs with the rest of sha1_file.c.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In fairly early days we somehow decided to abbreviate object names
down to 7-hexdigits, but as projects grow, it is becoming more and
more likely to see such a short object names made in earlier days
and recorded in the log messages no longer unique.
Currently the Linux kernel project needs 11 to 12 hexdigits, while
Git itself needs 10 hexdigits to uniquely identify the objects they
have, while many smaller projects may still be fine with the
original 7-hexdigit default. One-size does not fit all projects.
Introduce a mechanism, where we estimate the number of objects in
the repository upon the first request to abbreviate an object name
with the default setting and come up with a sane default for the
repository. Based on the expectation that we would see collision in
a repository with 2^(2N) objects when using object names shortened
to first N bits, use sufficient number of hexdigits to cover the
number of objects in the repository. Each hexdigit (4-bits) we add
to the shortened name allows us to have four times (2-bits) as many
objects in the repository.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we find ambiguous short sha1s, we may get a
disambiguation rule from our caller's context. But if we
don't, we fall back to treating all sha1s the same, even
though most projects will tend to refer only to commits by
their short sha1s.
This patch introduces a configuration option that lets the
user pick a different fallback (e.g., only commits). It's
possible that we may want to make this the default, but it's
a good idea to start as a config option for two reasons:
1. It lets people experiment with this and see if it's a
good idea (i.e., the "tend to" above is an assumption;
we don't really know if this will break some obscure
cases).
2. Even if we do flip the default, it gives people an
escape hatch if it causes problems (you can sometimes
override it by asking for "1234^{tree}", but not all
combinations are possible).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>