Generalize the last remnants of "sha" and "sha1" in this file and rename
it to reflect that we're not just able to handle SHA-1 these days.
We need to update one test to check for an updated error string.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a user checks out the upstream branch of HEAD, the upstream branch
not being a local branch, and then runs "git status", like this:
git clone $URL client
cd client
git checkout @{u}
git status
no status is printed, but instead an error message:
fatal: HEAD does not point to a branch
(This error message when running "git branch" persists even after
checking out other things - it only stops after checking out a branch.)
This is because "git status" reads the reflog when determining the "HEAD
detached" message, and thus attempts to DWIM "@{u}", but that doesn't
work because HEAD no longer points to a branch.
Therefore, when calculating the status of a worktree, tolerate dangling
marks. This is done by adding an additional parameter to
dwim_ref() and repo_dwim_ref().
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for a future patch adding a boolean parameter to
repo_interpret_branch_name(), which might be easily confused with an
existing unsigned int parameter, refactor repo_interpret_branch_name()
to take an option struct instead of the unsigned int parameter.
The static function interpret_branch_mark() is also updated to take the
option struct in preparation for that future patch, since it will also
make use of the to-be-introduced boolean parameter.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "refs" pointer in a struct repository starts life as NULL, but then
is lazily initialized when it is accessed via get_main_ref_store().
However, it's easy for calling code to forget this and access it
directly, leading to code which works _some_ of the time, but fails if
it is called before anybody else accesses the refs.
This was the cause of the bug fixed by 5ff4b920eb (sha1-name: do not
assume that the ref store is initialized, 2020-04-09). In order to
prevent similar bugs, let's more clearly mark the "refs" field as
private.
In addition to helping future code, the name change will help us audit
any existing direct uses. Besides get_main_ref_store() itself, it turns
out there is only one. But we know it's OK as it is on the line directly
after the fix from 5ff4b920eb, which will have initialized the pointer.
However it's still a good idea for it to model the proper use of the
accessing function, so we'll convert it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
c931ba4e (sha1-name.c: remove the_repo from handle_one_ref(),
2019-04-16) replaced the use of for_each_ref() helper, which works
with the main ref store of the default repository instance, with
refs_for_each_ref(), which can work on any ref store instance, by
assuming that the repository instance the function is given has its
ref store already initialized.
But it is possible that nobody has initialized it, in which case,
the code ends up dereferencing a NULL pointer.
Reported-by: Érico Rolim <erico.erc@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We renamed the actual data structure in 910650d2f8 (Rename sha1_array to
oid_array, 2017-03-31), but the file is still called sha1-array. Besides
being slightly confusing, it makes it more annoying to grep for leftover
occurrences of "sha1" in various files, because the header is included
in so many places.
Let's complete the transition by renaming the source and header files
(and fixing up a few comment references).
I kept the "-" in the name, as that seems to be our style; cf.
fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10).
We also have oidmap.h and oidset.h without any punctuation, but those
are "struct oidmap" and "struct oidset" in the code. We _could_ make
this "oidarray" to match, but somehow it looks uglier to me because of
the length of "array" (plus it would be a very invasive patch for little
gain).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our nth_packed_object_sha1() function returns NULL for error. So when we
wrapped it with nth_packed_object_oid(), we kept the same semantics. But
it's a bit funny, because the caller actually passes in an out
parameter, and the pointer we return is just that same struct they
passed to us (or NULL).
It's not too terrible, but it does make the interface a little
non-idiomatic. Let's switch to our usual "0 for success, negative for
error" return value. Most callers either don't check it, or are
trivially converted. The one that requires the biggest change is
actually improved, as we can ditch an extra aliased pointer variable.
Since we are changing the interface in a subtle way that the compiler
wouldn't catch, let's also change the name to catch any topics in
flight. We can drop the 'o' and make it nth_packed_object_id(). That's
slightly shorter, but also less redundant since the 'o' stands for
"object" already.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We often skip an optional prefix in a string with a hardcoded
constant, e.g.
if (starts_with(string, "prefix"))
string += 6;
which is less error prone when written
skip_prefix(string, "prefix", &string);
Note that this changes a few error messages from "git reflog expire
--expire=nonsense.timestamp", which used to complain by saying
'--expire=nonsense.timestamp' is not a valid timestamp
but with this change, we say
'nonsense.timestamp' is not a valid timestamp
which is more technically correct (the string with --expire= as
a prefix obviously cannot be a valid timestamp, but the error is
about the part of the input without that prefix).
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are several error messages in get_oid() and its children that are
clearly intended for humans, but aren't marked for translation. E.g.:
$ git show :1:foo
fatal: Path 'foo' is in the index, but not at stage 1.
Did you mean ':0:foo'?
Let's mark these for translation. While we're at it, let's switch the
style to be more like our usual error messages: start with a lowercase
letter and omit a period at the end of the line.
This does mean that multi-line messages like the one above don't have
any punctuation between the two sentences. I solved that by adding a
"hint" marker like we'd see from advise(). So the result is:
$ git show :1:foo
fatal: path 'foo' is in the index, but not at stage 1
hint: Did you mean ':0:foo'?
A few tests had to be switched to test_i18ngrep and test_i18ncmp. Since
we were touching them anyway, I also simplified the ones using i18ngrep
a bit for readability.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass the target strbuf to the callback function grab_nth_branch_switch()
by reference so that it can add the result string directly instead of
having it put the string into a temporary strbuf first. This gets rid
of an extra allocation and a string copy.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reject values that don't fit into an int, as get_parent() and
get_nth_ancestor() cannot handle them. That's better than potentially
returning a random object.
If this restriction turns out to be too tight then we can switch to a
wider data type, but we'd still have to check for overflow.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use QSORT_S instead of QSORT, which allows passing the repository
pointer to the comparison function without using a static variable.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a couple of places where 'struct repository' is already passed
around, but the_repository is still used. Use the right repo.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When get_oid() and its helpers see an index name like ":.gitmodules",
they try to load the index on demand, like:
if (repo->index->cache)
repo_read_index(repo);
However, that misses the case when "repo->index" itself is NULL; we'll
segfault in the conditional.
This never happens with the_repository; there we always point its index
field to &the_index. But a submodule repository may have a NULL index
field until somebody calls repo_read_index().
This bug is triggered by t7411, but it was hard to notice because it's
in an expect_failure block. That test was added by 2b1257e463 (t/helper:
add test-submodule-nested-repo-config, 2018-10-25). Back then we had no
easy way to access the .gitmodules blob of a submodule repo, so we
expected (and got) an error message to that effect. Later, d9b8b8f896
(submodule-config.c: use repo_get_oid for reading .gitmodules,
2019-04-16) started looking in the correct repo, which is when we
started triggering the segfault.
With this fix, the test starts passing (once we clean it up as its
comment instructs).
Note that as far as I know, this bug could not be triggered outside of
the test suite. It requires resolving an index name in a submodule, and
all of the code paths (aside from test-tool) which do that either load
the index themselves, or always pass the_repository.
Ultimately it comes from 3a7a698e93 (sha1-name.c: remove implicit
dependency on the_index, 2019-01-12), which replaced a check of
"the_index.cache" with "repo->index->cache". So even if there is another
way to trigger it, it wouldn't affect any versions before then.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-pack-index allows searching for objects across multiple
packs using one object list. The original design gains many of
these performance benefits by keeping the packs in the
multi-pack-index out of the packed_git list.
Unfortunately, this has one major drawback. If the multi-pack-index
covers thousands of packs, and a command loads many of those packs,
then we can hit the limit for open file descriptors. The
close_one_pack() method is used to limit this resource, but it
only looks at the packed_git list, and uses an LRU cache to prevent
thrashing.
Instead of complicating this close_one_pack() logic to include
direct references to the multi-pack-index, simply add the packs
opened by the multi-pack-index to the packed_git list. This
immediately solves the file-descriptor limit problem, but requires
some extra steps to avoid performance issues or other problems:
1. Create a multi_pack_index bit in the packed_git struct that is
one if and only if the pack was loaded from a multi-pack-index.
2. Skip packs with the multi_pack_index bit when doing object
lookups and abbreviations. These algorithms already check the
multi-pack-index before the packed_git struct. This has a very
small performance hit, as we need to walk more packed_git
structs. This is acceptable, since these operations run binary
search on the other packs, so this walk-and-ignore logic is
very fast by comparison.
3. When closing a multi-pack-index file, do not close its packs,
as those packs will be closed using close_all_packs(). In some
cases, such as 'git repack', we run 'close_midx()' without also
closing the packs, so we need to un-set the multi_pack_index bit
in those packs. This is necessary, and caught by running
t6501-freshen-objects.sh with GIT_TEST_MULTI_PACK_INDEX=1.
To manually test this change, I inserted trace2 logging into
close_pack_fd() and set pack_max_fds to 10, then ran 'git rev-list
--all --objects' on a copy of the Git repo with 300+ pack-files and
a multi-pack-index. The logs verified the packs are closed as
we read them beyond the file descriptor limit.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a cyclic dependency between one of these functions so they
cannot be converted one by one, so all related functions are converted
at once.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"remove" is not entirely correct. But at least the function is aware
that if the given repo is not the_repository, then $CWD and
is_inside_work_tree() means nothing.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
struct diff_filespec defines mode to be an 'unsigned short'. Several
other places in the API which we'd like to interact with using a
diff_filespec used a plain unsigned (or unsigned int). This caused
problems when taking addresses, so switch to unsigned short.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new command "git switch" is added to avoid the confusion of
one-command-do-all "git checkout" for new users. They are also helpful
to avoid ambiguation context.
For these reasons, promote it everywhere possible. This includes
documentation, suggestions/advice from other commands...
The "Checking out files" progress line in unpack-trees.c is also updated
to "Updating files" to be neutral to both git-checkout and git-switch.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is quite possible that the loose object cache gets stale when new
objects are written. In that case, get_oid() would potentially say that
it cannot find a given object, even if it should find it.
Let's blow away the loose object cache as well as the read packs and try
again in that case.
Note: this does *not* affect the code path that was introduced to help
avoid looking for the same non-existing objects (which made some
operations really expensive via NFS): that code path is handled by the
`OBJECT_INFO_QUICK` flag (which does not even apply to `get_oid()`,
which has no equivalent flag, at least at the time this patch was
written).
This incidentally fixes the problem identified earlier where an
interactive rebase wanted to re-read (and validate) the todo list after
an `exec` command modified it.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Compared to `get_oid()`, `get_oidf()` has as parameters
a pointer to `object_id`, a printf format string and
additional arguments. This will help simplify the code
in subsequent commits.
Original-idea-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_oid_with_context() function is declared to return an enum in
cache.h, but defined to return an int in sha1-name.c. The compiler
notices this on AIX and rejects the build, since d1dd94b308 (Do not
print 'dangling' for cat-file in case of ambiguity - 2019-01-17) was
merged.
Return the correct type from the implementation to fix this.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return values -1 and -2 from get_oid could mean two different
things, depending on whether they were from an enum returned by
get_tree_entry_follow_symlinks, or from a different code path. This
caused 'dangling' to be printed from a git cat-file in the case of an
ambiguous (-2) result.
Unify the results of get_oid* and get_tree_entry_follow_symlinks to be
one common type, with unambiguous values.
Signed-off-by: David Turner <novalis@novalis.org>
Reported-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This kills the_index dependency in get_oid_with_context() but for
get_oid() and friends, they still assume the_repository (which also
means the_index).
Unfortunately the widespread use of get_oid() will make it hard to
make the conversion now. We probably will add repo_get_oid() at some
point and limit the use of get_oid() in builtin/ instead of forcing
all get_oid() call sites to carry struct repository.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_index() shares the same problem as hold_locked_index(): it
assumes $GIT_DIR/index. Move all call sites to repo_read_index()
instead. read_index_preload() and read_index_unmerged() are also
killed as a consequence.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add and use a function for loading the entries of a loose object
subdirectory for a given object ID. It frees callers from deriving the
fanout key; they can use the returned oid_array reference for lookups or
forward range scans.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our object_directory struct has a loose objects cache that all users of
the struct can see. But the only one that knows how to load the cache is
find_short_object_filename(). Let's extract that logic in to a reusable
function.
While we're at it, let's also reset the cache when we re-read the object
directories. This shouldn't have an impact on performance, as re-reads
are meant to be rare (and are already expensive, so we avoid them with
things like OBJECT_INFO_QUICK).
Since the cache is already meant to be an approximation, it's tempting
to skip even this bit of safety. But it's necessary to allow more code
to use it. For instance, fetch-pack explicitly re-reads the object
directory after performing its fetch, and would be confused if we didn't
clear the cache.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>