kernel/git - git - PowerEL Git System

Commit Graph

Author	SHA1	Message	Date
Junio C Hamano	a271b05066	Merge branch 'ps/cat-file-filter-batch' "git cat-file --batch" and friends learned to allow "--filter=" to omit certain objects, just like the transport layer does. * ps/cat-file-filter-batch: builtin/cat-file: use bitmaps to efficiently filter by object type builtin/cat-file: deduplicate logic to iterate over all objects pack-bitmap: introduce function to check whether a pack is bitmapped pack-bitmap: add function to iterate over filtered bitmapped objects pack-bitmap: allow passing payloads to `show_reachable_fn()` builtin/cat-file: support "object:type=" objects filter builtin/cat-file: support "blob:limit=" objects filter builtin/cat-file: support "blob:none" objects filter builtin/cat-file: wire up an option to filter objects builtin/cat-file: introduce function to report object status builtin/cat-file: rename variable that tracks usage	2025-04-16 13:54:21 -07:00
Junio C Hamano	9bdd7ecf7e	Merge branch 'ps/test-wo-perl-prereq' "make test" used to have a hard dependency on (basic) Perl; tests have been rewritten help environment with NO_PERL test the build as much as possible. * ps/test-wo-perl-prereq: t5703: refactor test to not depend on Perl t5316: refactor `max_chain()` to not depend on Perl t0210: refactor trace2 scrubbing to not use Perl t0021: refactor `generate_random_characters()` to not depend on Perl t/lib-httpd: refactor "one-time-perl" CGI script to not depend on Perl t/lib-t6000: refactor `name_from_description()` to not depend on Perl t/lib-gpg: refactor `sanitize_pgp()` to not depend on Perl t: refactor tests depending on Perl for textconv scripts t: refactor tests depending on Perl to print data t: refactor tests depending on Perl substitution operator t: refactor tests depending on Perl transliteration operator Makefile: stop requiring Perl when running tests meson: stop requiring Perl when tests are enabled t: adapt existing PERL prerequisites t: introduce PERL_TEST_HELPERS prerequisite t: adapt `test_readlink()` to not use Perl t: adapt `test_copy_bytes()` to not use Perl t: adapt character translation helpers to not use Perl t: refactor environment sanitization to not use Perl t: skip chain lint when PERL_PATH is unset	2025-04-16 13:54:20 -07:00
Junio C Hamano	c39e5cbaa5	Merge branch 'jk/zlib-inflate-fixes' Fix our use of zlib corner cases. * jk/zlib-inflate-fixes: unpack_loose_rest(): rewrite return handling for clarity unpack_loose_rest(): simplify error handling unpack_loose_rest(): never clean up zstream unpack_loose_rest(): avoid numeric comparison of zlib status unpack_loose_header(): avoid numeric comparison of zlib status git_inflate(): skip zlib_post_call() sanity check on Z_NEED_DICT unpack_loose_header(): fix infinite loop on broken zlib input unpack_loose_header(): report headers without NUL as "bad" unpack_loose_header(): simplify next_out assignment loose_object_info(): BUG() on inflating content with unknown type	2025-04-15 13:50:14 -07:00
Patrick Steinhardt	64b3eee038	t: adapt existing PERL prerequisites A couple of our tests depend on the PERL prerequisite even though it isn't needed. These tests fall into one of the following classes: - The underlying logic used to be implemented in Perl but isn't anymore. Here we can simply drop the dependency altogether. - The test logic used to depend on Perl but doesn't anymore. Again, we can simply drop the dependency. - The test logic still relies on a Perl interpreter. These tests should use the newly introduced PERL_TEST_HELPERS prerequisite. Adapt test cases accordingly. Note that in t1006 we have to introduce another new prerequisite depending on whether or not the IPC::Open2 module is available. Funny enough, when starting to use `test_lazy_prereq` to do so we also get a conflict of variables with the "script" variable that contains the Perl logic because `test_run_lazy_prereq_` also sets that variable. We thus rename the variable in t1006 to "perl_script". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-07 14:47:38 -07:00
Patrick Steinhardt	23e21a58d5	t: introduce PERL_TEST_HELPERS prerequisite In the early days of Git, Perl was used quite prominently throughout the project. This has changed significantly as almost all of the executables we ship nowadays have eventually been rewritten in C. Only a handful of subsystems remain that require Perl: - gitweb, a read-only web interface. - A couple of scripts that allow importing repositories from GNU Arch, CVS and Subversion. - git-send-email(1), which can be used to send mails. - git-request-pull(1), which is used to request somebody to pull from a URL by sending an email. - git-filter-branch(1), which uses Perl with the `--state-branch` option. This command is typically recommended against nowadays in favor of git-filter-repo(1). - Our Perl bindings for Git. - The netrc Git credential helper. None of these subsystems can really be considered to be part of the "core" of Git, and an installation without them is fully functional. It is more likely than not that an end user wouldn't even notice that any features are missing if those tools weren't installed. But while Perl nowadays very much is an optional dependency of Git, there is a significant limitation when Perl isn't available: developers cannot run our test suite. Preceding commits have started to lift this restriction by removing the strict dependency on Perl in many central parts of the test library. But there are still many tests that rely on small Perl helpers to do various different things. Introduce a new PERL_TEST_HELPERS prerequisite that guards all tests that require Perl. This prerequisite is explicitly different than the preexisting PERL prerequisite: - PERL records whether or not features depending on the Perl interpreter are built. - PERL_TEST_HELPERS records whether or not a Perl interpreter is available for our tests. By having these two separate prerequisites we can thus distinguish between tests that inherently depend on Perl because the underlying feature does, and those tests that depend on Perl because the test itself is using Perl. Adapt all tests to set the PERL_TEST_HELPERS prerequisite as needed. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-07 14:47:37 -07:00
Patrick Steinhardt	8fa9fe171a	builtin/cat-file: support "object:type=" objects filter Implement support for the "object:type=" filter in git-cat-file(1), which causes us to omit all objects that don't match the provided object type. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-07 14:43:51 -07:00
Patrick Steinhardt	dbe1b32d59	builtin/cat-file: support "blob:limit=" objects filter Implement support for the "blob:limit=" filter in git-cat-file(1), which causes us to omit all blobs that are bigger than a certain size. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-07 14:43:50 -07:00
Patrick Steinhardt	3794e9bf98	builtin/cat-file: support "blob:none" objects filter Implement support for the "blob:none" filter in git-cat-file(1), which causes us to omit all blobs. Note that this new filter requires us to read the object type via `oid_object_info_extended()` in `batch_object_write()`. But as we try to optimize away reading objects from the database the `data->info.typep` pointer may not be set. We thus have to adapt the logic to conditionally set the pointer in cases where the filter is given. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-07 14:43:50 -07:00
Patrick Steinhardt	eb83e4c64b	builtin/cat-file: wire up an option to filter objects In batch mode, git-cat-file(1) enumerates all objects and prints them by iterating through both loose and packed objects. This works without considering their reachability at all, and consequently most options to filter objects as they exist in e.g. git-rev-list(1) are not applicable. In some situations it may still be useful though to filter objects based on properties that are inherent to them. This includes the object size as well as its type. Such a filter already exists in git-rev-list(1) with the `--filter=` command line option. While this option supports a couple of filters that are not applicable to our usecase, some of them are quite a neat fit. Wire up the filter as an option for git-cat-file(1). This allows us to reuse the same syntax as in git-rev-list(1) so that we don't have to reinvent the wheel. For now, we die when any of the filter options has been passed by the user, but they will be wired up in subsequent commits. Further note that the filters that we are about to introduce don't significantly speed up the runtime of git-cat-file(1). While we can skip emitting a lot of objects in case they are uninteresting to us, the majority of time is spent reading the packfile, which is bottlenecked by I/O and not the processor. This will change though once we start to make use of bitmaps, which will allow us to skip reading the whole packfile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-07 14:43:50 -07:00
Jeff King	67a6b1aeb8	unpack_loose_header(): avoid numeric comparison of zlib status When unpacking a loose header, we try to inflate the first 32 bytes. We'd expect either Z_OK (we filled up the output buffer, but there are more bytes in the object) or Z_STREAM_END (this is a tiny object whose header and content fit in the buffer). We check for that with "if (status < Z_OK)", making the assumption that all of the errors we'd see have negative values (as Z_OK itself is "0", and Z_STREAM_END is "1"). But there's at least one case this misses: Z_NEED_DICT is "2". This isn't something we'd ever expect to see, but if we do see it, we should consider it an error (since we have no dictionary to load). Instead, the current code interprets Z_NEED_DICT as success and looks for the object header's terminating NUL in the bytes we've read. This will generaly be zero bytes if the dictionary is mentioned at the start of the stream. So we'll fail to find it and complain "the header is too long" (ULHR_LONG). But really, the problem is that the object is malformed, and we should return ULHR_BAD. This is a minor bug, as we consider both cases to be an error. But it does mean we print the wrong error message. The test case added in the previous patch triggers this code, so we can just confirm the error message we see here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-25 10:24:55 -08:00
Jeff King	0b1493c2d4	git_inflate(): skip zlib_post_call() sanity check on Z_NEED_DICT This fixes a case where malformed object input can cause us to hit a BUG() call in the git-zlib.c code. The zlib format allows the use of preset dictionaries to reduce the size of deflated data. The checksum of the dictionary is computed by the deflate code and goes into the stream. On the inflating side, zlib sees the dictionary checksum and returns Z_NEED_DICT, asking the caller to provide the dictionary data via inflateSetDictionary(). This should never happen in Git, because we never provide a dictionary for deflating (and if we get a stream that mentions a dictionary, we have no idea how to provide it). So normally Z_NEED_DICT is a hard error for us. But something interesting happens if we _do_ happen to see it (e.g., because of a corrupt or malicious input). In git_inflate() as we loop over calls to zlib's inflate(), we translate between our large-integer git_zstream values and zlib's native z_stream types, copying in and out with zlib_pre_call() and zlib_post_call(). In zlib_post_call() we have a few sanity checks, including one that checks that the number of bytes consumed by zlib (as measured by it moving the "next_in" pointer) is equal to the movement of its "total_in" count. But these do not correspond when we see Z_NEED_DICT! Zlib consumes the bytes from the input buffer but it does not increment total_in. And so we hit the BUG("total_in mismatch") call. There are a few options here: - We could ditch that BUG() check. It is making too many assumptions about how zlib updates these values. But it does have value in most cases as a sanity check on the values we're copying. - We could skip the zlib_post_call() entirely when we see Z_NEED_DICT. We know that it's hard error for us, so we should just send the status up the stack and let the caller bail. The downside is that if we ever did want to support dictionaries, we couldn't (the git_zstream will be out of sync, since we never copied its values back from the z_stream). - We could continue to call zlib_post_call(), but skip just that BUG() check if the status is Z_NEED_DICT. This keeps git_inflate() as a thin wrapper around inflate(), and would let us later support dictionaries for some calls if we wanted to. This patch uses the third approach. It seems like the least-surprising thing to keep git_inflate() a close to inflate() as possible. And while it makes the diff a bit larger (since we have to pass the status down to to the zlib_post_call() function), it's a static local function, and every caller by definition will have just made a zlib call (and so will have a status integer). Co-authored-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-25 10:24:55 -08:00
Jeff King	b748ddb7a4	unpack_loose_header(): fix infinite loop on broken zlib input When reading a loose object, we first try to expand the first 32 bytes to read the type+size header. This is enough for any of the normal Git types. But since `46f034483e` (sha1_file: support reading from a loose object of unknown type, 2015-05-03), the caller can also ask us to parse any unknown names, which can be much longer. In this case we keep inflating until we find the NUL at the end of the header, or hit Z_STREAM_END. But what if zlib can't make forward progress? For example, if the loose object file is truncated, we'll have no more data to feed it. It will return Z_BUF_ERROR, and we'll just loop infinitely, calling git_inflate() over and over but never seeing new bytes nor an end-of-stream marker. We can fix this by only looping when we think we can make forward progress. This will always be Z_OK in this case. In other code we might also be able to continue on Z_BUF_ERROR, but: - We will never see Z_BUF_ERROR because the output buffer is full; we always feed a fresh 32-byte buffer on each call to git_inflate(). - We may see Z_BUF_ERROR if we run out of input. But since we've fed the whole mmap'd buffer to zlib, if it runs out of input there is nothing more we can do. So if we don't see Z_OK (and didn't see the end-of-header NUL, otherwise we'd have broken out of the loop), then we should stop looping and return an error. The test case shows an example where the input is truncated (which gives us the input Z_BUF_ERROR case above). Although we do operate on objects we might get from an untrusted remote, I don't think the security implications of this bug are too great. It can only trigger if both of these are true: - You're reading a loose object whose on-disk representation was written by an attacker. So fetching an object (or receiving a push) are mostly OK, because even with unpack-objects it is our local, trusted code that writes out the object file. The exception may be fetching from an untrusted local repo, or using dumb-http, which copies objects verbatim. But... - The only code path which triggers the inflate loop is cat-file's --allow-unknown-type option. This is unlikely to be called at all outside of debugging. But I also suspect that objects with non-standard types (or that are truncated) would not survive the usual fetch/receive checks in the first place. So I think it would be quite hard to trick somebody into running the infinite loop, and we can just fix the bug. Co-authored-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-25 10:24:55 -08:00
Elijah Newren	71e19a0031	object-name: fix resolution of object names containing curly braces Given a branch name of 'foo{bar', commands like git cat-file -p foo{bar:README.md should succeed (assuming that branch had a README.md file, of course). However, the change in `cce91a2cae` (Change 'master@noon' syntax to 'master@{noon}'., 2006-05-19) presumed that curly braces would always come after an '@' or '^' and be paired, causing e.g. 'foo{bar:README.md' to entirely miss the ':' and assume there's no object being referenced. In short, git would report: fatal: Not a valid object name foo{bar:README.md Change the parsing to only make the assumption of paired curly braces immediately after either a '@' or '^' character appears. Add tests for this, as well as for a few other test cases that initial versions of this patch broke: * 'foo@@{...}' * 'foo^{/${SEARCH_TEXT_WITH_COLON}}:${PATH}' Note that we'd prefer not duplicating the special logic for "@^" characters here, because if get_oid_basic() or interpret_nth_prior_checkout() or get_oid_basic() or similar gain extra methods of using curly braces, then the logic in get_oid_with_context_1() would need to be updated as well. But it's not clear how to refactor all of these to have a simple common callpoint with the specialized logic. Reported-by: Gabriel Amaral <gabriel-amaral@github.com> Helped-by: Michael Haggerty <mhagger@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-13 11:48:28 -08:00
Patrick Steinhardt	fc1ddf42af	t: remove TEST_PASSES_SANITIZE_LEAK annotations Now that the default value for TEST_PASSES_SANITIZE_LEAK is `true` there is no longer a need to have that variable declared in all of our tests. Drop it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-11-21 08:23:48 +09:00
Patrick Steinhardt	9ddd5f755d	object-name: fix leaking symlink paths in object context The object context may be populated with symlink contents when reading a symlink, but the associated strbuf doesn't ever get released when releasing the object context, causing a memory leak. Plug it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-14 10:07:57 -07:00
Eric Wong	75daa42ddf	t1006: ensure cat-file info isn't buffered by default While working on buffering changes to `git cat-file' in a separate patch, I inadvertently made the output of --batch-check and the `info' command of --batch-command buffered as if opt->buffer_output is turned on by default. Buffering by default breaks some 3rd-party Perl scripts using cat-file, but this breakage was not detected anywhere in our test suite. Add a small Perl snippet to test this problem since (AFAIK) other equivalent ways to test this behavior from Bourne shell and/or awk would require racy sleeps, non-portable FIFOs or tedious C code. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-20 10:28:46 -07:00
Junio C Hamano	1002f28a52	Merge branch 'eb/hash-transition' Work to support a repository that work with both SHA-1 and SHA-256 hash algorithms has started. * eb/hash-transition: (30 commits) t1016-compatObjectFormat: add tests to verify the conversion between objects t1006: test oid compatibility with cat-file t1006: rename sha1 to oid test-lib: compute the compatibility hash so tests may use it builtin/ls-tree: let the oid determine the output algorithm object-file: handle compat objects in check_object_signature tree-walk: init_tree_desc take an oid to get the hash algorithm builtin/cat-file: let the oid determine the output algorithm rev-parse: add an --output-object-format parameter repository: implement extensions.compatObjectFormat object-file: update object_info_extended to reencode objects object-file-convert: convert commits that embed signed tags object-file-convert: convert commit objects when writing object-file-convert: don't leak when converting tag objects object-file-convert: convert tag objects when writing object-file-convert: add a function to convert trees between algorithms object: factor out parse_mode out of fast-import and tree-walk into in object.h cache: add a function to read an OID of a specific algorithm tag: sign both hashes commit: export add_header_signature to support handling signatures on tags ...	2024-03-28 14:13:50 -07:00
Junio C Hamano	0ebbaa07d0	Merge branch 'jk/t1006-cat-file-objectsize-disk' Test update. * jk/t1006-cat-file-objectsize-disk: t1006: prefer shell loop to awk for packed object sizes t1006: add tests for %(objectsize:disk)	2024-01-12 16:09:56 -08:00
René Scharfe	54d8a2531b	t1006: prefer shell loop to awk for packed object sizes To compute the expected on-disk size of packed objects, we sort the output of show-index by pack offset and then compute the difference between adjacent entries using awk. This works but has a few readability problems: 1. Reading the index in pack order means don't find out the size of an oid's entry until we see the _next_ entry. So we have to save it to print later. We can instead iterate in reverse order, so we compute each oid's size as we see it. 2. Since the awk invocation is inside a text_expect block, we can't easily use single-quotes to hold the script. So we use double-quotes, but then have to escape the dollar signs in the awk script. We can swap this out for a shell loop instead (which is made much easier by the first change). Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-01-03 09:26:53 -08:00
Jeff King	f546151228	t1006: add tests for %(objectsize:disk) Back when we added this placeholder in `a4ac106178` (cat-file: add %(objectsize:disk) format atom, 2013-07-10), there were no tests, claiming "[...]the exact numbers returned are volatile and subject to zlib and packing decisions". But we can use a little shell hackery to get the expected numbers ourselves. To a certain degree this is just re-implementing what Git is doing under the hood, but it is still worth doing. It makes sure we exercise the %(objectsize:disk) code at all, and having the two implementations agree gives us more confidence. Note that our shell code assumes that no object appears twice (either in two packs, or as both loose and packed), as then the results really are undefined. That's OK for our purposes, and the test will notice if that assumption is violated (the shell version would produce duplicate lines that Git's output does not have). Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-21 10:37:46 -08:00
René Scharfe	7854bf4960	i18n: factorize even more 'incompatible options' messages Continue the work of `12909b6b8a` (i18n: turn "options are incompatible" into "cannot be used together", 2022-01-05) and `a699367bb8` (i18n: factorize more 'incompatible options' messages, 2022-01-31) to use the same parameterized error message for reporting incompatible command line options. This reduces the number of strings to translate and makes the UI slightly more consistent. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-11-27 10:01:45 +09:00
Eric W. Biederman	3afa8d86ac	t1006: test oid compatibility with cat-file Update the existing tests that are oid based to test that cat-file works correctly with the normal oid and the compat_oid. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-10-02 14:57:40 -07:00
Eric W. Biederman	baab175c1d	t1006: rename sha1 to oid Before I extend this test, changing the naming of the relevant hash from sha1 to oid. Calling the hash sha1 is incorrect today as it can be either sha1 or sha256 depending on the value of GIT_DEFAULT_HASH_FUNCTION when the test is called. I plan to test sha1 and sha256 simultaneously in the same repository. Having a name like sha1 will be even more confusing. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-10-02 14:57:40 -07:00
Patrick Steinhardt	f79e18849b	cat-file: add option '-Z' that delimits input and output with NUL In `db9d67f2e9` (builtin/cat-file.c: support NUL-delimited input with `-z`, 2022-07-22), we have introduced a new mode to read the input via NUL-delimited records instead of newline-delimited records. This allows the user to query for revisions that have newlines in their path component. While unusual, such queries are perfectly valid and thus it is clear that we should be able to support them properly. Unfortunately, the commit only changed the input to be NUL-delimited, but didn't change the output at the same time. While this is fine for queries that are processed successfully, it is less so for queries that aren't. In the case of missing commits for example the result can become entirely unparsable: ``` $ printf "7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10\n1234567890\n\n\commit000" \| git cat-file --batch -z `7ce4f05bae` blob 10 1234567890 commit missing ``` This is of course a crafted query that is intentionally gaming the deficiency, but more benign queries that contain newlines would have similar problems. Ideally, we should have also changed the output to be NUL-delimited when `-z` is specified to avoid this problem. As the input is NUL-delimited, it is clear that the output in this case cannot ever contain NUL characters by itself. Furthermore, Git does not allow NUL characters in revisions anyway, further stressing the point that using NUL-delimited output is safe. The only exception is of course the object data itself, but as git-cat-file(1) prints the size of the object data clients should read until that specified size has been consumed. But even though `-z` has only been introduced a few releases ago in Git v2.38.0, changing the output format retroactively to also NUL-delimit output would be a backwards incompatible change. And while one could make the argument that the output is inherently broken already, we need to assume that there are existing users out there that use it just fine given that revisions containing newlines are quite exotic. Instead, introduce a new option `-Z` that switches to NUL-delimited input and output. While this new option could arguably only switch the output format to be NUL-delimited, the consequence would be that users have to always specify both `-z` and `-Z` when the input may contain newlines. On the other hand, if the user knows that there never will be newlines in the input, they don't have to use either of those options. There is thus no usecase that would warrant treating input and output format separately, which is why we instead opt to "do the right thing" and have `-Z` mean to NUL-terminate both formats. The old `-z` option is marked as deprecated with a hint that its output may become unparsable. It is thus hidden both from the synopsis as well as the command's help output. Co-authored-by: Toon Claes <toon@iotcl.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 13:23:46 -07:00
Patrick Steinhardt	b116c77307	t1006: modernize test style to use `test_cmp` The tests for git-cat-file(1) are quite old and haven't ever been updated since they were introduced. They thus tend to use old idioms that have since grown outdated. Most importantly, many of the tests use `test $A = $B` to compare expected and actual output. This has the downside that it is impossible to tell what exactly is different between both versions in case the test fails. Refactor the tests to instead use `test_cmp`. While more verbose, it both tends to be more readable and will result in a nice diff in case states don't match. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 13:23:24 -07:00
Patrick Steinhardt	c7309f63c6	t1006: don't strip timestamps from expected results In t1006 we have a bunch of tests that verify the output format of the git-cat-file(1) command. But while part of the output for some tests would include commit timestamps, we don't verify those but instead strip them before comparing expected with actual results. This is done by the function `maybe_remove_timestamp`, which goes all the way back to the ancient commit `b335d3f121` (Add tests for git cat-file, 2008-04-23). Our tests had been in a different shape back then. Most importantly we didn't yet have the infrastructure to create objects with deterministic timestamps. Nowadays we do though, and thus there is no reason anymore to strip the timestamps. Refactor the tests to not strip the timestamp anymore. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 13:23:24 -07:00
Andrei Rybak	4e273368ce	t1006: assert error output of cat-file Test "cat-file $arg1 $arg2 error on missing full OID" in t1006-cat-file.sh compares files "expect.err" and "err.actual" to assert the expected error output of "git cat-file". A similar test in the same file named "cat-file $arg1 $arg2 error on missing short OID" also creates these two files, but doesn't use them in assertions. Assert error output of "git cat-file" in test "cat-file $arg1 $arg2 error on missing short OID" of t1006-cat-file.sh to improve test coverage. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 09:11:41 -07:00
Jeff King	61cc4be7ec	t1006: stop using 0-padded timestamps The fake objects in t1006 use dummy timestamps like "0000000000 +0000". While this does make them look more like normal timestamps (which, unless it is 1970, have many digits), it actually violates our fsck checks, which complain about zero-padded timestamps. This doesn't currently break anything, but let's future-proof our tests against a version of hash-object which is a little more careful about its input. We don't actually care about the exact values here (and in fact, the helper functions in this script end up removing the timestamps anyway, so we don't even have to adjust other parts of the tests). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-01-18 12:59:44 -08:00
Taylor Blau	db9d67f2e9	builtin/cat-file.c: support NUL-delimited input with `-z` When callers are using `cat-file` via one of the stdin-driven `--batch` modes, all input is newline-delimited. This presents a problem when callers wish to ask about, e.g. tree-entries that have a newline character present in their filename. To support this niche scenario, introduce a new `-z` mode to the `--batch`, `--batch-check`, and `--batch-command` suite of options that instructs `cat-file` to treat its input as NUL-delimited, allowing the individual commands themselves to have newlines present. The refactoring here is slightly unfortunate, since we turn loops like: while (strbuf_getline(&buf, stdin) != EOF) into: while (1) { int ret; if (opt->nul_terminated) ret = strbuf_getline_nul(&input, stdin); else ret = strbuf_getline(&input, stdin); if (ret == EOF) break; } It's tempting to think that we could use `strbuf_getwholeline()` and specify either `\n` or `\0` as the terminating character. But for input on platforms that include a CR character preceeding the LF, this wouldn't quite be the same, since `strbuf_getline(...)` will trim any trailing CR, while `strbuf_getwholeline(&buf, stdin, '\n')` will not. In the future, we could clean this up further by introducing a variant of `strbuf_getwholeline()` that addresses the aforementioned gap, but that approach felt too heavy-handed for this pair of uses. Some tests are added in t1006 to ensure that `cat-file` produces the same output in `--batch`, `--batch-check`, and `--batch-command` modes with and without the new `-z` option. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-07-22 21:42:06 -07:00
Taylor Blau	3639fefe7d	t1006: extract --batch-command inputs to variables A future commit will want to ensure that various `--batch`-related options produce the same output whether their input is newline terminated, or NUL terminated (and a to-be-implemented `-z` option exists). To prepare for this, extract the given input(s) into separate variables to that their LF characters can easily be converted into NUL bytes when testing the new `-z` mode. This is consistent with other tests in t1006 (which these days is no longer a shining example of our CodingGuidelines). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-07-22 21:42:05 -07:00
Ævar Arnfjörð Bjarmason	4627c67fa6	object-file: fix a unpack_loose_header() regression in `3b6a8db3b0` Fix a regression in my `3b6a8db3b0` (object-file.c: use "enum" return type for unpack_loose_header(), 2021-10-01) revealed both by running the test suite with --valgrind, and with the amended "git fsck" test. In practice this regression in v2.34.0 caused us to claim that we couldn't parse the header, as opposed to not being able to unpack it. Before the change in the C code the test_cmp added here would emit: -error: unable to unpack header of ./objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 +error: unable to parse header of ./objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 I.e. we'd proceed to call parse_loose_header() on the uninitialized "hdr" value, and it would have been very unlikely for that uninitialized memory to be a valid git object. The other callers of unpack_loose_header() were already checking the enum values exhaustively. See `3b6a8db3b0` and `5848fb11ac` (object-file.c: return ULHR_TOO_LONG on "header too long", 2021-10-01). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-12 15:42:26 -07:00
John Cai	440c705ea6	cat-file: add --batch-command mode Add a new flag --batch-command that accepts commands and arguments from stdin, similar to git-update-ref --stdin. At GitLab, we use a pair of long running cat-file processes when accessing object content. One for iterating over object metadata with --batch-check, and the other to grab object contents with --batch. However, if we had --batch-command, we wouldn't need to keep both processes around, and instead just have one --batch-command process where we can flip between getting object info, and getting object contents. Since we have a pair of cat-file processes per repository, this means we can get rid of roughly half of long lived git cat-file processes. Given there are many repositories being accessed at any given time, this can lead to huge savings. git cat-file --batch-command will enter an interactive command mode whereby the user can enter in commands and their arguments that get queued in memory: <command1> [arg1] [arg2] LF <command2> [arg1] [arg2] LF When --buffer mode is used, commands will be queued in memory until a flush command is issued that execute them: flush LF The reason for a flush command is that when a consumer process (A) talks to a git cat-file process (B) and interactively writes to and reads from it in --buffer mode, (A) needs to be able to control when the buffer is flushed to stdout. Currently, from (A)'s perspective, the only way is to either 1. kill (B)'s process 2. send an invalid object to stdin. 1. is not ideal from a performance perspective as it will require spawning a new cat-file process each time, and 2. is hacky and not a good long term solution. With this mechanism of queueing up commands and letting (A) issue a flush command, process (A) can control when the buffer is flushed and can guarantee it will receive all of the output when in --buffer mode. --batch-command also will not allow (B) to flush to stdout until a flush is received. This patch adds the basic structure for adding command which can be extended in the future to add more commands. It also adds the following two commands (on top of the flush command): contents <object> LF info <object> LF The contents command takes an <object> argument and prints out the object contents. The info command takes an <object> argument and prints out the object metadata. These can be used in the following way with --buffer: info <object> LF contents <object> LF contents <object> LF info <object> LF flush LF info <object> LF flush LF When used without --buffer: info <object> LF contents <object> LF contents <object> LF info <object> LF info <object> LF Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-18 11:21:46 -08:00
John Cai	4cf5d53b62	cat-file: add remove_timestamp helper maybe_remove_timestamp() takes arguments, but it would be useful to have a function that reads from stdin and strips the timestamp. This would allow tests to pipe data into a function to remove timestamps, and wouldn't have to always assign a variable. This is especially helpful when the data is multiple lines. Keep maybe_remove_timestamp() the same, but add a remove_timestamp helper that reads from stdin. The tests in the next patch will make use of this. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-18 11:21:46 -08:00
Junio C Hamano	008028a910	Merge branch 'ab/cat-file' Assorted updates to "git cat-file", especially "-h". * ab/cat-file: cat-file: s/_/-/ in typo'd usage_msg_optf() message cat-file: don't whitespace-pad "(...)" in SYNOPSIS and usage output cat-file: use GET_OID_ONLY_TO_DIE in --(textconv\|filters) object-name.c: don't have GET_OID_ONLY_TO_DIE imply *_QUIETLY cat-file: correct and improve usage information cat-file: fix remaining usage bugs cat-file: make --batch-all-objects a CMDMODE cat-file: move "usage" variable to cmd_cat_file() cat-file docs: fix SYNOPSIS and "-h" output parse-options API: add a usage_msg_optf() cat-file tests: test messaging on bad objects/paths cat-file tests: test bad usage	2022-02-05 09:42:31 -08:00
Junio C Hamano	4f4b18497a	Merge branch 'es/test-chain-lint' Broken &&-chains in the test scripts have been corrected. * es/test-chain-lint: t6000-t9999: detect and signal failure within loop t5000-t5999: detect and signal failure within loop t4000-t4999: detect and signal failure within loop t0000-t3999: detect and signal failure within loop tests: simplify by dropping unnecessary `for` loops tests: apply modern idiom for exiting loop upon failure tests: apply modern idiom for signaling test failure tests: fix broken &&-chains in `{...}` groups tests: fix broken &&-chains in `$(...)` command substitutions tests: fix broken &&-chains in compound statements tests: use test_write_lines() to generate line-oriented output tests: simplify construction of large blocks of text t9107: use shell parameter expansion to avoid breaking &&-chain t6300: make `%(raw:size) --shell` test more robust t5516: drop unnecessary subshell and command invocation t4202: clarify intent by creating expected content less cleverly t1020: avoid aborting entire test script when one test fails t1010: fix unnoticed failure on Windows t/lib-pager: use sane_unset() to avoid breaking &&-chain	2022-01-03 16:24:15 -08:00
Ævar Arnfjörð Bjarmason	b3fe468075	cat-file: fix remaining usage bugs With the migration of --batch-all-objects to OPT_CMDMODE() in the preceding commit one bug with combining it and other OPT_CMDMODE() options was solved, but we were still left with e.g. --buffer silently being discarded when not in batch mode. Fix all those bugs, and in addition emit errors telling the user specifically what options can't be combined with what other options, before this we'd usually just emit the cryptic usage text and leave the users to work it out by themselves. This change is rather large, because to do so we need to untangle the options processing so that we can not only error out, but emit sensible errors, and e.g. emit errors about options before errors about stray argc elements (as they might become valid if the option were removed). Some of the output changes ("error:" to "fatal:" with usage_msg_opt[f]()), but none of the exit codes change, except in those cases where we silently accepted bad option combinations before, now we'll error out. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-30 13:05:29 -08:00
Ævar Arnfjörð Bjarmason	485fd2c3da	cat-file: make --batch-all-objects a CMDMODE The usage of OPT_CMDMODE() in "cat-file"[1] was added in parallel with the development of[3] the --batch-all-objects option[4], so we've since grown[5] checks that it can't be combined with other command modes, when it should just be made a top-level command-mode instead. It doesn't combine with --filters, --textconv etc. By giving parse_options() information about what options are mutually exclusive with one another we can get the die() message being removed here for free, we didn't even use that removed message in some cases, e.g. for both of: --batch-all-objects --textconv --batch-all-objects --filters We'd take the "goto usage" in the "if (opt)" branch, and never reach the previous message. Now we'll emit e.g.: $ git cat-file --batch-all-objects --filters error: option `filters' is incompatible with --batch-all-objects 1. `b48158ac94` (cat-file: make the options mutually exclusive, 2015-05-03) 2. https://lore.kernel.org/git/xmqqtwspgusf.fsf@gitster.dls.corp.google.com/ 3. https://lore.kernel.org/git/20150622104559.GG14475@peff.net/ 4. `6a951937ae` (cat-file: add --batch-all-objects option, 2015-06-22) 5. `321459439e` (cat-file: support --textconv/--filters in batch mode, 2016-09-09) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-30 13:05:29 -08:00
Ævar Arnfjörð Bjarmason	ddf8420b59	cat-file tests: test bad usage Stress test the usage emitted when options are combined in ways that isn't supported. Let's test various option combinations, some of these we buggily allow right now. E.g. this reveals a bug in `321459439e` (cat-file: support --textconv/--filters in batch mode, 2016-09-09) that we'll fix in a subsequent commit. We're supposed to be emitting a relevant message when --batch-all-objects is combined with --textconv or --filters, but we don't. The cases of needing to assign to opt=2 in the "opt" loop are because on those we do the right thing already, in subsequent commits the "test_expect_failure" cases will be fixed, and the for-loops unified. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-30 13:05:28 -08:00
Eric Sunshine	7abcbcb7ea	tests: fix broken &&-chains in `{...}` groups The top-level &&-chain checker built into t/test-lib.sh causes tests to magically exit with code 117 if the &&-chain is broken. However, it has the shortcoming that the magic does not work within `{...}` groups, `(...)` subshells, `$(...)` substitutions, or within bodies of compound statements, such as `if`, `for`, `while`, `case`, etc. `chainlint.sed` partly fills in the gap by catching broken &&-chains in `(...)` subshells, but bugs can still lurk behind broken &&-chains in the other cases. Fix broken &&-chains in `{...}` groups in order to reduce the number of possible lurking bugs. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 10:29:48 -08:00
Eric Sunshine	c576868eaf	tests: fix broken &&-chains in `$(...)` command substitutions The top-level &&-chain checker built into t/test-lib.sh causes tests to magically exit with code 117 if the &&-chain is broken. However, it has the shortcoming that the magic does not work within `{...}` groups, `(...)` subshells, `$(...)` substitutions, or within bodies of compound statements, such as `if`, `for`, `while`, `case`, etc. `chainlint.sed` partly fills in the gap by catching broken &&-chains in `(...)` subshells, but bugs can still lurk behind broken &&-chains in the other cases. Fix broken &&-chains in `$(...)` command substitutions in order to reduce the number of possible lurking bugs. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 10:29:48 -08:00
Han-Wen Nienhuys	e9706a188f	refs: introduce REF_SKIP_OID_VERIFICATION flag This lets the ref-store test helper write non-existent or unparsable objects into the ref storage. Use this to make t1006 and t3800 independent of the files storage backend. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-07 13:15:19 -08:00
Junio C Hamano	061a21d36d	Merge branch 'ab/fsck-unexpected-type' "git fsck" has been taught to report mismatch between expected and actual types of an object better. * ab/fsck-unexpected-type: fsck: report invalid object type-path combinations fsck: don't hard die on invalid object types object-file.c: stop dying in parse_loose_header() object-file.c: return ULHR_TOO_LONG on "header too long" object-file.c: use "enum" return type for unpack_loose_header() object-file.c: simplify unpack_loose_short_header() object-file.c: make parse_loose_header_extended() public object-file.c: return -1, not "status" from unpack_loose_header() object-file.c: don't set "typep" when returning non-zero cat-file tests: test for current --allow-unknown-type behavior cat-file tests: add corrupt loose object test cat-file tests: test for missing/bogus object with -t, -s and -p cat-file tests: move bogus_* variable declarations earlier fsck tests: test for garbage appended to a loose object fsck tests: test current hash/type mismatch behavior fsck tests: refactor one test to use a sub-repo fsck tests: add test for fsck-ing an unknown type	2021-10-25 16:06:56 -07:00
Jeff King	5c5b29b459	cat-file: disable refs/replace with --batch-all-objects When we're enumerating all objects in the object database, it doesn't make sense to respect refs/replace. The point of this option is to enumerate all of the objects in the database at a low level. By definition we'd already show the replacement object's contents (under its real oid), and showing those contents under another oid is almost certainly working against what the user is trying to do. Note that you could make the same argument for something like: git show-index <foo.idx \| awk '{print $2}' \| git cat-file --batch but there we can't know in cat-file exactly what the user intended, because we don't know the source of the input. They could be trying to do low-level debugging, or they could be doing something more high-level (e.g., imagine a porcelain built around cat-file for its object accesses). So in those cases, we'll have to rely on the user specifying "git --no-replace-objects" to tell us what to do. One _could_ make an argument that "cat-file --batch" is sufficiently low-level plumbing that it should not respect replace-objects at all (and the caller should do any replacement if they want it). But we have been doing so for some time. The history is a little tangled: - looking back as far as v1.6.6, we would not respect replace refs for --batch-check, but would for --batch (because the former used sha1_object_info(), and the replace mechanism only affected actual object reads) - this discrepancy was made even weirder by `98e2092b50` (cat-file: teach --batch to stream blob objects, 2013-07-10), where we always output the header using the --batch-check code, and then printed the object separately. This could lead to "cat-file --batch" dying (when it notices the size or type changed for a non-blob object) or even producing bogus output (in streaming mode, we didn't notice that we wrote the wrong number of bytes). - that persisted until `1f7117ef7a` (sha1_file: perform object replacement in sha1_object_info_extended(), 2013-12-11), which then respected replace refs for both forms. So it has worked reliably this way for over 7 years, and we should make sure it continues to do so. That could also be an argument that --batch-all-objects should not change behavior (which this patch is doing), but I really consider the current behavior to be an unintended bug. It's a side effect of how the code is implemented (feeding the oids back into oid_object_info() rather than looking at what we found while reading the loose and packed object storage). The implementation is straight-forward: we just disable the global read_replace_refs flag when we're in --batch-all-objects mode. It would perhaps be a little cleaner to change the flag we pass to oid_object_info_extended(), but that's not enough. We also read objects via read_object_file() and stream_blob_to_fd(). The former could switch to its _extended() form, but the streaming code has no mechanism for disabling replace refs. Setting the global flag works, and as a bonus, it's impossible to have any "oops, we're sometimes replacing the object and sometimes not" bugs in the output (like the ones caused by `98e2092b50` above). The tests here cover the regular-input and --batch-all-objects cases, for both --batch-check and --batch. There is a test in t6050 that covers the regular-input case with --batch already, but this new one goes much further in actually verifying the output (plus covering --batch-check explicitly). This is perhaps a little overkill and the tests would be simpler just covering --batch-check, but I wanted to make sure we're checking that --batch output is consistent between the header and the content. The global-flag technique used here makes that easy to get right, but this is future-proofing us against regressions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-08 15:45:14 -07:00
Jeff King	e879295b20	t1006: clean up broken objects A few of the tests create intentionally broken objects with broken types. Let's clean them up after we're done with them, so that later tests don't get confused (we hadn't noticed because this only affects tests which use --batch-all-objects, but I'm about to add more). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-08 15:45:14 -07:00
Ævar Arnfjörð Bjarmason	96e41f58fe	fsck: report invalid object type-path combinations Improve the error that's emitted in cases where we find a loose object we parse, but which isn't at the location we expect it to be. Before this change we'd prefix the error with a not-a-OID derived from the path at which the object was found, due to an emergent behavior in how we'd end up with an "OID" in these codepaths. Now we'll instead say what object we hashed, and what path it was found at. Before this patch series e.g.: $ git hash-object --stdin -w -t blob </dev/null `e69de29bb2` $ mv objects/e6/ objects/e7 Would emit ("[...]" used to abbreviate the OIDs): git fsck error: hash mismatch for ./objects/e7/9d[...] (expected e79d[...]) error: e79d[...]: object corrupt or missing: ./objects/e7/9d[...] Now we'll instead emit: error: e69d[...]: hash-path mismatch, found at: ./objects/e7/9d[...] Furthermore, we'll do the right thing when the object type and its location are bad. I.e. this case: $ git hash-object --stdin -w -t garbage --literally </dev/null 8315a83d2acc4c174aed59430f9a9c4ed926440f $ mv objects/83 objects/84 As noted in an earlier commits we'd simply die early in those cases, until preceding commits fixed the hard die on invalid object type: $ git fsck fatal: invalid object type Now we'll instead emit sensible error messages: $ git fsck error: 8315[...]: hash-path mismatch, found at: ./objects/84/15[...] error: 8315[...]: object is of unknown type 'garbage': ./objects/84/15[...] In both fsck.c and object-file.c we're using null_oid as a sentinel value for checking whether we got far enough to be certain that the issue was indeed this OID mismatch. We need to add the "object corrupt or missing" special-case to deal with cases where read_loose_object() will return an error before completing check_object_signature(), e.g. if we have an error in unpack_loose_rest() because we find garbage after the valid gzip content: $ git hash-object --stdin -w -t blob </dev/null `e69de29bb2` $ chmod 755 objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 $ echo garbage >>objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 $ git fsck error: garbage at end of loose object 'e69d[...]' error: unable to unpack contents of ./objects/e6/9d[...] error: e69d[...]: object corrupt or missing: ./objects/e6/9d[...] There is currently some weird messaging in the edge case when the two are combined, i.e. because we're not explicitly passing along an error state about this specific scenario from check_stream_oid() via read_loose_object() we'll end up printing the null OID if an object is of an unknown type and it can't be unpacked by zlib, e.g.: $ git hash-object --stdin -w -t garbage --literally </dev/null 8315a83d2acc4c174aed59430f9a9c4ed926440f $ chmod 755 objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f $ echo garbage >>objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f $ /usr/bin/git fsck fatal: invalid object type $ ~/g/git/git fsck error: garbage at end of loose object '8315a83d2acc4c174aed59430f9a9c4ed926440f' error: unable to unpack contents of ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f error: 8315a83d2acc4c174aed59430f9a9c4ed926440f: object corrupt or missing: ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f error: 0000000000000000000000000000000000000000: object is of unknown type 'garbage': ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f [...] I think it's OK to leave that for future improvements, which would involve enum-ifying more error state as we've done with "enum unpack_loose_header_result" in preceding commits. In these increasingly more obscure cases the worst that can happen is that we'll get slightly nonsensical or inapplicable error messages. There's other such potential edge cases, all of which might produce some confusing messaging, but still be handled correctly as far as passing along errors goes. E.g. if check_object_signature() returns and oideq(real_oid, null_oid()) is true, which could happen if it returns -1 due to the read_istream() call having failed. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 15:06:01 -07:00
Ævar Arnfjörð Bjarmason	5848fb11ac	object-file.c: return ULHR_TOO_LONG on "header too long" Split up the return code for "header too long" from the generic negative return value unpack_loose_header() returns, and report via error() if we exceed MAX_HEADER_LEN. As a test added earlier in this series in t1006-cat-file.sh shows we'll correctly emit zlib errors from zlib.c already in this case, so we have no need to carry those return codes further down the stack. Let's instead just return ULHR_TOO_LONG saying we ran into the MAX_HEADER_LEN limit, or other negative values for "unable to unpack <OID> header". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 15:06:00 -07:00
Ævar Arnfjörð Bjarmason	dd45a56246	cat-file tests: test for current --allow-unknown-type behavior Add more tests for the current --allow-unknown-type behavior. As noted in [1] I don't think much of this makes sense, but let's test for it as-is so we can see if the behavior changes in the future. 1. https://lore.kernel.org/git/87r1i4qf4h.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 15:06:00 -07:00
Ævar Arnfjörð Bjarmason	7e7d220d9d	cat-file tests: add corrupt loose object test Fix a blindspot in the tests for "cat-file" (and by proxy, the guts of object-file.c) by testing that when we can't decode a loose object with zlib we'll emit an error from zlib.c. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 15:05:59 -07:00
Ævar Arnfjörð Bjarmason	59b8283d55	cat-file tests: test for missing/bogus object with -t, -s and -p When we look up a missing object with cat_one_file() what error we print out currently depends on whether we'll error out early in get_oid_with_context(), or if we'll get an error later from oid_object_info_extended(). The --allow-unknown-type flag then changes whether we pass the "OBJECT_INFO_ALLOW_UNKNOWN_TYPE" flag to get_oid_with_context() or not. The "-p" flag is yet another special-case in printing the same output on the deadbeef OID as we'd emit on the deadbeef_short OID for the "-s" and "-t" options, it also doesn't support the "--allow-unknown-type" flag at all. Let's test the combination of the two sets of [-t, -s, -p] and [--{no-}allow-unknown-type] (the --no-allow-unknown-type is implicit in not supplying it), as well as a [missing,bogus] object pair. This extends tests added in `3e370f9faf` (t1006: add tests for git cat-file --allow-unknown-type, 2015-05-03). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 15:05:59 -07:00
Ævar Arnfjörð Bjarmason	70e4a57762	cat-file tests: move bogus_* variable declarations earlier Change the short/long bogus bogus object type variables into a form where the two sets can be used concurrently. This'll be used by subsequently added tests. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 15:05:59 -07:00

1 2

87 Commits (2bc5414c411aab33c155b1070b7764ef6a49a02d)