kernel/git - git - PowerEL Git System

Commit Graph

Author	SHA1	Message	Date
Junio C Hamano	e2850a27a9	Second batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Junio C Hamano	898f80736c	Git 2.29.2 Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Junio C Hamano	f9b6481aed	First batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Junio C Hamano	1d1c4a8759	other small fixes for 2.29.2 Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Junio C Hamano	b927c80531	Git 2.29.1 Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Christian Couder	3e0a5dc9af	filter-branch doc: fix filter-repo typo The name of the tool is 'git-filter-repo' not 'git-repo-filter'. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Rafael Silva	c57b3367be	worktree: teach `list` to annotate locked worktree The "git worktree list" shows the absolute path to the working tree, the commit that is checked out and the name of the branch. It is not immediately obvious which of the worktrees, if any, are locked. "git worktree remove" refuses to remove a locked worktree with an error message. If "git worktree list" told which worktrees are locked in its output, the user would not even attempt to remove such a worktree, or would realize that "git worktree remove -f -f <path>" is required. Teach "git worktree list" to append "locked" to its output. The output from the command becomes like so: $ git worktree list /path/to/main abc123 [master] /path/to/worktree 456def (detached HEAD) /path/to/locked-worktree 123abc (detached HEAD) locked Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Junio C Hamano	d4a392452e	Git 2.29-rc1 Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Jean-Noël Avila	9f443f5531	doc: fix the bnf like style of some commands In command line options, variables are entered between < and > Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Jean-Noël Avila	89eed6fa99	doc: git-remote fix ups Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Jean-Noël Avila	49fbf9ed71	doc: use linkgit macro where needed. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Jean-Noël Avila	df49a806ab	git-bisect-lk2009: make continuation of list indented That's clearer asciidoc formatting. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Denton Liu	64f1f58fe7	checkout: learn to respect checkout.guess The current behavior of git checkout/switch is that --guess is currently enabled by default. However, some users may not wish for this to happen automatically. Instead of forcing users to specify --no-guess manually each time, teach these commands the checkout.guess configuration variable that gives users the option to set a default behavior. Teach the completion script to recognize the new config variable and disable DWIM logic if it is set to false. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Denton Liu	c693ef781b	Doc: document "A...B" form for <tree-ish> in checkout and switch Using "A...B" has been supported for the <tree-ish> argument for a while. However, its support has never been explicitly documented. Explicitly document it so that users know that it is available. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Denton Liu	ef09e7ddf3	Documentation/config/checkout: replace sq with backticks The modern style for Git documentation is to use backticks to quote any command-line documenation so that it is typeset in monospace. Replace all single quotes with backticks to conform to this. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Junio C Hamano	d98273ba77	Git 2.29-rc0 Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Samanta Navarro	3be01e5ab1	fast-import: fix typo in documentation Signed-off-by: Samanta Navarro <ferivoz@riseup.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Philippe Blain	7d15fdbe4c	gitsubmodules doc: invoke 'ls-files' with '--recurse-submodules' `git ls-files` was never taught to respect the `submodule.recurse` configuration variable, and it is too late now to change that [1], but still the command is mentioned in 'gitsubmodules(7)' as if it does respect that config. Adjust the call in 'gitsubmodules(7)' by calling 'ls-files' with the '--recurse-submodules' option. While at it, uniformize the capitalization in that file, and use backticks instead of quotes for Git commands and configuration variables. [1] https://lore.kernel.org/git/pull.732.git.1599707259907.gitgitgadget@gmail.com/T/#u Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Junio C Hamano	ab4691b67b	Nineteenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Srinidhi Kaushik	3b5bf96573	t, doc: update tests, reference for "--force-if-includes" Update test cases for the new option, and document its usage and update related references. Update test cases for the new option, and document its usage and update related references. - t/t5533-push-cas.sh: Update test cases for "compare-and-swap" when used along with "--force-if-includes" helps mitigate overwrites when remote refs are updated in the background; allows forced updates when changes from remote are integrated locally. - Documentation: Add reference for the new option, configuration setting ("push.useForceIfIncludes") and advise messages. Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Jacob Keller	7efba5fa39	format-patch: teach format.useAutoBase "whenAble" option The format.useAutoBase configuration option exists to allow users to enable '--base=auto' for format-patch by default. This can sometimes lead to poor workflow, due to unexpected failures when attempting to format an ancient patch: $ git format-patch -1 <an old commit> fatal: base commit shouldn't be in revision list This can be very confusing, as it is not necessarily immediately obvious that the user requested a --base (since this was in the configuration, not on the command line). We do want --base=auto to fail when it cannot provide a suitable base, as it would be equally confusing if a formatted patch did not include the base information when it was requested. Teach format.useAutoBase a new mode, "whenAble". This mode will cause format-patch to attempt to include a base commit when it can. However, if no valid base commit can be found, then format-patch will continue formatting the patch without a base commit. In order to avoid making yet another branch name unusable with --base, do not teach --base=whenAble or --base=whenable. Instead, refactor the base_commit option to use a callback, and rely on the global configuration variable auto_base. This does mean that a user cannot request this optional base commit generation from the command line. However, this is likely not too valuable. If the user requests base information manually, they will be immediately informed of the failure to acquire a suitable base commit. This allows the user to make an informed choice about whether to continue the format. Add tests to cover the new mode of operation for --base. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Sean Barag	de9ed3ef37	clone: allow configurable default for `-o`/`--origin` While the default remote name of "origin" can be changed at clone-time with `git clone`'s `--origin` option, it was previously not possible to specify a default value for the name of that remote. Add support for a new `clone.defaultRemoteName` config, with the newly-created remote name resolved in priority order: 1. (Highest priority) A remote name passed directly to `git clone -o` 2. A `clone.defaultRemoteName=new_name` in config `git clone -c` 3. A `clone.defaultRemoteName` value set in `/path/to/template/config`, where `--template=/path/to/template` is provided 4. A `clone.defaultRemoteName` value set in a non-template config file 5. The default value of `origin` Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Helped-by: Derrick Stolee <stolee@gmail.com> Helped-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Sean Barag <sean@barag.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Jacob Keller	c0192df630	refspec: add support for negative refspecs Both fetch and push support pattern refspecs which allow fetching or pushing references that match a specific pattern. Because these patterns are globs, they have somewhat limited ability to express more complex situations. For example, suppose you wish to fetch all branches from a remote except for a specific one. To allow this, you must setup a set of refspecs which match only the branches you want. Because refspecs are either explicit name matches, or simple globs, many patterns cannot be expressed. Add support for a new type of refspec, referred to as "negative" refspecs. These are prefixed with a '^' and mean "exclude any ref matching this refspec". They can only have one "side" which always refers to the source. During a fetch, this refers to the name of the ref on the remote. During a push, this refers to the name of the ref on the local side. With negative refspecs, users can express more complex patterns. For example: git fetch origin refs/heads/:refs/remotes/origin/ ^refs/heads/dontwant will fetch all branches on origin into remotes/origin, but will exclude fetching the branch named dontwant. Refspecs today are commutative, meaning that order doesn't expressly matter. Rather than forcing an implied order, negative refspecs will always be applied last. That is, in order to match, a ref must match at least one positive refspec, and match none of the negative refspecs. This is similar to how negative pathspecs work. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Junio C Hamano	306ee63a70	Eighteenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Evan Gates	287416dba6	Doc: show example scissors line The text tries to say the code accepts many variations that look remotely like scissors and perforation marks, but gives too little detail for users to decide what is and what is not taken as a scissors line for themselves. Instead of describing the heuristics more, just spell out what will always be accepted, namely "-- >8 --", as it would not help users to give them more choices and flexibility and be "creative" in their scissors line. Signed-off-by: Evan Gates <evan.gates@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Martin Ågren	71ccaa0993	config/uploadpack.txt: fix typo in `--filter=tree:<n>` That should be a ":", not a second "=". While at it, refer to the placeholder "<n>" as "<n>", not "n" (see, e.g., the entry just before this one). Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Martin Ågren	10a758479e	config/fmt-merge-msg.txt: drop space in quote We document how `merge.suppressDest` can be used to omit " into <branch name>" from the title of the merge message. It is true that we omit the space character before "into", but that lone double quote character risks ending up on the wrong side of a line break, looking a bit out of place. This currently happens with, e.g., 80-character terminals. Drop that leading quoted space. The result should be just as clear about how this option affects the formatted message. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Jeff King	63d24fa0b0	shortlog: allow multiple groups to be specified Now that shortlog supports reading from trailers, it can be useful to combine counts from multiple trailers, or between trailers and authors. This can be done manually by post-processing the output from multiple runs, but it's non-trivial to make sure that each name/commit pair is counted only once. This patch teaches shortlog to accept multiple --group options on the command line, and pull data from all of them. That makes it possible to run: git shortlog -ns --group=author --group=trailer:co-authored-by to get a shortlog that counts authors and co-authors equally. The implementation is mostly straightforward. The "group" enum becomes a bitfield, and the trailer key becomes a list. I didn't bother implementing the multi-group semantics for reading from stdin. It would be possible to do, but the existing matching code makes it awkward, and I doubt anybody cares. The duplicate suppression we used for trailers now covers authors and committers as well (though in non-trailer single-group mode we can skip the hash insertion and lookup, since we only see one value per commit). There is one subtlety: we now care about the case when no group bit is set (in which case we default to showing the author). The caller in builtin/log.c needs to be adapted to ask explicitly for authors, rather than relying on shortlog_init(). It would be possible with some gymnastics to make this keep working as-is, but it's not worth it for a single caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Jeff King	56d5dde752	shortlog: parse trailer idents Trailers don't necessarily contain name/email identity values, so shortlog has so far treated them as opaque strings. However, since many trailers do contain identities, it's useful to treat them as such when they can be parsed. That lets "-e" work as usual, as well as mailmap. When they can't be parsed, we'll continue with the old behavior of treating them as a single string (there's no new test for that here, since the existing tests cover a trailer like this). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Jeff King	f17b0b99bf	shortlog: de-duplicate trailer values The current documentation is vague about what happens with --group=trailer:signed-off-by when we see a commit with: Signed-off-by: One Signed-off-by: Two Signed-off-by: One We clearly should credit both "One" and "Two", but should "One" get credited twice? The current code does so, but mostly because that was the easiest thing to do. It's probably more useful to count each commit at most once. This will become especially important when we allow values from multiple sources in a future patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Jeff King	47beb37bc6	shortlog: match commit trailers with --group If a project uses commit trailers, this patch lets you use shortlog to see who is performing each action. For example, running: git shortlog -ns --group=trailer:reviewed-by in git.git shows who has reviewed. You can even use a custom format to see things like who has helped whom: git shortlog --format="...helped %an (%ad)" \ --group=trailer:helped-by Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Jeff King	92338c450b	shortlog: add grouping option In preparation for adding more grouping types, let's refactor the committer/author grouping code and add a user-facing option that binds them together. In particular: - the main option is now "--group", to make it clear that the various group types are mutually exclusive. The "--committer" option is an alias for "--group=committer". - we keep an enum rather than a binary flag, to prepare for more values - we prefer switch statements to ternary assignment, since other group types will need more custom code Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Junio C Hamano	9bc233ae1c	Seventeenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Jeff King	eb049759fb	protocol: re-enable v2 protocol by default Protocol v2 became the default in v2.26.0 via `684ceae32d` (fetch: default to protocol version 2, 2019-12-23). More widespread use turned up a regression in negotiation. That was fixed in v2.27.0 via `4fa3f00abb` (fetch-pack: in protocol v2, in_vain only after ACK, 2020-04-27), but we also reverted the default to v0 as a precuation in `11c7f2a30b` (Revert "fetch: default to protocol version 2", 2020-04-22). In v2.28.0, we re-enabled it for experimental users with `3697caf4b9` (config: let feature.experimental imply protocol.version=2, 2020-05-20) and haven't heard any complaints. v2.28 has only been out for 2 months, but I'd generally expect people turning on feature.experimental to also stay pretty up-to-date. So we're not likely to collect much more data by waiting. In addition, we have no further reports from people running v2.26.0, and of course some people have been setting protocol.version manually for ages. Let's move forward with v2 as the default again. It's possible there are still lurking bugs, but we won't know until it gets more widespread use. And we can find and squash them just like any other bug at this point. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Derrick Stolee	e841a79a13	maintenance: add incremental-repack auto condition The incremental-repack task updates the multi-pack-index by deleting pack- files that have been replaced with new packs, then repacking a batch of small pack-files into a larger pack-file. This incremental repack is faster than rewriting all object data, but is slower than some other maintenance activities. The 'maintenance.incremental-repack.auto' config option specifies how many pack-files should exist outside of the multi-pack-index before running the step. These pack-files could be created by 'git fetch' commands or by the loose-objects task. The default value is 10. Setting the option to zero disables the task with the '--auto' option, and a negative value makes the task run every time. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Derrick Stolee	52fe41ff1c	maintenance: add incremental-repack task The previous change cleaned up loose objects using the 'loose-objects' that can be run safely in the background. Add a similar job that performs similar cleanups for pack-files. One issue with running 'git repack' is that it is designed to repack all pack-files into a single pack-file. While this is the most space-efficient way to store object data, it is not time or memory efficient. This becomes extremely important if the repo is so large that a user struggles to store two copies of the pack on their disk. Instead, perform an "incremental" repack by collecting a few small pack-files into a new pack-file. The multi-pack-index facilitates this process ever since 'git multi-pack-index expire' was added in `19575c7` (multi-pack-index: implement 'expire' subcommand, 2019-06-10) and 'git multi-pack-index repack' was added in `ce1e4a1` (midx: implement midx_repack(), 2019-06-10). The 'incremental-repack' task runs the following steps: 1. 'git multi-pack-index write' creates a multi-pack-index file if one did not exist, and otherwise will update the multi-pack-index with any new pack-files that appeared since the last write. This is particularly relevant with the background fetch job. When the multi-pack-index sees two copies of the same object, it stores the offset data into the newer pack-file. This means that some old pack-files could become "unreferenced" which I will use to mean "a pack-file that is in the pack-file list of the multi-pack-index but none of the objects in the multi-pack-index reference a location inside that pack-file." 2. 'git multi-pack-index expire' deletes any unreferenced pack-files and updaes the multi-pack-index to drop those pack-files from the list. This is safe to do as concurrent Git processes will see the multi-pack-index and not open those packs when looking for object contents. (Similar to the 'loose-objects' job, there are some Git commands that open pack-files regardless of the multi-pack-index, but they are rarely used. Further, a user that self-selects to use background operations would likely refrain from using those commands.) 3. 'git multi-pack-index repack --bacth-size=<size>' collects a set of pack-files that are listed in the multi-pack-index and creates a new pack-file containing the objects whose offsets are listed by the multi-pack-index to be in those objects. The set of pack- files is selected greedily by sorting the pack-files by modified time and adding a pack-file to the set if its "expected size" is smaller than the batch size until the total expected size of the selected pack-files is at least the batch size. The "expected size" is calculated by taking the size of the pack-file divided by the number of objects in the pack-file and multiplied by the number of objects from the multi-pack-index with offset in that pack-file. The expected size approximates how much data from that pack-file will contribute to the resulting pack-file size. The intention is that the resulting pack-file will be close in size to the provided batch size. The next run of the incremental-repack task will delete these repacked pack-files during the 'expire' step. In this version, the batch size is set to "0" which ignores the size restrictions when selecting the pack-files. It instead selects all pack-files and repacks all packed objects into a single pack-file. This will be updated in the next change, but it requires doing some calculations that are better isolated to a separate change. These steps are based on a similar background maintenance step in Scalar (and VFS for Git) [1]. This was incredibly effective for users of the Windows OS repository. After using the same VFS for Git repository for over a year, some users had _thousands_ of pack-files that combined to up to 250 GB of data. We noticed a few users were running into the open file descriptor limits (due in part to a bug in the multi-pack-index fixed by `af96fe3` (midx: add packs to packed_git linked list, 2019-04-29). These pack-files were mostly small since they contained the commits and trees that were pushed to the origin in a given hour. The GVFS protocol includes a "prefetch" step that asks for pre-computed pack- files containing commits and trees by timestamp. These pack-files were grouped into "daily" pack-files once a day for up to 30 days. If a user did not request prefetch packs for over 30 days, then they would get the entire history of commits and trees in a new, large pack-file. This led to a large number of pack-files that had poor delta compression. By running this pack-file maintenance step once per day, these repos with thousands of packs spanning 200+ GB dropped to dozens of pack- files spanning 30-50 GB. This was done all without removing objects from the system and using a constant batch size of two gigabytes. Once the work was done to reduce the pack-files to small sizes, the batch size of two gigabytes means that not every run triggers a repack operation, so the following run will not expire a pack-file. This has kept these repos in a "clean" state. [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/PackfileMaintenanceStep.cs Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Derrick Stolee	18e449f86b	midx: enable core.multiPackIndex by default The core.multiPackIndex setting has been around since `c4d25228eb` (config: create core.multiPackIndex setting, 2018-07-12), but has been disabled by default. If a user wishes to use the multi-pack-index feature, then they must enable this config and run 'git multi-pack-index write'. The multi-pack-index feature is relatively stable now, so make the config option true by default. For users that do not use a multi-pack-index, the only extra cost will be a file lookup to see if a multi-pack-index file exists (once per process, per object directory). Also, this config option will be referenced by an upcoming "incremental-repack" task in the maintenance builtin, so move the config option into the repository settings struct. Note that if GIT_TEST_MULTI_PACK_INDEX=1, then we want to ignore the config option and treat core.multiPackIndex as enabled. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Derrick Stolee	3e220e6069	maintenance: create auto condition for loose-objects The loose-objects task deletes loose objects that already exist in a pack-file, then place the remaining loose objects into a new pack-file. If this step runs all the time, then we risk creating pack-files with very few objects with every 'git commit' process. To prevent overwhelming the packs directory with small pack-files, place a minimum number of objects to justify the task. The 'maintenance.loose-objects.auto' config option specifies a minimum number of loose objects to justify the task to run under the '--auto' option. This defaults to 100 loose objects. Setting the value to zero will prevent the step from running under '--auto' while a negative value will force it to run every time. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Derrick Stolee	252cfb7cb8	maintenance: add loose-objects task One goal of background maintenance jobs is to allow a user to disable auto-gc (gc.auto=0) but keep their repository in a clean state. Without any cleanup, loose objects will clutter the object database and slow operations. In addition, the loose objects will take up extra space because they are not stored with deltas against similar objects. Create a 'loose-objects' task for the 'git maintenance run' command. This helps clean up loose objects without disrupting concurrent Git commands using the following sequence of events: 1. Run 'git prune-packed' to delete any loose objects that exist in a pack-file. Concurrent commands will prefer the packed version of the object to the loose version. (Of course, there are exceptions for commands that specifically care about the location of an object. These are rare for a user to run on purpose, and we hope a user that has selected background maintenance will not be trying to do foreground maintenance.) 2. Run 'git pack-objects' on a batch of loose objects. These objects are grouped by scanning the loose object directories in lexicographic order until listing all loose objects -or- reaching 50,000 objects. This is more than enough if the loose objects are created only by a user doing normal development. We noticed users with _millions_ of loose objects because VFS for Git downloads blobs on-demand when a file read operation requires populating a virtual file. This step is based on a similar step in Scalar [1] and VFS for Git. [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/LooseObjectsStep.cs Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Derrick Stolee	28cb5e66dd	maintenance: add prefetch task When working with very large repositories, an incremental 'git fetch' command can download a large amount of data. If there are many other users pushing to a common repo, then this data can rival the initial pack-file size of a 'git clone' of a medium-size repo. Users may want to keep the data on their local repos as close as possible to the data on the remote repos by fetching periodically in the background. This can break up a large daily fetch into several smaller hourly fetches. The task is called "prefetch" because it is work done in advance of a foreground fetch to make that 'git fetch' command much faster. However, if we simply ran 'git fetch <remote>' in the background, then the user running a foreground 'git fetch <remote>' would lose some important feedback when a new branch appears or an existing branch updates. This is especially true if a remote branch is force-updated and this isn't noticed by the user because it occurred in the background. Further, the functionality of 'git push --force-with-lease' becomes suspect. When running 'git fetch <remote> <options>' in the background, use the following options for careful updating: 1. --no-tags prevents getting a new tag when a user wants to see the new tags appear in their foreground fetches. 2. --refmap= removes the configured refspec which usually updates refs/remotes/<remote>/* with the refs advertised by the remote. While this looks confusing, this was documented and tested by `b40a50264a` (fetch: document and test --refmap="", 2020-01-21), including this sentence in the documentation: Providing an empty `<refspec>` to the `--refmap` option causes Git to ignore the configured refspecs and rely entirely on the refspecs supplied as command-line arguments. 3. By adding a new refspec "+refs/heads/:refs/prefetch/<remote>/" we can ensure that we actually load the new values somewhere in our refspace while not updating refs/heads or refs/remotes. By storing these refs here, the commit-graph job will update the commit-graph with the commits from these hidden refs. 4. --prune will delete the refs/prefetch/<remote> refs that no longer appear on the remote. 5. --no-write-fetch-head prevents updating FETCH_HEAD. We've been using this step as a critical background job in Scalar [1] (and VFS for Git). This solved a pain point that was showing up in user reports: fetching was a pain! Users do not like waiting to download the data that was created while they were away from their machines. After implementing background fetch, the foreground fetch commands sped up significantly because they mostly just update refs and download a small amount of new data. The effect is especially dramatic when paried with --no-show-forced-udpates (through fetch.showForcedUpdates=false). [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/FetchStep.cs Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Junio C Hamano	e1cfff6765	Sixteenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
brian m. carlson	087c61677c	docs: explain how to deal with files that are always modified Users frequently have problems where two filenames differ only in case, causing one of those files to show up consistently as being modified. Let's add a FAQ entry that explains how to deal with that. In addition, let's explain another common case where files are consistently modified, which is when files using a smudge or clean filter have not been run through that filter. Explain the way to fix this as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
brian m. carlson	409f066716	docs: explain why reverts are not always applied on merge A common scenario is for a user to apply a change to one branch and cherry-pick it into another, then later revert it in the first branch. This results in the change being present when the two branches are merged, which is confusing to many users. We already have documentation for how this works in `git merge`, but it is clear from the frequency with which this is asked that it's hard to grasp. We also don't explain to users that they are better off doing a rebase in this case, which will do what they intended. Let's add an entry to the FAQ telling users what's happening and advising them to use rebase here. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
brian m. carlson	5065ce412e	docs: explain why squash merges are broken with long-running branches In many projects, squash merges are commonly used, primarily to keep a tidy history in the face of developers who do not use logically independent, bisectable commits. As common as this is, this tends to cause significant problems when squash merges are used to merge long-running branches due to the lack of any new merge bases. Even very experienced developers may make this mistake, so let's add a FAQ entry explaining why this is problematic and explaining that regular merge commits should be used to merge two long-running branches. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
René Scharfe	2947a7930d	archive: add --add-file Allow users to append non-tracked files. This simplifies the generation of source packages with a few extra files, e.g. containing version information. They get the same access times and user information as tracked files. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Junio C Hamano	385c171a01	Fifteenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Aaron Lipman	b59cdffd7e	Doc: prefer more specific file name Change filters.txt to ref-reachability-filters.txt in order to avoid squatting on a file name that might be useful for another purpose. Signed-off-by: Aaron Lipman <alipman88@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Taylor Blau	d356d5debe	commit-graph: introduce 'commitGraph.maxNewFilters' Introduce a configuration variable to specify a default value for the recently-introduce '--max-new-filters' option of 'git commit-graph write'. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Taylor Blau	809e0327f5	builtin/commit-graph.c: introduce '--max-new-filters=<n>' Introduce a command-line flag to specify the maximum number of new Bloom filters that a 'git commit-graph write' is willing to compute from scratch. Prior to this patch, a commit-graph write with '--changed-paths' would compute Bloom filters for all selected commits which haven't already been computed (i.e., by a previous commit-graph write with '--split' such that a roll-up or replacement is performed). This behavior can cause prohibitively-long commit-graph writes for a variety of reasons: * There may be lots of filters whose diffs take a long time to generate (for example, they have close to the maximum number of changes, diffing itself takes a long time, etc). * Old-style commit-graphs (which encode filters with too many entries as not having been computed at all) cause us to waste time recomputing filters that appear to have not been computed only to discover that they are too-large. This can make the upper-bound of the time it takes for 'git commit-graph write --changed-paths' to be rather unpredictable. To make this command behave more predictably, introduce '--max-new-filters=<n>' to allow computing at most '<n>' Bloom filters from scratch. This lets "computing" already-known filters proceed quickly, while bounding the number of slow tasks that Git is willing to do. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago
Taylor Blau	59f0d5073f	bloom: encode out-of-bounds filters as non-empty When a changed-path Bloom filter has either zero, or more than a certain number (commonly 512) of entries, the commit-graph machinery encodes it as "missing". More specifically, it sets the indices adjacent in the BIDX chunk as equal to each other to indicate a "length 0" filter; that is, that the filter occupies zero bytes on disk. This has heretofore been fine, since the commit-graph machinery has no need to care about these filters with too few or too many changed paths. Both cases act like no filter has been generated at all, and so there is no need to store them. In a subsequent commit, however, the commit-graph machinery will learn to only compute Bloom filters for some commits in the current commit-graph layer. This is a change from the current implementation which computes Bloom filters for all commits that are in the layer being written. Critically for this patch, only computing some of the Bloom filters means adding a third state for length 0 Bloom filters: zero entries, too many entries, or "hasn't been computed". It will be important for that future patch to distinguish between "not representable" (i.e., zero or too-many changed paths), and "hasn't been computed". In particular, we don't want to waste time recomputing filters that have already been computed. To that end, change how we store Bloom filters in the "computed but not representable" category: - Bloom filters with no entries are stored as a single byte with all bits low (i.e., all queries to that Bloom filter will return "definitely not") - Bloom filters with too many entries are stored as a single byte with all bits set high (i.e., all queries to that Bloom filter will return "maybe"). These rules are sufficient to not incur a behavior change by changing the on-disk representation of these two classes. Likewise, no specification changes are necessary for the commit-graph format, either: - Filters that were previously empty will be recomputed and stored according to the new rules, and - old clients reading filters generated by new clients will interpret the filters correctly and be none the wiser to how they were generated. Clients will invoke the Bloom machinery in more cases than before, but this can be addressed by returning a NULL filter when all bits are set high. This can be addressed in a future patch. Note that this does increase the size of on-disk commit-graphs, but far less than other proposals. In particular, this is generally more efficient than storing a bitmap for which commits haven't computed their Bloom filters. Storing a bitmap incurs a penalty of one bit per commit, whereas storing explicit filters as above incurs a penalty of one byte per too-large or empty commit. In practice, these boundary commits likely occupy a small proportion of the overall number of commits, and so the size penalty is likely smaller than storing a bitmap for all commits. See, for example, these relative proportions of such boundary commits (collected by SZEDER Gábor): \| Percentage of \| commit-graph \| \| \| commits modifying \| file size \| \| ├────────┬──────────────┼───────────────────┤ pct. \| \| 0 path \| >= 512 paths \| before \| after \| change \| ┌────────────────┼────────┼──────────────┼─────────┼─────────┼───────────┤ \| android-base \| 13.20% \| 0.13% \| 37.468M \| 37.534M \| +0.1741 % \| \| cmssw \| 0.15% \| 0.23% \| 17.118M \| 17.119M \| +0.0091 % \| \| cpython \| 3.07% \| 0.01% \| 7.967M \| 7.971M \| +0.0423 % \| \| elasticsearch \| 0.70% \| 1.00% \| 8.833M \| 8.835M \| +0.0128 % \| \| gcc \| 0.00% \| 0.08% \| 16.073M \| 16.074M \| +0.0030 % \| \| gecko-dev \| 0.14% \| 0.64% \| 59.868M \| 59.874M \| +0.0105 % \| \| git \| 0.11% \| 0.02% \| 3.895M \| 3.895M \| +0.0020 % \| \| glibc \| 0.02% \| 0.10% \| 3.555M \| 3.555M \| +0.0021 % \| \| go \| 0.00% \| 0.07% \| 3.186M \| 3.186M \| +0.0018 % \| \| homebrew-cask \| 0.40% \| 0.02% \| 7.035M \| 7.035M \| +0.0065 % \| \| homebrew-core \| 0.01% \| 0.01% \| 11.611M \| 11.611M \| +0.0002 % \| \| jdk \| 0.26% \| 5.64% \| 5.537M \| 5.540M \| +0.0590 % \| \| linux \| 0.01% \| 0.51% \| 63.735M \| 63.740M \| +0.0073 % \| \| llvm-project \| 0.12% \| 0.03% \| 25.515M \| 25.516M \| +0.0050 % \| \| rails \| 0.10% \| 0.10% \| 6.252M \| 6.252M \| +0.0027 % \| \| rust \| 0.07% \| 0.17% \| 9.364M \| 9.364M \| +0.0033 % \| \| tensorflow \| 0.09% \| 1.02% \| 7.009M \| 7.010M \| +0.0158 % \| \| webkit \| 0.05% \| 0.31% \| 17.405M \| 17.406M \| +0.0047 % \| (where the above increase is determined by computing a non-split commit-graph before and after this patch). Given that these projects are all "large" by commit count, the storage cost by writing these filters explicitly is negligible. In the most extreme example, android-base (which has 494,848 commits at the time of writing) would have its commit-graph increase by a modest 68.4 KB. Finally, a test to exercise filters which contain too many changed path entries will be introduced in a subsequent patch. Suggested-by: SZEDER Gábor <szeder.dev@gmail.com> Suggested-by: Jakub Narębski <jnareb@gmail.com> Helped-by: Derrick Stolee <dstolee@microsoft.com> Helped-by: SZEDER Gábor <szeder.dev@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	4 years ago

1 2 3 4 5 ...

13034 Commits (18f09473bf8685a1f8db254cff915d4c31e86406)