Merge branch 'ds/commit-graph-incremental'
The commits in a repository can be described by multiple commit-graph files
now, which allows the commit-graph files to be updated incrementally.

* ds/commit-graph-incremental:
  commit-graph: test verify across alternates
  commit-graph: normalize commit-graph filenames
  commit-graph: test --split across alternate without --split
  commit-graph: test octopus merges with --split
  commit-graph: clean up chains after flattened write
  commit-graph: verify chains with --shallow mode
  commit-graph: create options for split files
  commit-graph: expire commit-graph files
  commit-graph: allow cross-alternate chains
  commit-graph: merge commit-graph chains
  commit-graph: add --split option to builtin
  commit-graph: write commit-graph chains
  commit-graph: rearrange chunk count logic
  commit-graph: add base graphs chunk
  commit-graph: load commit-graph chains
  commit-graph: rename commit_compare to oid_compare
  commit-graph: prepare for commit-graph chains
  commit-graph: document commit-graph chains

maint
commit 92b1ea66b9
				|  | @ -10,7 +10,7 @@ SYNOPSIS | |||
| -------- | ||||
| [verse] | ||||
| 'git commit-graph read' [--object-dir <dir>] | ||||
| 'git commit-graph verify' [--object-dir <dir>] | ||||
| 'git commit-graph verify' [--object-dir <dir>] [--shallow] | ||||
| 'git commit-graph write' <options> [--object-dir <dir>] | ||||
|  | ||||
|  | ||||
|  | @ -26,7 +26,7 @@ OPTIONS | |||
| 	Use given directory for the location of packfiles and commit-graph | ||||
| 	file. This parameter exists to specify the location of an alternate | ||||
| 	that only has the objects directory, not a full `.git` directory. The | ||||
| 	commit-graph file is expected to be at `<dir>/info/commit-graph` and | ||||
| 	commit-graph file is expected to be in the `<dir>/info` directory and | ||||
| 	the packfiles are expected to be in `<dir>/pack`. | ||||
|  | ||||
|  | ||||
|  | @ -51,6 +51,25 @@ or `--stdin-packs`.) | |||
| + | ||||
| With the `--append` option, include all commits that are present in the | ||||
| existing commit-graph file. | ||||
| + | ||||
| With the `--split` option, write the commit-graph as a chain of multiple | ||||
| commit-graph files stored in `<dir>/info/commit-graphs`. The new commits | ||||
| not already in the commit-graph are added in a new "tip" file. This file | ||||
| is merged with the existing file if the following merge conditions are | ||||
| met: | ||||
| + | ||||
| * If `--size-multiple=<X>` is not specified, let `X` equal 2. If the new | ||||
| tip file would have `N` commits and the previous tip has `M` commits and | ||||
| `X` times `N` is greater than  `M`, instead merge the two files into a | ||||
| single file. | ||||
| + | ||||
| * If `--max-commits=<M>` is specified with `M` a positive integer, and the | ||||
| new tip file would have more than `M` commits, then instead merge the new | ||||
| tip with the previous tip. | ||||
| + | ||||
| Finally, if `--expire-time=<datetime>` is not specified, let `datetime` | ||||
| be the current time. After writing the split commit-graph, delete all | ||||
| unused commit-graph files whose modified times are older than `datetime`. | ||||
|  | ||||
| 'read':: | ||||
|  | ||||
|  | @ -61,6 +80,9 @@ Used for debugging purposes. | |||
|  | ||||
| Read the commit-graph file and verify its contents against the object | ||||
| database. Used to check for corrupted data. | ||||
| + | ||||
| With the `--shallow` option, only check the tip commit-graph file in | ||||
| a chain of split commit-graphs. | ||||
|  | ||||
|  | ||||
| EXAMPLES | ||||
|  |  | |||
|  | @ -44,8 +44,9 @@ HEADER: | |||
|  | ||||
|   1-byte number (C) of "chunks" | ||||
|  | ||||
|   1-byte (reserved for later use) | ||||
|      Current clients should ignore this value. | ||||
|   1-byte number (B) of base commit-graphs | ||||
|       We infer the length (H*B) of the Base Graphs chunk | ||||
|       from this value. | ||||
|  | ||||
| CHUNK LOOKUP: | ||||
|  | ||||
|  | @ -92,6 +93,12 @@ CHUNK DATA: | |||
|       positions for the parents until reaching a value with the most-significant | ||||
|       bit on. The other bits correspond to the position of the last parent. | ||||
|  | ||||
|   Base Graphs List (ID: {'B', 'A', 'S', 'E'}) [Optional] | ||||
|       This list of H-byte hashes describes a set of B commit-graph files that | ||||
|       form a commit-graph chain. The graph position for the ith commit in this | ||||
|       file's OID Lookup chunk is equal to i plus the number of commits in all | ||||
|       base graphs.  If B is non-zero, this chunk must exist. | ||||
|  | ||||
| TRAILER: | ||||
|  | ||||
| 	H-byte HASH-checksum of all of the above. | ||||
|  |  | |||
|  | @ -127,6 +127,197 @@ Design Details | |||
|   helpful for these clones, anyway. The commit-graph will not be read or | ||||
|   written when shallow commits are present. | ||||
|  | ||||
| Commit Graph Chains | ||||
| ------------------- | ||||
|  | ||||
| Typically, repos grow with near-constant velocity (commits per day). Over time, | ||||
| the number of commits added by a fetch operation is much smaller than the | ||||
| number of commits in the full history. By creating a "chain" of commit-graphs, | ||||
| we enable fast writes of new commit data without rewriting the entire commit | ||||
| history -- at least, most of the time. | ||||
|  | ||||
| ## File Layout | ||||
|  | ||||
| A commit-graph chain uses multiple files, and we use a fixed naming convention | ||||
| to organize these files. Each commit-graph file has a name | ||||
| `$OBJDIR/info/commit-graphs/graph-{hash}.graph` where `{hash}` is the hex- | ||||
| valued hash stored in the footer of that file (which is a hash of the file's | ||||
| contents before that hash). For a chain of commit-graph files, a plain-text | ||||
| file at `$OBJDIR/info/commit-graphs/commit-graph-chain` contains the | ||||
| hashes for the files in order from "lowest" to "highest". | ||||
|  | ||||
| For example, if the `commit-graph-chain` file contains the lines | ||||
|  | ||||
| ``` | ||||
| 	{hash0} | ||||
| 	{hash1} | ||||
| 	{hash2} | ||||
| ``` | ||||
|  | ||||
| then the commit-graph chain looks like the following diagram: | ||||
|  | ||||
|  +-----------------------+ | ||||
|  |  graph-{hash2}.graph  | | ||||
|  +-----------------------+ | ||||
| 	  | | ||||
|  +-----------------------+ | ||||
|  |                       | | ||||
|  |  graph-{hash1}.graph  | | ||||
|  |                       | | ||||
|  +-----------------------+ | ||||
| 	  | | ||||
|  +-----------------------+ | ||||
|  |                       | | ||||
|  |                       | | ||||
|  |                       | | ||||
|  |  graph-{hash0}.graph  | | ||||
|  |                       | | ||||
|  |                       | | ||||
|  |                       | | ||||
|  +-----------------------+ | ||||
|  | ||||
| Let X0 be the number of commits in `graph-{hash0}.graph`, X1 be the number of | ||||
| commits in `graph-{hash1}.graph`, and X2 be the number of commits in | ||||
| `graph-{hash2}.graph`. If a commit appears in position i in `graph-{hash2}.graph`, | ||||
| then we interpret this as being the commit in position (X0 + X1 + i), and that | ||||
| will be used as its "graph position". The commits in `graph-{hash2}.graph` use these | ||||
| positions to refer to their parents, which may be in `graph-{hash1}.graph` or | ||||
| `graph-{hash0}.graph`. We can navigate to an arbitrary commit in position j by checking | ||||
| its containment in the intervals [0, X0), [X0, X0 + X1), [X0 + X1, X0 + X1 + | ||||
| X2). | ||||
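
The interval check described above can be sketched in C. This is a minimal illustration, not git's implementation: `find_layer` and its parameters are hypothetical names, and the layer sizes are assumed to be already known, ordered from the base file to the tip.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of git): given the commit counts of each
 * layer, ordered from the base (index 0) to the tip, find which layer holds
 * global graph position `pos` and the local index inside that layer.
 * Returns the layer index, or -1 if `pos` is past the end of the chain.
 */
int find_layer(const uint32_t *layer_sizes, size_t num_layers,
	       uint32_t pos, uint32_t *local_index)
{
	uint32_t start = 0; /* first graph position covered by layer i */
	size_t i;

	for (i = 0; i < num_layers; i++) {
		/* layer i covers the half-open interval [start, start + size) */
		if (pos - start < layer_sizes[i]) {
			*local_index = pos - start;
			return (int)i;
		}
		start += layer_sizes[i];
	}
	return -1;
}
```

With layer sizes X0 = 100, X1 = 20, X2 = 5, position 122 falls in the third interval, so the sketch reports layer 2 with local index 2 (that is, 122 = X0 + X1 + 2).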
|  | ||||
| Each commit-graph file (except the base, `graph-{hash0}.graph`) contains data | ||||
| specifying the hashes of all files in the lower layers. In the above example, | ||||
| `graph-{hash1}.graph` contains `{hash0}` while `graph-{hash2}.graph` contains | ||||
| `{hash0}` and `{hash1}`. | ||||
|  | ||||
| ## Merging commit-graph files | ||||
|  | ||||
| If we only added a new commit-graph file on every write, we would run into a | ||||
| linear search problem through many commit-graph files.  Instead, we use a merge | ||||
| strategy to decide when the stack should collapse some number of levels. | ||||
|  | ||||
| The diagram below shows such a collapse. As a set of new commits are added, it | ||||
| is determined by the merge strategy that the files should collapse to | ||||
| `graph-{hash1}`. Thus, the new commits, the commits in `graph-{hash2}` and | ||||
| the commits in `graph-{hash1}` should be combined into a new `graph-{hash3}` | ||||
| file. | ||||
|  | ||||
| 			    +---------------------+ | ||||
| 			    |                     | | ||||
| 			    |    (new commits)    | | ||||
| 			    |                     | | ||||
| 			    +---------------------+ | ||||
| 			    |                     | | ||||
|  +-----------------------+  +---------------------+ | ||||
|  |  graph-{hash2} |->|                     | | ||||
|  +-----------------------+  +---------------------+ | ||||
| 	  |                 |                     | | ||||
|  +-----------------------+  +---------------------+ | ||||
|  |                       |  |                     | | ||||
|  |  graph-{hash1} |->|                     | | ||||
|  |                       |  |                     | | ||||
|  +-----------------------+  +---------------------+ | ||||
| 	  |                  tmp_graphXXX | ||||
|  +-----------------------+ | ||||
|  |                       | | ||||
|  |                       | | ||||
|  |                       | | ||||
|  |  graph-{hash0} | | ||||
|  |                       | | ||||
|  |                       | | ||||
|  |                       | | ||||
|  +-----------------------+ | ||||
|  | ||||
| During this process, the commits to write are combined and sorted, and we write the | ||||
| contents to a temporary file, all while holding a `commit-graph-chain.lock` | ||||
| lock-file.  When the file is flushed, we rename it to `graph-{hash3}` | ||||
| according to the computed `{hash3}`. Finally, we write the new chain data to | ||||
| `commit-graph-chain.lock`: | ||||
|  | ||||
| ``` | ||||
| 	{hash3} | ||||
| 	{hash0} | ||||
| ``` | ||||
|  | ||||
| We then close the lock-file. | ||||
|  | ||||
| ## Merge Strategy | ||||
|  | ||||
| When writing a set of commits that do not exist in the commit-graph stack of | ||||
| height N, we default to creating a new file at level N + 1. We then decide to | ||||
| merge with the Nth level if one of two conditions holds: | ||||
|  | ||||
|   1. `--size-multiple=<X>` is specified or X = 2, and the number of commits in | ||||
|      level N is less than X times the number of commits in level N + 1. | ||||
|  | ||||
|   2. `--max-commits=<C>` is specified with non-zero C and the number of commits | ||||
|      in level N + 1 is more than C commits. | ||||
|  | ||||
| This decision cascades down the levels: when we merge a level we create a new | ||||
| set of commits that then compares to the next level. | ||||
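
The cascade can be sketched as a small loop. This is a hedged illustration that follows the two conditions as worded above, not git's actual code: the function name and parameters are hypothetical, and the merged size is approximated as a plain sum, on the assumption that the merged files share no commits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch (not git's implementation) of the merge cascade.
 * `sizes` holds the commit counts of the existing levels from the base
 * (index 0) to the tip; `new_commits` is the size of the would-be new tip.
 * size_multiple is X and max_commits is C (0 disables that condition).
 * On return, *merged_size is the commit count of the file actually written,
 * and the return value is the number of existing levels kept below it.
 */
size_t levels_kept_after_write(const uint32_t *sizes, size_t num_levels,
			       uint32_t new_commits, uint32_t size_multiple,
			       uint32_t max_commits, uint64_t *merged_size)
{
	uint64_t tip = new_commits;
	size_t kept = num_levels;

	while (kept > 0) {
		uint64_t below = sizes[kept - 1];
		/* condition 1: level below is less than X times the tip */
		int too_close = below < (uint64_t)size_multiple * tip;
		/* condition 2: tip exceeds the configured maximum */
		int too_big = max_commits && tip > max_commits;

		if (!too_close && !too_big)
			break;
		tip += below; /* absorb the level below into the new tip */
		kept--;
	}
	*merged_size = tip;
	return kept;
}
```

For example, with existing levels of 100 and 10 commits and X = 2, writing 6 new commits merges only with the 10-commit level (10 < 12, then 100 >= 32), leaving one level below a 16-commit tip. Note that once the second condition triggers, the merged tip only grows, so under this reading it keeps merging down to the base.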
|  | ||||
| The first condition bounds the number of levels to be logarithmic in the total | ||||
| number of commits.  The second condition bounds the total number of commits in | ||||
| a `graph-{hashN}` file and not in the `commit-graph` file, preventing | ||||
| significant performance issues when the stack merges and another process only | ||||
| partially reads the previous stack. | ||||
|  | ||||
| The merge strategy values (2 for the size multiple, 64,000 for the maximum | ||||
| number of commits) could be extracted into config settings for full | ||||
| flexibility. | ||||
|  | ||||
| ## Deleting graph-{hash} files | ||||
|  | ||||
| After a new tip file is written, some `graph-{hash}` files may no longer | ||||
| be part of a chain. It is important to remove these files from disk, eventually. | ||||
| The main reason to delay removal is that another process could read the | ||||
| `commit-graph-chain` file before it is rewritten, but then look for the | ||||
| `graph-{hash}` files after they are deleted. | ||||
|  | ||||
| To allow holding old split commit-graphs for a while after they are unreferenced, | ||||
| we update the modified times of the files when they become unreferenced. Then, | ||||
| we scan the `$OBJDIR/info/commit-graphs/` directory for `graph-{hash}` | ||||
| files whose modified times are older than a given expiry window. This window | ||||
| defaults to zero, but can be changed using command-line arguments or a config | ||||
| setting. | ||||
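
The expiry rule can be sketched as a pure predicate. This is an assumption-laden illustration rather than git's code: `graph_file_is_expired` is a hypothetical name, hashes are compared as plain strings, and "older than" is read as a strict comparison of modified time against the expiry time.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical sketch (not git's implementation): decide whether a
 * graph-{hash}.graph file may be deleted. `chain` lists the hashes that
 * the current commit-graph-chain file references.
 */
int graph_file_is_expired(const char *hash, time_t mtime, time_t expire_time,
			  const char *const *chain, size_t chain_len)
{
	size_t i;

	for (i = 0; i < chain_len; i++)
		if (!strcmp(hash, chain[i]))
			return 0; /* still referenced: never delete */

	/* unreferenced: delete only once older than the expiry time */
	return mtime < expire_time;
}
```

A caller would scan the `commit-graphs` directory, extract `{hash}` from each `graph-{hash}.graph` name, and unlink the files for which this predicate returns non-zero.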
|  | ||||
| ## Chains across multiple object directories | ||||
|  | ||||
| In a repo with alternates, we look for the `commit-graph-chain` file starting | ||||
| in the local object directory and then in each alternate. The first file that | ||||
| exists defines our chain. As we look for the `graph-{hash}` files for | ||||
| each `{hash}` in the chain file, we follow the same pattern for the host | ||||
| directories. | ||||
|  | ||||
| This allows commit-graphs to be split across multiple forks in a fork network. | ||||
| The typical case is a large "base" repo with many smaller forks. | ||||
|  | ||||
| As the base repo advances, it will likely update and merge its commit-graph | ||||
| chain more frequently than the forks. If a fork updates its commit-graph after | ||||
| the base repo, then it should "reparent" the commit-graph chain onto the new | ||||
| chain in the base repo. When reading each `graph-{hash}` file, we track | ||||
| the object directory containing it. During a write of a new commit-graph file, | ||||
| we check for any changes in the source object directory and read the | ||||
| `commit-graph-chain` file for that source and create a new file based on those | ||||
| files. During this "reparent" operation, we necessarily need to collapse all | ||||
| levels in the fork, as all of the files are invalid against the new base file. | ||||
|  | ||||
| It is crucial to be careful when cleaning up "unreferenced" `graph-{hash}.graph` | ||||
| files in this scenario. It falls to the user to define the proper settings for | ||||
| their custom environment: | ||||
|  | ||||
|  1. When merging levels in the base repo, the unreferenced files may still be | ||||
|     referenced by chains from fork repos. | ||||
|  | ||||
|  2. The expiry time should be set to a length of time such that every fork has | ||||
|     time to recompute its commit-graph chain to "reparent" onto the new base | ||||
|     file(s). | ||||
|  | ||||
|  3. If the commit-graph chain is updated in the base, the fork will not have | ||||
|     access to the new chain until its chain is updated to reference those files. | ||||
|     (This may change in the future [5].) | ||||
|  | ||||
| Related Links | ||||
| ------------- | ||||
| [0] https://bugs.chromium.org/p/git/issues/detail?id=8 | ||||
|  | @ -153,3 +344,7 @@ Related Links | |||
|  | ||||
| [4] https://public-inbox.org/git/20180108154822.54829-1-git@jeffhostetler.com/T/#u | ||||
|     A patch to remove the ahead-behind calculation from 'status'. | ||||
|  | ||||
| [5] https://public-inbox.org/git/f27db281-abad-5043-6d71-cbb083b1c877@gmail.com/ | ||||
|     A discussion of a "two-dimensional graph position" that can allow reading | ||||
|     multiple commit-graph chains at the same time. | ||||
|  |  | |||
|  | @ -5,17 +5,18 @@ | |||
| #include "parse-options.h" | ||||
| #include "repository.h" | ||||
| #include "commit-graph.h" | ||||
| #include "object-store.h" | ||||
|  | ||||
| static char const * const builtin_commit_graph_usage[] = { | ||||
| 	N_("git commit-graph [--object-dir <objdir>]"), | ||||
| 	N_("git commit-graph read [--object-dir <objdir>]"), | ||||
| 	N_("git commit-graph verify [--object-dir <objdir>]"), | ||||
| 	N_("git commit-graph write [--object-dir <objdir>] [--append] [--reachable|--stdin-packs|--stdin-commits]"), | ||||
| 	N_("git commit-graph verify [--object-dir <objdir>] [--shallow]"), | ||||
| 	N_("git commit-graph write [--object-dir <objdir>] [--append|--split] [--reachable|--stdin-packs|--stdin-commits] <split options>"), | ||||
| 	NULL | ||||
| }; | ||||
|  | ||||
| static const char * const builtin_commit_graph_verify_usage[] = { | ||||
| 	N_("git commit-graph verify [--object-dir <objdir>]"), | ||||
| 	N_("git commit-graph verify [--object-dir <objdir>] [--shallow]"), | ||||
| 	NULL | ||||
| }; | ||||
|  | ||||
|  | @ -25,7 +26,7 @@ static const char * const builtin_commit_graph_read_usage[] = { | |||
| }; | ||||
|  | ||||
| static const char * const builtin_commit_graph_write_usage[] = { | ||||
| 	N_("git commit-graph write [--object-dir <objdir>] [--append] [--reachable|--stdin-packs|--stdin-commits]"), | ||||
| 	N_("git commit-graph write [--object-dir <objdir>] [--append|--split] [--reachable|--stdin-packs|--stdin-commits] <split options>"), | ||||
| 	NULL | ||||
| }; | ||||
|  | ||||
|  | @ -35,9 +36,10 @@ static struct opts_commit_graph { | |||
| 	int stdin_packs; | ||||
| 	int stdin_commits; | ||||
| 	int append; | ||||
| 	int split; | ||||
| 	int shallow; | ||||
| } opts; | ||||
|  | ||||
|  | ||||
| static int graph_verify(int argc, const char **argv) | ||||
| { | ||||
| 	struct commit_graph *graph = NULL; | ||||
|  | @ -45,11 +47,14 @@ static int graph_verify(int argc, const char **argv) | |||
| 	int open_ok; | ||||
| 	int fd; | ||||
| 	struct stat st; | ||||
| 	int flags = 0; | ||||
|  | ||||
| 	static struct option builtin_commit_graph_verify_options[] = { | ||||
| 		OPT_STRING(0, "object-dir", &opts.obj_dir, | ||||
| 			   N_("dir"), | ||||
| 			   N_("The object directory to store the graph")), | ||||
| 		OPT_BOOL(0, "shallow", &opts.shallow, | ||||
| 			 N_("if the commit-graph is split, only verify the tip file")), | ||||
| 		OPT_END(), | ||||
| 	}; | ||||
|  | ||||
|  | @ -59,21 +64,27 @@ static int graph_verify(int argc, const char **argv) | |||
|  | ||||
| 	if (!opts.obj_dir) | ||||
| 		opts.obj_dir = get_object_directory(); | ||||
| 	if (opts.shallow) | ||||
| 		flags |= COMMIT_GRAPH_VERIFY_SHALLOW; | ||||
|  | ||||
| 	graph_name = get_commit_graph_filename(opts.obj_dir); | ||||
| 	open_ok = open_commit_graph(graph_name, &fd, &st); | ||||
| 	if (!open_ok && errno == ENOENT) | ||||
| 		return 0; | ||||
| 	if (!open_ok) | ||||
| 	if (!open_ok && errno != ENOENT) | ||||
| 		die_errno(_("Could not open commit-graph '%s'"), graph_name); | ||||
| 	graph = load_commit_graph_one_fd_st(fd, &st); | ||||
|  | ||||
| 	FREE_AND_NULL(graph_name); | ||||
|  | ||||
| 	if (open_ok) | ||||
| 		graph = load_commit_graph_one_fd_st(fd, &st); | ||||
| 	else | ||||
| 		graph = read_commit_graph_one(the_repository, opts.obj_dir); | ||||
|  | ||||
| 	/* Return failure if open_ok predicted success */ | ||||
| 	if (!graph) | ||||
| 		return 1; | ||||
| 		return !!open_ok; | ||||
|  | ||||
| 	UNLEAK(graph); | ||||
| 	return verify_commit_graph(the_repository, graph); | ||||
| 	return verify_commit_graph(the_repository, graph, flags); | ||||
| } | ||||
|  | ||||
| static int graph_read(int argc, const char **argv) | ||||
|  | @ -135,6 +146,7 @@ static int graph_read(int argc, const char **argv) | |||
| } | ||||
|  | ||||
| extern int read_replace_refs; | ||||
| static struct split_commit_graph_opts split_opts; | ||||
|  | ||||
| static int graph_write(int argc, const char **argv) | ||||
| { | ||||
|  | @ -156,9 +168,21 @@ static int graph_write(int argc, const char **argv) | |||
| 			N_("start walk at commits listed by stdin")), | ||||
| 		OPT_BOOL(0, "append", &opts.append, | ||||
| 			N_("include all commits already in the commit-graph file")), | ||||
| 		OPT_BOOL(0, "split", &opts.split, | ||||
| 			N_("allow writing an incremental commit-graph file")), | ||||
| 		OPT_INTEGER(0, "max-commits", &split_opts.max_commits, | ||||
| 			N_("maximum number of commits in a non-base split commit-graph")), | ||||
| 		OPT_INTEGER(0, "size-multiple", &split_opts.size_multiple, | ||||
| 			N_("maximum ratio between two levels of a split commit-graph")), | ||||
| 		OPT_EXPIRY_DATE(0, "expire-time", &split_opts.expire_time, | ||||
| 			N_("only expire files older than a given date-time")), | ||||
| 		OPT_END(), | ||||
| 	}; | ||||
|  | ||||
| 	split_opts.size_multiple = 2; | ||||
| 	split_opts.max_commits = 0; | ||||
| 	split_opts.expire_time = 0; | ||||
|  | ||||
| 	argc = parse_options(argc, argv, NULL, | ||||
| 			     builtin_commit_graph_write_options, | ||||
| 			     builtin_commit_graph_write_usage, 0); | ||||
|  | @ -169,11 +193,16 @@ static int graph_write(int argc, const char **argv) | |||
| 		opts.obj_dir = get_object_directory(); | ||||
| 	if (opts.append) | ||||
| 		flags |= COMMIT_GRAPH_APPEND; | ||||
| 	if (opts.split) | ||||
| 		flags |= COMMIT_GRAPH_SPLIT; | ||||
|  | ||||
| 	read_replace_refs = 0; | ||||
|  | ||||
| 	if (opts.reachable) | ||||
| 		return write_commit_graph_reachable(opts.obj_dir, flags); | ||||
| 	if (opts.reachable) { | ||||
| 		if (write_commit_graph_reachable(opts.obj_dir, flags, &split_opts)) | ||||
| 			return 1; | ||||
| 		return 0; | ||||
| 	} | ||||
|  | ||||
| 	string_list_init(&lines, 0); | ||||
| 	if (opts.stdin_packs || opts.stdin_commits) { | ||||
|  | @ -193,7 +222,8 @@ static int graph_write(int argc, const char **argv) | |||
| 	if (write_commit_graph(opts.obj_dir, | ||||
| 			       pack_indexes, | ||||
| 			       commit_hex, | ||||
| 			       flags)) | ||||
| 			       flags, | ||||
| 			       &split_opts)) | ||||
| 		result = 1; | ||||
|  | ||||
| 	UNLEAK(lines); | ||||
|  |  | |||
|  | @ -1687,7 +1687,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix) | |||
| 		      "not exceeded, and then \"git restore --staged :/\" to recover.")); | ||||
|  | ||||
| 	if (git_env_bool(GIT_TEST_COMMIT_GRAPH, 0) && | ||||
| 	    write_commit_graph_reachable(get_object_directory(), 0)) | ||||
| 	    write_commit_graph_reachable(get_object_directory(), 0, NULL)) | ||||
| 		return 1; | ||||
|  | ||||
| 	repo_rerere(the_repository, 0); | ||||
|  |  | |||
|  | @ -687,7 +687,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix) | |||
|  | ||||
| 	if (gc_write_commit_graph && | ||||
| 	    write_commit_graph_reachable(get_object_directory(), | ||||
| 					 !quiet && !daemonized ? COMMIT_GRAPH_PROGRESS : 0)) | ||||
| 					 !quiet && !daemonized ? COMMIT_GRAPH_PROGRESS : 0, | ||||
| 					 NULL)) | ||||
| 		return 1; | ||||
|  | ||||
| 	if (auto_gc && too_many_loose_objects()) | ||||
|  |  | |||
commit-graph.c: file diff suppressed because it is too large
							|  | @ -47,15 +47,21 @@ struct commit_graph { | |||
| 	unsigned char num_chunks; | ||||
| 	uint32_t num_commits; | ||||
| 	struct object_id oid; | ||||
| 	char *filename; | ||||
| 	const char *obj_dir; | ||||
|  | ||||
| 	uint32_t num_commits_in_base; | ||||
| 	struct commit_graph *base_graph; | ||||
|  | ||||
| 	const uint32_t *chunk_oid_fanout; | ||||
| 	const unsigned char *chunk_oid_lookup; | ||||
| 	const unsigned char *chunk_commit_data; | ||||
| 	const unsigned char *chunk_extra_edges; | ||||
| 	const unsigned char *chunk_base_graphs; | ||||
| }; | ||||
|  | ||||
| struct commit_graph *load_commit_graph_one_fd_st(int fd, struct stat *st); | ||||
|  | ||||
| struct commit_graph *read_commit_graph_one(struct repository *r, const char *obj_dir); | ||||
| struct commit_graph *parse_commit_graph(void *graph_map, int fd, | ||||
| 					size_t graph_size); | ||||
|  | ||||
|  | @ -67,6 +73,13 @@ int generation_numbers_enabled(struct repository *r); | |||
|  | ||||
| #define COMMIT_GRAPH_APPEND     (1 << 0) | ||||
| #define COMMIT_GRAPH_PROGRESS   (1 << 1) | ||||
| #define COMMIT_GRAPH_SPLIT      (1 << 2) | ||||
|  | ||||
| struct split_commit_graph_opts { | ||||
| 	int size_multiple; | ||||
| 	int max_commits; | ||||
| 	timestamp_t expire_time; | ||||
| }; | ||||
|  | ||||
| /* | ||||
|  * The write_commit_graph* methods return zero on success | ||||
|  | @ -74,13 +87,17 @@ int generation_numbers_enabled(struct repository *r); | |||
|  * is not compatible with the commit-graph feature, then the | ||||
|  * methods will return 0 without writing a commit-graph. | ||||
|  */ | ||||
| int write_commit_graph_reachable(const char *obj_dir, unsigned int flags); | ||||
| int write_commit_graph_reachable(const char *obj_dir, unsigned int flags, | ||||
| 				 const struct split_commit_graph_opts *split_opts); | ||||
| int write_commit_graph(const char *obj_dir, | ||||
| 		       struct string_list *pack_indexes, | ||||
| 		       struct string_list *commit_hex, | ||||
| 		       unsigned int flags); | ||||
| 		       unsigned int flags, | ||||
| 		       const struct split_commit_graph_opts *split_opts); | ||||
|  | ||||
| int verify_commit_graph(struct repository *r, struct commit_graph *g); | ||||
| #define COMMIT_GRAPH_VERIFY_SHALLOW	(1 << 0) | ||||
|  | ||||
| int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags); | ||||
|  | ||||
| void close_commit_graph(struct raw_object_store *); | ||||
| void free_commit_graph(struct commit_graph *); | ||||
|  |  | |||
|  | @ -20,7 +20,7 @@ test_expect_success 'verify graph with no graph file' ' | |||
| test_expect_success 'write graph with no packs' ' | ||||
| 	cd "$TRASH_DIRECTORY/full" && | ||||
| 	git commit-graph write --object-dir . && | ||||
| 	test_path_is_file info/commit-graph | ||||
| 	test_path_is_missing info/commit-graph | ||||
| ' | ||||
|  | ||||
| test_expect_success 'close with correct error on bad input' ' | ||||
|  |  | |||
|  | @ -0,0 +1,343 @@ | |||
| #!/bin/sh | ||||
|  | ||||
| test_description='split commit graph' | ||||
| . ./test-lib.sh | ||||
|  | ||||
| GIT_TEST_COMMIT_GRAPH=0 | ||||
|  | ||||
| test_expect_success 'setup repo' ' | ||||
| 	git init && | ||||
| 	git config core.commitGraph true && | ||||
| 	infodir=".git/objects/info" && | ||||
| 	graphdir="$infodir/commit-graphs" && | ||||
| 	test_oid_init | ||||
| ' | ||||
|  | ||||
| graph_read_expect() { | ||||
| 	NUM_BASE=0 | ||||
| 	if test ! -z $2 | ||||
| 	then | ||||
| 		NUM_BASE=$2 | ||||
| 	fi | ||||
| 	cat >expect <<- EOF | ||||
| 	header: 43475048 1 1 3 $NUM_BASE | ||||
| 	num_commits: $1 | ||||
| 	chunks: oid_fanout oid_lookup commit_metadata | ||||
| 	EOF | ||||
| 	git commit-graph read >output && | ||||
| 	test_cmp expect output | ||||
| } | ||||
|  | ||||
| test_expect_success 'create commits and write commit-graph' ' | ||||
| 	for i in $(test_seq 3) | ||||
| 	do | ||||
| 		test_commit $i && | ||||
| 		git branch commits/$i || return 1 | ||||
| 	done && | ||||
| 	git commit-graph write --reachable && | ||||
| 	test_path_is_file $infodir/commit-graph && | ||||
| 	graph_read_expect 3 | ||||
| ' | ||||
|  | ||||
| graph_git_two_modes() { | ||||
| 	git -c core.commitGraph=true $1 >output | ||||
| 	git -c core.commitGraph=false $1 >expect | ||||
| 	test_cmp expect output | ||||
| } | ||||
|  | ||||
| graph_git_behavior() { | ||||
| 	MSG=$1 | ||||
| 	BRANCH=$2 | ||||
| 	COMPARE=$3 | ||||
| 	test_expect_success "check normal git operations: $MSG" ' | ||||
| 		graph_git_two_modes "log --oneline $BRANCH" && | ||||
| 		graph_git_two_modes "log --topo-order $BRANCH" && | ||||
| 		graph_git_two_modes "log --graph $COMPARE..$BRANCH" && | ||||
| 		graph_git_two_modes "branch -vv" && | ||||
| 		graph_git_two_modes "merge-base -a $BRANCH $COMPARE" | ||||
| 	' | ||||
| } | ||||
|  | ||||
| graph_git_behavior 'graph exists' commits/3 commits/1 | ||||
|  | ||||
| verify_chain_files_exist() { | ||||
| 	for hash in $(cat $1/commit-graph-chain) | ||||
| 	do | ||||
| 		test_path_is_file $1/graph-$hash.graph || return 1 | ||||
| 	done | ||||
| } | ||||
|  | ||||
| test_expect_success 'add more commits, and write a new base graph' ' | ||||
| 	git reset --hard commits/1 && | ||||
| 	for i in $(test_seq 4 5) | ||||
| 	do | ||||
| 		test_commit $i && | ||||
| 		git branch commits/$i || return 1 | ||||
| 	done && | ||||
| 	git reset --hard commits/2 && | ||||
| 	for i in $(test_seq 6 10) | ||||
| 	do | ||||
| 		test_commit $i && | ||||
| 		git branch commits/$i || return 1 | ||||
| 	done && | ||||
| 	git reset --hard commits/2 && | ||||
| 	git merge commits/4 && | ||||
| 	git branch merge/1 && | ||||
| 	git reset --hard commits/4 && | ||||
| 	git merge commits/6 && | ||||
| 	git branch merge/2 && | ||||
| 	git commit-graph write --reachable && | ||||
| 	graph_read_expect 12 | ||||
| ' | ||||
|  | ||||
| test_expect_success 'fork and fail to base a chain on a commit-graph file' ' | ||||
| 	test_when_finished rm -rf fork && | ||||
| 	git clone . fork && | ||||
| 	( | ||||
| 		cd fork && | ||||
| 		rm .git/objects/info/commit-graph && | ||||
| 		echo "$(pwd)/../.git/objects" >.git/objects/info/alternates && | ||||
| 		test_commit new-commit && | ||||
| 		git commit-graph write --reachable --split && | ||||
| 		test_path_is_file $graphdir/commit-graph-chain && | ||||
| 		test_line_count = 1 $graphdir/commit-graph-chain && | ||||
| 		verify_chain_files_exist $graphdir | ||||
| 	) | ||||
| ' | ||||
|  | ||||
| test_expect_success 'add three more commits, write a tip graph' ' | ||||
| 	git reset --hard commits/3 && | ||||
| 	git merge merge/1 && | ||||
| 	git merge commits/5 && | ||||
| 	git merge merge/2 && | ||||
| 	git branch merge/3 && | ||||
| 	git commit-graph write --reachable --split && | ||||
| 	test_path_is_missing $infodir/commit-graph && | ||||
| 	test_path_is_file $graphdir/commit-graph-chain && | ||||
| 	ls $graphdir/graph-*.graph >graph-files && | ||||
| 	test_line_count = 2 graph-files && | ||||
| 	verify_chain_files_exist $graphdir | ||||
| ' | ||||
|  | ||||
| graph_git_behavior 'split commit-graph: merge 3 vs 2' merge/3 merge/2 | ||||
|  | ||||
| test_expect_success 'add one commit, write a tip graph' ' | ||||
| 	test_commit 11 && | ||||
| 	git branch commits/11 && | ||||
| 	git commit-graph write --reachable --split && | ||||
| 	test_path_is_missing $infodir/commit-graph && | ||||
| 	test_path_is_file $graphdir/commit-graph-chain && | ||||
| 	ls $graphdir/graph-*.graph >graph-files && | ||||
| 	test_line_count = 3 graph-files && | ||||
| 	verify_chain_files_exist $graphdir | ||||
| ' | ||||
|  | ||||
| graph_git_behavior 'three-layer commit-graph: commit 11 vs 6' commits/11 commits/6 | ||||
|  | ||||
| test_expect_success 'add one commit, write a merged graph' ' | ||||
| 	test_commit 12 && | ||||
| 	git branch commits/12 && | ||||
| 	git commit-graph write --reachable --split && | ||||
| 	test_path_is_file $graphdir/commit-graph-chain && | ||||
| 	test_line_count = 2 $graphdir/commit-graph-chain && | ||||
| 	ls $graphdir/graph-*.graph >graph-files && | ||||
| 	test_line_count = 2 graph-files && | ||||
| 	verify_chain_files_exist $graphdir | ||||
| ' | ||||
|  | ||||
| graph_git_behavior 'merged commit-graph: commit 12 vs 6' commits/12 commits/6 | ||||
|  | ||||
| test_expect_success 'create fork and chain across alternate' ' | ||||
| 	git clone . fork && | ||||
| 	( | ||||
| 		cd fork && | ||||
| 		git config core.commitGraph true && | ||||
| 		rm -rf $graphdir && | ||||
| 		echo "$(pwd)/../.git/objects" >.git/objects/info/alternates && | ||||
| 		test_commit 13 && | ||||
| 		git branch commits/13 && | ||||
| 		git commit-graph write --reachable --split && | ||||
| 		test_path_is_file $graphdir/commit-graph-chain && | ||||
| 		test_line_count = 3 $graphdir/commit-graph-chain && | ||||
| 		ls $graphdir/graph-*.graph >graph-files && | ||||
| 		test_line_count = 1 graph-files && | ||||
| 		git -c core.commitGraph=true  rev-list HEAD >expect && | ||||
| 		git -c core.commitGraph=false rev-list HEAD >actual && | ||||
| 		test_cmp expect actual && | ||||
| 		test_commit 14 && | ||||
| 		git commit-graph write --reachable --split --object-dir=.git/objects/ && | ||||
| 		test_line_count = 3 $graphdir/commit-graph-chain && | ||||
| 		ls $graphdir/graph-*.graph >graph-files && | ||||
| 		test_line_count = 1 graph-files | ||||
| 	) | ||||
| ' | ||||
|  | ||||
| graph_git_behavior 'alternate: commit 13 vs 6' commits/13 commits/6 | ||||
|  | ||||
| test_expect_success 'test merge strategy constants' ' | ||||
| 	git clone . merge-2 && | ||||
| 	( | ||||
| 		cd merge-2 && | ||||
| 		git config core.commitGraph true && | ||||
| 		test_line_count = 2 $graphdir/commit-graph-chain && | ||||
| 		test_commit 14 && | ||||
| 		git commit-graph write --reachable --split --size-multiple=2 && | ||||
| 		test_line_count = 3 $graphdir/commit-graph-chain | ||||
|  | ||||
| 	) && | ||||
| 	git clone . merge-10 && | ||||
| 	( | ||||
| 		cd merge-10 && | ||||
| 		git config core.commitGraph true && | ||||
| 		test_line_count = 2 $graphdir/commit-graph-chain && | ||||
| 		test_commit 14 && | ||||
| 		git commit-graph write --reachable --split --size-multiple=10 && | ||||
| 		test_line_count = 1 $graphdir/commit-graph-chain && | ||||
| 		ls $graphdir/graph-*.graph >graph-files && | ||||
| 		test_line_count = 1 graph-files | ||||
| 	) && | ||||
| 	git clone . merge-10-expire && | ||||
| 	( | ||||
| 		cd merge-10-expire && | ||||
| 		git config core.commitGraph true && | ||||
| 		test_line_count = 2 $graphdir/commit-graph-chain && | ||||
| 		test_commit 15 && | ||||
| 		git commit-graph write --reachable --split --size-multiple=10 --expire-time=1980-01-01 && | ||||
| 		test_line_count = 1 $graphdir/commit-graph-chain && | ||||
| 		ls $graphdir/graph-*.graph >graph-files && | ||||
| 		test_line_count = 3 graph-files | ||||
| 	) && | ||||
| 	git clone --no-hardlinks . max-commits && | ||||
| 	( | ||||
| 		cd max-commits && | ||||
| 		git config core.commitGraph true && | ||||
| 		test_line_count = 2 $graphdir/commit-graph-chain && | ||||
| 		test_commit 16 && | ||||
| 		test_commit 17 && | ||||
| 		git commit-graph write --reachable --split --max-commits=1 && | ||||
| 		test_line_count = 1 $graphdir/commit-graph-chain && | ||||
| 		ls $graphdir/graph-*.graph >graph-files && | ||||
| 		test_line_count = 1 graph-files | ||||
| 	) | ||||
| ' | ||||
|  | ||||
| test_expect_success 'remove commit-graph-chain file after flattening' ' | ||||
| 	git clone . flatten && | ||||
| 	( | ||||
| 		cd flatten && | ||||
| 		test_line_count = 2 $graphdir/commit-graph-chain && | ||||
| 		git commit-graph write --reachable && | ||||
| 		test_path_is_missing $graphdir/commit-graph-chain && | ||||
| 		ls $graphdir >graph-files && | ||||
| 		test_line_count = 0 graph-files | ||||
| 	) | ||||
| ' | ||||
|  | ||||
| corrupt_file() { | ||||
| 	file=$1 | ||||
| 	pos=$2 | ||||
| 	data="${3:-\0}" | ||||
| 	chmod a+w "$file" && | ||||
| 	printf "$data" | dd of="$file" bs=1 seek="$pos" conv=notrunc | ||||
| } | ||||
|  | ||||
| test_expect_success 'verify hashes along chain, even in shallow' ' | ||||
| 	git clone --no-hardlinks . verify && | ||||
| 	( | ||||
| 		cd verify && | ||||
| 		git commit-graph verify && | ||||
| 		base_file=$graphdir/graph-$(head -n 1 $graphdir/commit-graph-chain).graph && | ||||
| 		corrupt_file "$base_file" 1760 "\01" && | ||||
| 		test_must_fail git commit-graph verify --shallow 2>test_err && | ||||
| 		grep -v "^+" test_err >err && | ||||
| 		test_i18ngrep "incorrect checksum" err | ||||
| 	) | ||||
| ' | ||||
|  | ||||
| test_expect_success 'verify --shallow does not check base contents' ' | ||||
| 	git clone --no-hardlinks . verify-shallow && | ||||
| 	( | ||||
| 		cd verify-shallow && | ||||
| 		git commit-graph verify && | ||||
| 		base_file=$graphdir/graph-$(head -n 1 $graphdir/commit-graph-chain).graph && | ||||
| 		corrupt_file "$base_file" 1000 "\01" && | ||||
| 		git commit-graph verify --shallow && | ||||
| 		test_must_fail git commit-graph verify 2>test_err && | ||||
| 		grep -v "^+" test_err >err && | ||||
| 		test_i18ngrep "incorrect checksum" err | ||||
| 	) | ||||
| ' | ||||
|  | ||||
| test_expect_success 'warn on base graph chunk incorrect' ' | ||||
| 	git clone --no-hardlinks . base-chunk && | ||||
| 	( | ||||
| 		cd base-chunk && | ||||
| 		git commit-graph verify && | ||||
| 		base_file=$graphdir/graph-$(tail -n 1 $graphdir/commit-graph-chain).graph && | ||||
| 		corrupt_file "$base_file" 1376 "\01" && | ||||
| 		git commit-graph verify --shallow 2>test_err && | ||||
| 		grep -v "^+" test_err >err && | ||||
| 		test_i18ngrep "commit-graph chain does not match" err | ||||
| 	) | ||||
| ' | ||||
|  | ||||
| test_expect_success 'verify after commit-graph-chain corruption' ' | ||||
| 	git clone --no-hardlinks . verify-chain && | ||||
| 	( | ||||
| 		cd verify-chain && | ||||
| 		corrupt_file "$graphdir/commit-graph-chain" 60 "G" && | ||||
| 		git commit-graph verify 2>test_err && | ||||
| 		grep -v "^+" test_err >err && | ||||
| 		test_i18ngrep "invalid commit-graph chain" err && | ||||
| 		corrupt_file "$graphdir/commit-graph-chain" 60 "A" && | ||||
| 		git commit-graph verify 2>test_err && | ||||
| 		grep -v "^+" test_err >err && | ||||
| 		test_i18ngrep "unable to find all commit-graph files" err | ||||
| 	) | ||||
| ' | ||||
|  | ||||
| test_expect_success 'verify across alternates' ' | ||||
| 	git clone --no-hardlinks . verify-alt && | ||||
| 	( | ||||
| 		cd verify-alt && | ||||
| 		rm -rf $graphdir && | ||||
| 		altdir="$(pwd)/../.git/objects" && | ||||
| 		echo "$altdir" >.git/objects/info/alternates && | ||||
| 		git commit-graph verify --object-dir="$altdir/" && | ||||
| 		test_commit extra && | ||||
| 		git commit-graph write --reachable --split && | ||||
| 		tip_file=$graphdir/graph-$(tail -n 1 $graphdir/commit-graph-chain).graph && | ||||
| 		corrupt_file "$tip_file" 100 "\01" && | ||||
| 		test_must_fail git commit-graph verify --shallow 2>test_err && | ||||
| 		grep -v "^+" test_err >err && | ||||
| 		test_i18ngrep "commit-graph has incorrect fanout value" err | ||||
| 	) | ||||
| ' | ||||
|  | ||||
| test_expect_success 'add octopus merge' ' | ||||
| 	git reset --hard commits/10 && | ||||
| 	git merge commits/3 commits/4 && | ||||
| 	git branch merge/octopus && | ||||
| 	git commit-graph write --reachable --split && | ||||
| 	git commit-graph verify && | ||||
| 	test_line_count = 3 $graphdir/commit-graph-chain | ||||
| ' | ||||
|  | ||||
| graph_git_behavior 'graph exists' merge/octopus commits/12 | ||||
|  | ||||
| test_expect_success 'split across alternate where alternate is not split' ' | ||||
| 	git commit-graph write --reachable && | ||||
| 	test_path_is_file .git/objects/info/commit-graph && | ||||
| 	cp .git/objects/info/commit-graph . && | ||||
| 	git clone --no-hardlinks . alt-split && | ||||
| 	( | ||||
| 		cd alt-split && | ||||
| 		echo "$(pwd)"/../.git/objects >.git/objects/info/alternates && | ||||
| 		test_commit 18 && | ||||
| 		git commit-graph write --reachable --split && | ||||
| 		test_line_count = 1 $graphdir/commit-graph-chain | ||||
| 	) && | ||||
| 	test_cmp commit-graph .git/objects/info/commit-graph | ||||
| ' | ||||
|  | ||||
| test_done | ||||
	 Junio C Hamano