From a5411df0d9fb659321c6ce001c1f8d6e08426b8b Mon Sep 17 00:00:00 2001 From: Jeff King Date: Fri, 11 May 2018 14:00:55 -0400 Subject: [PATCH 1/4] mark_tree_contents_uninteresting(): drop missing object check It's generally acceptable for UNINTERESTING objects in a traversal to be unavailable (e.g., see aeeae1b771). When marking trees UNINTERESTING, we access the object database twice: once to check if the object is missing (and return quietly if it is), and then again to actually parse it. We can instead just try to parse; if that fails, we can then return quietly. That halves the effort we spend on locating the object. Note that this isn't _exactly_ the same as the original behavior, as the parse failure could be due to other problems than a missing object: it could be corrupted, in which case the original code would have died. But the new behavior is arguably better, as it covers the object being unavailable for any reason. We'll also still issue a warning to stderr in such a case. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- revision.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/revision.c b/revision.c index b42c836d7a..9de92bb5e5 100644 --- a/revision.c +++ b/revision.c @@ -51,12 +51,9 @@ static void mark_tree_contents_uninteresting(struct tree *tree) { struct tree_desc desc; struct name_entry entry; - struct object *obj = &tree->object; - if (!has_object_file(&obj->oid)) + if (parse_tree_gently(tree, 1) < 0) return; - if (parse_tree(tree) < 0) - die("bad tree %s", oid_to_hex(&obj->oid)); init_tree_desc(&desc, tree->buffer, tree->size); while (tree_entry(&desc, &entry)) { From 577dd0d29bf7cbbdf1f00c20a90caac50b1c603f Mon Sep 17 00:00:00 2001 From: Jeff King Date: Fri, 11 May 2018 14:01:59 -0400 Subject: [PATCH 2/4] mark_parents_uninteresting(): drop missing object check We allow UNINTERESTING objects in a traversal to be unavailable. As part of this, mark_parents_uninteresting() checks whether we have a particular uninteresting parent; if not, we will mark it as "parsed" so that later code skips it. This code is redundant and even a little bit harmful, so let's drop it. It's redundant because when our parse_object() call in add_parents_to_list() fails, we already quietly skip UNINTERESTING parents. This redundancy is a historical artifact. The mark_parents_uninteresting() protection is from 454fbbcde3 (git-rev-list: allow missing objects when the parent is marked UNINTERESTING, 2005-07-10). Much later, aeeae1b771 (revision traversal: allow UNINTERESTING objects to be missing, 2009-01-27) covered more cases by making the actual parse more gentle. As an aside, even if this weren't redundant, it would be insufficient. The gentle parsing handles both missing and corrupted objects, whereas the has_object_file() check we're getting rid of covers only missing ones. And the code we're dropping is harmful for two reasons: 1. We spend extra time on the object lookup, even though we don't actually need the information at this point (and will just repeat that lookup later when we parse for the common case that we _do_ have the object). 2. It "lies" about the commit by setting the parsed flag, even though we didn't load any useful data into the struct. This shouldn't matter for the UNINTERESTING case, but we may later clear our flags and do another traversal in the same process. While pretty unlikely, it's possible that we could then look at the same commit without the UNINTERESTING flag, in which case we'd produce the wrong result (we'd think it's a commit with no parents, when in fact we should probably die due to the missing object). Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- revision.c | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/revision.c b/revision.c index 9de92bb5e5..e1fa9b13b9 100644 --- a/revision.c +++ b/revision.c @@ -102,18 +102,6 @@ void mark_parents_uninteresting(struct commit *commit) struct commit *commit = pop_commit(&parents); while (commit) { - /* - * A missing commit is ok iff its parent is marked - * uninteresting. - * - * We just mark such a thing parsed, so that when - * it is popped next time around, we won't be trying - * to parse it and get an error. - */ - if (!commit->object.parsed && - !has_object_file(&commit->object.oid)) - commit->object.parsed = 1; - if (commit->object.flags & UNINTERESTING) break; From 43fc643b75af91363eb1246528c706c1654ddc2e Mon Sep 17 00:00:00 2001 From: Jeff King Date: Fri, 11 May 2018 14:02:27 -0400 Subject: [PATCH 3/4] mark_parents_uninteresting(): replace list with stack The mark_parents_uninteresting() function uses two loops: an outer one to process our queue of pending parents, and an inner one to process first-parent chains. This is a clever optimization from 941ba8db57 (Eliminate recursion in setting/clearing marks in commit list, 2012-01-14) to limit the number of linked-list allocations when following single-parent chains. Unfortunately, this makes the result a little hard to read. Let's replace the list with a stack. Then we don't have to worry about doing this double-loop optimization, as we'll just reuse the top element of the stack as we pop/push. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- revision.c | 67 +++++++++++++++++++++++++++++++++++------------------- 1 file changed, 43 insertions(+), 24 deletions(-) diff --git a/revision.c b/revision.c index e1fa9b13b9..125327062f 100644 --- a/revision.c +++ b/revision.c @@ -91,38 +91,57 @@ void mark_tree_uninteresting(struct tree *tree) mark_tree_contents_uninteresting(tree); } -void mark_parents_uninteresting(struct commit *commit) +struct commit_stack { + struct commit **items; + size_t nr, alloc; +}; +#define COMMIT_STACK_INIT { NULL, 0, 0 } + +static void commit_stack_push(struct commit_stack *stack, struct commit *commit) { - struct commit_list *parents = NULL, *l; + ALLOC_GROW(stack->items, stack->nr + 1, stack->alloc); + stack->items[stack->nr++] = commit; +} - for (l = commit->parents; l; l = l->next) - commit_list_insert(l->item, &parents); +static struct commit *commit_stack_pop(struct commit_stack *stack) +{ + return stack->nr ? stack->items[--stack->nr] : NULL; +} - while (parents) { - struct commit *commit = pop_commit(&parents); +static void commit_stack_clear(struct commit_stack *stack) +{ + FREE_AND_NULL(stack->items); + stack->nr = stack->alloc = 0; +} - while (commit) { - if (commit->object.flags & UNINTERESTING) - break; +void mark_parents_uninteresting(struct commit *commit) +{ + struct commit_stack pending = COMMIT_STACK_INIT; + struct commit_list *l; - commit->object.flags |= UNINTERESTING; + for (l = commit->parents; l; l = l->next) + commit_stack_push(&pending, l->item); - /* - * Normally we haven't parsed the parent - * yet, so we won't have a parent of a parent - * here. However, it may turn out that we've - * reached this commit some other way (where it - * wasn't uninteresting), in which case we need - * to mark its parents recursively too.. - */ - if (!commit->parents) - break; + while (pending.nr > 0) { + struct commit *commit = commit_stack_pop(&pending); - for (l = commit->parents->next; l; l = l->next) - commit_list_insert(l->item, &parents); - commit = commit->parents->item; - } + if (commit->object.flags & UNINTERESTING) + return; + commit->object.flags |= UNINTERESTING; + + /* + * Normally we haven't parsed the parent + * yet, so we won't have a parent of a parent + * here. However, it may turn out that we've + * reached this commit some other way (where it + * wasn't uninteresting), in which case we need + * to mark its parents recursively too.. + */ + for (l = commit->parents; l; l = l->next) + commit_stack_push(&pending, l->item); } + + commit_stack_clear(&pending); } static void add_pending_object_with_path(struct rev_info *revs, From 8702b30fd7b6c804f1bb59877a4c2f773fbe00f8 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Fri, 11 May 2018 14:03:15 -0400 Subject: [PATCH 4/4] mark_parents_uninteresting(): avoid most allocation Commit 941ba8db57 (Eliminate recursion in setting/clearing marks in commit list, 2012-01-14) used a clever double-loop to avoid allocations for single-parent chains of history. However, it did so only when following parents of parents (which was an uncommon case), and _always_ incurred at least one allocation to populate the list of pending parents in the first place. We can turn this into zero-allocation in the common case by iterating directly over the initial parent list, and then following up on any pending items we might have discovered. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- revision.c | 44 +++++++++++++++++++++++++------------------- 1 file changed, 25 insertions(+), 19 deletions(-) diff --git a/revision.c b/revision.c index 125327062f..c21acb2cb8 100644 --- a/revision.c +++ b/revision.c @@ -114,32 +114,38 @@ static void commit_stack_clear(struct commit_stack *stack) stack->nr = stack->alloc = 0; } -void mark_parents_uninteresting(struct commit *commit) +static void mark_one_parent_uninteresting(struct commit *commit, + struct commit_stack *pending) { - struct commit_stack pending = COMMIT_STACK_INIT; struct commit_list *l; + if (commit->object.flags & UNINTERESTING) + return; + commit->object.flags |= UNINTERESTING; + + /* + * Normally we haven't parsed the parent + * yet, so we won't have a parent of a parent + * here. However, it may turn out that we've + * reached this commit some other way (where it + * wasn't uninteresting), in which case we need + * to mark its parents recursively too.. + */ for (l = commit->parents; l; l = l->next) - commit_stack_push(&pending, l->item); + commit_stack_push(pending, l->item); +} - while (pending.nr > 0) { - struct commit *commit = commit_stack_pop(&pending); +void mark_parents_uninteresting(struct commit *commit) +{ + struct commit_stack pending = COMMIT_STACK_INIT; + struct commit_list *l; - if (commit->object.flags & UNINTERESTING) - return; - commit->object.flags |= UNINTERESTING; + for (l = commit->parents; l; l = l->next) + mark_one_parent_uninteresting(l->item, &pending); - /* - * Normally we haven't parsed the parent - * yet, so we won't have a parent of a parent - * here. However, it may turn out that we've - * reached this commit some other way (where it - * wasn't uninteresting), in which case we need - * to mark its parents recursively too.. - */ - for (l = commit->parents; l; l = l->next) - commit_stack_push(&pending, l->item); - } + while (pending.nr > 0) + mark_one_parent_uninteresting(commit_stack_pop(&pending), + &pending); commit_stack_clear(&pending); }