When we originally did the series that contains 7ba826290a
(revision: add rev_input_given flag, 2017-08-02) the intent
was that "git rev-list --stdin </dev/null" would similarly
become a successful noop. However, an attempt at the time to
do that did not work[1]. The problem is that rev_input_given
serves two roles:
- it tells rev-list.c that it should not error out
- it tells revision.c that it should not have the "default"
ref kick (e.g., "HEAD" in "git log")
We want to trigger the former, but not the latter. This is
technically possible with a single flag, if we set the flag
only after revision.c's revs->def check. But this introduces
a rather subtle ordering dependency.
Instead, let's keep two flags: one to denote when we got
actual input (which triggers both roles) and one for when we
read stdin (which triggers only the first).
This does mean a caller interested in the first role has to
check both flags, but there's only one such caller. And any
future callers might want to make the distinction anyway
(e.g., if they care less about erroring out, and more about
whether revision.c soaked up our stdin).
In fact, we already keep such a flag internally in
revision.c for this purpose, so this is really just exposing
that to the caller (and the old function-local flag can go
away in favor of our new one).
[1] https://public-inbox.org/git/20170802223416.gwiezhbuxbdmbjzx@sigill.intra.peff.net/
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I looped over the toplevel header files, creating a temporary two-line C
program for each consisting of
#include "git-compat-util.h"
#include $HEADER
This patch is the result of manually fixing errors in compiling those
tiny programs.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --range-diff option announces the embedded range-diff generically
as "Range-diff:", however, we can do better when --reroll-count is
specified by emitting "Range-diff against v{n}:" instead.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When submitting a revised version of a patch series, it can be helpful
(to reviewers) to include a summary of changes since the previous
attempt in the form of a range-diff, however, doing so involves manually
copy/pasting the diff into the cover letter.
Add a --range-diff option to automate this process. The argument to
--range-diff specifies the tip of the previous attempt against which to
generate the range-diff. For example:
git format-patch --cover-letter --range-diff=v1 -3 v2
(At this stage, the previous attempt and the patch series being
formatted must share a common base, however, a subsequent enhancement
will make it possible to specify an explicit revision range for the
previous attempt.)
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --interdiff option introduces the embedded interdiff generically as
"Interdiff:", however, we can do better when --reroll-count is specified
by emitting "Interdiff against v{n}:" instead.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When submitting a revised version of a patch series, it can be helpful
(to reviewers) to include a summary of changes since the previous
attempt in the form of an interdiff, however, doing so involves manually
copy/pasting the diff into the cover letter.
Add an --interdiff option to automate this process. The argument to
--interdiff specifies the tip of the previous attempt against which to
generate the interdiff. For example:
git format-patch --cover-letter --interdiff=v1 -3 v2
The previous attempt and the patch series being formatted must share a
common base.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A filter line in a request to upload-pack filters out objects regardless
of whether they are directly referenced by a "want" line or not. This
means that cloning with "--filter=blob:none" (or another filter that
excludes blobs) from a repository with at least one ref pointing to a
blob (for example, the Git repository itself) results in output like the
following:
error: missing object referenced by 'refs/tags/junio-gpg-pub'
and if that particular blob is not referenced by a fetched tree, the
resulting clone fails fsck because there is no object from the remote to
vouch that the missing object is a promisor object.
Update both the protocol and the upload-pack implementation to include
all explicitly specified "want" objects in the packfile regardless of
the filter specification.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of relying on commit->util to store the source string, let the
user provide a commit-slab to store the source strings in.
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was an undocumented debugging aid that does not seem to
have come in handy in the past decade, judging from its lack
of mentions on the mailing list.
Let's drop it in the name of simplicity. This is morally a
revert of 3131b71301 (Add "--show-all" revision walker flag
for debugging, 2008-02-09), but note that I did leave in the
mapping of UNINTERESTING to "^" in get_revision_mark(). I
don't think this would be possible to trigger with the
current code, but it's the only sensible marker.
We'll skip the usual deprecation period because this was
explicitly a debugging aid that was never documented.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create header for pretty.c to make formatting interface more structured.
This is a middle point, this file would be merged further with other
files which contain formatting stuff.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach rev-list to support termination of an object traversal at any
object from a promisor remote (whether one that the local repo also has,
or one that the local repo knows about because it has another promisor
object that references it).
This will be used subsequently in gc and in the connectivity check used
by fetch.
For efficiency, if an object is referenced by a promisor object, and is
in the local repo only as a non-promisor object, object traversal will
not stop there. This is to avoid building the list of promisor object
references.
(In list-objects.c, the case where obj is NULL in process_blob() and
process_tree() do not need to be changed because those happen only when
there is a conflict between the expected type and the existing object.
If the object doesn't exist, an object will be synthesized, which is
fine.)
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The functionality to list tree objects in the order they were seen
while traversing the commits will be used in one of the next commits,
where we teach `git describe` to describe not only commits, but blobs, too.
The change in list-objects.c is rather minimal as we'll be re-using
the infrastructure put in place of the revision walking machinery. For
example one could expect that add_pending_tree is not called, but rather
commit->tree is directly passed to the tree traversal function. This
however requires a lot more code than just emptying the queue containing
trees after each commit.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Setting `leak_pending = 1` tells `prepare_revision_walk()` not to
release the `pending` array, and makes that the caller's responsibility.
See 4a43d374f (revision: add leak_pending flag, 2011-10-01) and
353f5657a (bisect: use leak_pending flag, 2011-10-01).
Commit 1da1e07c8 (clean up name allocation in prepare_revision_walk,
2014-10-15) fixed a memory leak in `prepare_revision_walk()` by
switching from `free()` to `object_array_clear()`. However, where we use
the `leak_pending`-mechanism, we're still only calling `free()`.
Use `object_array_clear()` instead. Copy some helpful comments from
353f5657a to the other callers that we update to clarify the memory
responsibilities, and to highlight that the commits are not affected
when we clear the array -- it is indeed correct to both tidy up the
commit flags and clear the object array.
Document `leak_pending` in revision.h to help future users get this
right.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The revision walker can walk through per-worktree refs like HEAD or
SHA-1 references in the index. These currently are from the current
worktree only. This new flag is added to change rev-list behavior in
this regard:
When single_worktree is set, only current worktree is considered. When
it is not set (which is the default), all worktrees are considered.
The default is chosen so because the two big components that rev-list
works with are object database (entirely shared between worktrees) and
refs (mostly shared). It makes sense that default behavior goes per-repo
too instead of per-worktree.
The flag will eventually be exposed as a rev-list argument with
documents. For now it stays internal until the new behavior is fully
implemented.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Normally a caller that invokes setup_revisions() has to
check rev.pending to see if anything was actually queued for
the traversal. But they can't tell the difference between
two cases:
1. The user gave us no tip from which to start a
traversal.
2. The user tried to give us tips via --glob, --all, etc,
but their patterns ended up being empty.
Let's set a flag in the rev_info struct that callers can use
to tell the difference. We can set this from the
init_all_refs_cb() function. That's a little funny because
it's not exactly about initializing the "cb" struct itself.
But that function is the common setup place for doing
pattern traversals that is used by --glob, --all, etc.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rev_info.early_output started out as an unsigned int in cdcefbc97 (Add
"--early-output" log flag for interactive GUI use, 2007-11-03), but
later it was turned into a single bit in a bit field in cc243c3ce
(show: --ignore-missing, 2011-05-18) without explanation, though the
code using it still expects it to be a regular integer type and uses
it as a counter. Consequently, any even number given via
'--early-output=<N>', or indeed a plain '--early-output' defaulting to
100 effectively disabled the feature.
Turn rev_info.early_output back into its origin unsigned int data
type, making '--early-output' work again.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename this function and convert it to take a pointer to struct
object_id.
This is a prerequisite for converting get_reference, which is needed to
convert parse_object.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).
So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.
By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.
As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The definition of path_name() was removed by 2824e1841 (list-objects:
pass full pathname to callbacks); remove its declaration as well.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log --pretty={medium,full,fuller}" and "git log" by default
prepend 4 spaces to the log message, so it makes sense to enable
the new "expand-tabs" facility by default for these formats.
Add --no-expand-tabs option to override the new default.
The change alone breaks a test in t4201 that runs "git shortlog"
on the output from "git log", and expects that the output from
"git log" does not do such a tab expansion. Adjust the test to
explicitly disable expand-tabs with --no-expand-tabs.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A commit log message sometimes tries to line things up using tabs,
assuming fixed-width font with the standard 8-place tab settings.
Viewing such a commit however does not work well in "git log", as
we indent the lines by prefixing 4 spaces in front of them.
This should all line up:
Column 1 Column 2
-------- --------
A B
ABCD EFGH
SPACES Instead of Tabs
Even with multi-byte UTF8 characters:
Column 1 Column 2
-------- --------
Ä B
åäö 100
A Møøse once bit my sister..
Tab-expand the lines in "git log --expand-tabs" output before
prefixing 4 spaces.
This is based on the patch by Linus Torvalds, but at this step, we
require an explicit command line option to enable the behaviour.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we find a blob at "a/b/c", we currently pass this to
our show_object_fn callbacks as two components: "a/b/" and
"c". Callbacks which want the full value then call
path_name(), which concatenates the two. But this is an
inefficient interface; the path is a strbuf, and we could
simply append "c" to it temporarily, then roll back the
length, without creating a new copy.
So we could improve this by teaching the callsites of
path_name() this trick (and there are only 3). But we can
also notice that no callback actually cares about the
broken-down representation, and simply pass each callback
the full path "a/b/c" as a string. The callback code becomes
even simpler, then, as we do not have to worry about freeing
an allocated buffer, nor rolling back our modification to
the strbuf.
This is theoretically less efficient, as some callbacks
would not bother to format the final path component. But in
practice this is not measurable. Since we use the same
strbuf over and over, our work to grow it is amortized, and
we really only pay to memcpy a few bytes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, we left name_path as a thin wrapper
around a strbuf. This patch drops it entirely. As a result,
every show_object_fn callback needs to be adjusted. However,
none of their code needs to be changed at all, because the
only use was to pass it to path_name(), which now handles
the bare strbuf.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "struct name_path" data is examined in only two places:
we generate it in process_tree(), and we convert it to a
single string in path_name(). Everyone else just passes it
through to those functions.
We can further note that process_tree() already keeps a
single strbuf with the leading tree path, for use with
tree_entry_interesting().
Instead of building a separate name_path linked list, let's
just use the one we already build in "base". This reduces
the amount of code (especially tricky code in path_name()
which did not check for integer overflows caused by deep
or large pathnames).
It is also more efficient in some instances. Any time we
were using tree_entry_interesting, we were building up the
strbuf anyway, so this is an immediate and obvious win
there. In cases where we were not, we trade off storing
"pathname/" in a strbuf on the heap for each level of the
path, instead of two pointers and an int on the stack (with
one pointer into the tree object). On a 64-bit system, the
latter is 20 bytes; so if path components are less than that
on average, this has lower peak memory usage. In practice
it probably doesn't matter either way; we are already
holding in memory all of the tree objects leading up to each
pathname, and for normal-depth pathnames, we are only
talking about hundreds of bytes.
This patch leaves "struct name_path" as a thin wrapper
around the strbuf, to avoid disrupting callbacks. We should
fix them, but leaving it out makes this diff easier to view.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we find a blob at "a/b/c", we currently pass this to
our show_object_fn callbacks as two components: "a/b/" and
"c". Callbacks which want the full value then call
path_name(), which concatenates the two. But this is an
inefficient interface; the path is a strbuf, and we could
simply append "c" to it temporarily, then roll back the
length, without creating a new copy.
So we could improve this by teaching the callsites of
path_name() this trick (and there are only 3). But we can
also notice that no callback actually cares about the
broken-down representation, and simply pass each callback
the full path "a/b/c" as a string. The callback code becomes
even simpler, then, as we do not have to worry about freeing
an allocated buffer, nor rolling back our modification to
the strbuf.
This is theoretically less efficient, as some callbacks
would not bother to format the final path component. But in
practice this is not measurable. Since we use the same
strbuf over and over, our work to grow it is amortized, and
we really only pay to memcpy a few bytes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, we left name_path as a thin wrapper
around a strbuf. This patch drops it entirely. As a result,
every show_object_fn callback needs to be adjusted. However,
none of their code needs to be changed at all, because the
only use was to pass it to path_name(), which now handles
the bare strbuf.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "struct name_path" data is examined in only two places:
we generate it in process_tree(), and we convert it to a
single string in path_name(). Everyone else just passes it
through to those functions.
We can further note that process_tree() already keeps a
single strbuf with the leading tree path, for use with
tree_entry_interesting().
Instead of building a separate name_path linked list, let's
just use the one we already build in "base". This reduces
the amount of code (especially tricky code in path_name()
which did not check for integer overflows caused by deep
or large pathnames).
It is also more efficient in some instances. Any time we
were using tree_entry_interesting, we were building up the
strbuf anyway, so this is an immediate and obvious win
there. In cases where we were not, we trade off storing
"pathname/" in a strbuf on the heap for each level of the
path, instead of two pointers and an int on the stack (with
one pointer into the tree object). On a 64-bit system, the
latter is 20 bytes; so if path components are less than that
on average, this has lower peak memory usage. In practice
it probably doesn't matter either way; we are already
holding in memory all of the tree objects leading up to each
pathname, and for normal-depth pathnames, we are only
talking about hundreds of bytes.
This patch leaves "struct name_path" as a thin wrapper
around the strbuf, to avoid disrupting callbacks. We should
fix them, but leaving it out makes this diff easier to view.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Oftentimes, patches created by git format-patch will be stored in
version control or compared with diff. In these cases, two otherwise
identical patches can have different commit hashes, leading to diff
noise. Teach git format-patch a --zero-commit option that instead
produces an all-zero hash to avoid this diff noise.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.
Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor. However, the tricky case is where we use the
enum labels as constants, like:
show_date(t, tz, DATE_NORMAL);
Ideally we could say:
show_date(t, tz, &{ DATE_NORMAL });
but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:
1. Manually add a "struct date_mode d = { DATE_NORMAL }"
definition to each caller, and pass "&d". This makes
the callers uglier, because they sometimes do not even
have their own scope (e.g., they are inside a switch
statement).
2. Provide a pre-made global "date_normal" struct that can
be passed by address. We'd also need "date_rfc2822",
"date_iso8601", and so forth. But at least the ugliness
is defined in one place.
3. Provide a wrapper that generates the correct struct on
the fly. The big downside is that we end up pointing to
a single global, which makes our wrapper non-reentrant.
But show_date is already not reentrant, so it does not
matter.
This patch implements 3, along with a minor macro to keep
the size of the callers sane.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log --grep=<string>" shows only commits with messages that
match the given string, but sometimes it is useful to be able to
show only commits that do *not* have certain messages (e.g. "show
me ones that are not FIXUP commits").
Originally, we had the invert-grep flag in grep_opt, but because
"git grep --invert-grep" does not make sense except in conjunction
with "--files-with-matches", which is already covered by
"--files-without-matches", it was moved it to revisions structure.
To have the flag there expresses the function to the feature better.
When the newly inserted two tests run, the history would have commits
with messages "initial", "second", "third", "fourth", "fifth", "sixth"
and "Second", committed in this order. The commits that does not match
either "th" or "Sec" is "second" and "initial". For the case insensitive
case only "initial" matches.
Signed-off-by: Christoph Junghans <ottxor@gentoo.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit fbd4a70 (list-objects: mark more commits as edges in
mark_edges_uninteresting - 2013-08-16), we marked an increasing number
of edges uninteresting. This change, and the subsequent change to make
this conditional on --objects-edge, are used by --thin to make much
smaller packs for shallow clones.
Unfortunately, they cause a significant performance regression when
pushing non-shallow clones with lots of refs (23.322 seconds vs.
4.785 seconds with 22400 refs). Add an option to git rev-list,
--objects-edge-aggressive, that preserves this more aggressive behavior,
while leaving --objects-edge to provide more performant behavior.
Preserve the current behavior for the moment by using the aggressive
option.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is currently no easy way to ask the revision traversal
machinery to include objects reachable from the index (e.g.,
blobs and trees that have not yet been committed). This
patch adds an option to do so.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to add all reflog entries as tips for finding
reachable objects. The revision machinery can already do
this (to support "rev-list --reflog"); we can reuse that
code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pack-objects is computing the reachability bitmap to
serve a fetch request, it can erroneously die() if some of
the UNINTERESTING objects are not present. Upload-pack
throws away HAVE lines from the client for objects we do not
have, but we may have a tip object without all of its
ancestors (e.g., if the tip is no longer reachable and was
new enough to survive a `git prune`, but some of its
reachable objects did get pruned).
In the non-bitmap case, we do a revision walk with the HAVE
objects marked as UNINTERESTING. The revision walker
explicitly ignores errors in accessing UNINTERESTING commits
to handle this case (and we do not bother looking at
UNINTERESTING trees or blobs at all).
When we have bitmaps, however, the process is quite
different. The bitmap index for a pack-objects run is
calculated in two separate steps:
First, we perform an extensive walk from all the HAVEs to
find the full set of objects reachable from them. This walk
is usually optimized away because we are expected to hit an
object with a bitmap during the traversal, which allows us
to terminate early.
Secondly, we perform an extensive walk from all the WANTs,
which usually also terminates early because we hit a commit
with an existing bitmap.
Once we have the resulting bitmaps from the two walks, we
AND-NOT them together to obtain the resulting set of objects
we need to pack.
When we are walking the HAVE objects, the revision walker
does not know that we are walking it only to mark the
results as uninteresting. We strip out the UNINTERESTING flag,
because those objects _are_ interesting to us during the
first walk. We want to keep going to get a complete set of
reachable objects if we can.
We need some way to tell the revision walker that it's OK to
silently truncate the HAVE walk, just like it does for the
UNINTERESTING case. This patch introduces a new
`ignore_missing_links` flag to the `rev_info` struct, which
we set only for the HAVE walk.
It also adds tests to cover UNINTERESTING objects missing
from several positions: a missing blob, a missing tree, and
a missing parent commit. The missing blob already worked (as
we do not care about its contents at all), but the other two
cases caused us to die().
Note that there are a few cases we do not need to test:
1. We do not need to test a missing tree, with the blob
still present. Without the tree that refers to it, we
would not know that the blob is relevant to our walk.
2. We do not need to test a tip commit that is missing.
Upload-pack omits these for us (and in fact, we
complain even in the non-bitmap case if it fails to do
so).
Reported-by: Siddharth Agarwal <sid0@fb.com>
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Option explanation is in rev-list-options.txt. The interaction with -z
is left undecided.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the field "flags" is mainly used by the revision walker, it is
also used in many other places. Centralize the whole flag allocation to
one place for a better overview (and easier to move flags if we have
too).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Otherwise we might not have 'struct diff_options'.
[jc: needs a matching follow-up patch to remove inclusion of diff.h
from *.c files that do not themselves use anything from diff.h]
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit enables users of `struct rev_info` to peform custom limiting
during a revision walk (i.e. `get_revision`).
If the field `include_check` has been set to a callback, this callback
will be issued once for each commit before it is added to the "pending"
list of the revwalk. If the include check returns 0, the commit will be
marked as added but won't be pushed to the pending list, effectively
limiting the walk.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
People often find "git log --branches" etc. that includes _all_
branches is cumbersome to use when they want to grab most but except
some. The same applies to --tags, --all and --glob.
Teach the revision machinery to remember patterns, and then upon the
next such a globbing option, exclude those that match the pattern.
With this, I can view only my integration branches (e.g. maint,
master, etc.) without topic branches, which are named after two
letters from primary authors' names, slash and topic name.
git rev-list --no-walk --exclude=??/* --branches |
git name-rev --refs refs/heads/* --stdin
This one shows things reachable from local and remote branches that
have not been merged to the integration branches.
git log --remotes --branches --not --exclude=??/* --branches
It may be a bit rough around the edges, in that the pattern to give
the exclude option depends on what globbing option follows. In
these examples, the pattern "??/*" is used, not "refs/heads/??/*",
because the globbing option that follows the -"-exclude=<pattern>"
is "--branches". As each use of globbing option resets previously
set "--exclude", this may not be such a bad thing, though.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using pathspec filtering in combination with diff-based log
output, parent simplification happens before the diff is computed.
The diff is therefore against the *simplified* parents.
This works okay, arguably by accident, in the normal case:
simplification reduces to one parent as long as the commit is TREESAME
to it. So the simplified parent of any given commit must have the
same tree contents on the filtered paths as its true (unfiltered)
parent.
However, --full-diff breaks this guarantee, and indeed gives pretty
spectacular results when comparing the output of
git log --graph --stat ...
git log --graph --full-diff --stat ...
(--graph internally kicks in parent simplification, much like
--parents).
To fix it, store a copy of the parent list before simplification (in a
slab) whenever --full-diff is in effect. Then use the stored parents
instead of the simplified ones in the commit display code paths. The
latter do not actually check for --full-diff to avoid duplicated code;
they just grab the original parents if save_parents() has not been
called for this revision walk.
For ordinary commits it should be obvious that this is the right thing
to do.
Merge commits are a bit subtle. Observe that with default
simplification, merge simplification is an all-or-nothing decision:
either the merge is TREESAME to one parent and disappears, or it is
different from all parents and the parent list remains intact.
Redundant parents are not pruned, so the existing code also shows them
as a merge.
So if we do show a merge commit, the parent list just consists of the
rewrite result on each parent. Running, e.g., --cc on this in
--full-diff mode is not very useful: if any commits were skipped, some
hunks will disagree with all sides of the merge (with one side,
because commits were skipped; with the others, because they didn't
have those changes in the first place). This triggers --cc showing
these hunks spuriously.
Therefore I believe that even for merge commits it is better to show
the diffs wrt. the original parents.
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Format-patch generates emails with the "From" address set to the
author of each patch. If you are going to send the emails, however,
you would want to replace the author identity with yours (if they
are not the same), and bump the author identity to an in-body
header.
Normally this is handled by git-send-email, which does the
transformation before sending out the emails. However, some
workflows may not use send-email (e.g., imap-send, or a custom
script which feeds the mbox to a non-git MUA). They could each
implement this feature themselves, but getting it right is
non-trivial (one must canonicalize the identities by reversing any
RFC2047 encoding or RFC822 quoting of the headers, which has caused
many bugs in send-email over the years).
This patch takes a different approach: it teaches format-patch a
"--from" option which handles the ident check and in-body header
while it is writing out the email. It's much simpler to do at this
level (because we haven't done any quoting yet), and any workflow
based on format-patch can easily turn it on.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The primary invariant of sort_in_topological_order() is that a
parent commit is not emitted until all children of it are. When
traversing a forked history like this with "git log C E":
A----B----C
\
D----E
we ensure that A is emitted after all of B, C, D, and E are done, B
has to wait until C is done, and D has to wait until E is done.
In some applications, however, we would further want to control how
these child commits B, C, D and E on two parallel ancestry chains
are shown.
Most of the time, we would want to see C and B emitted together, and
then E and D, and finally A (i.e. the --topo-order output). The
"lifo" parameter of the sort_in_topological_order() function is used
to control this behaviour. We start the traversal by knowing two
commits, C and E. While keeping in mind that we also need to
inspect E later, we pick C first to inspect, and we notice and
record that B needs to be inspected. By structuring the "work to be
done" set as a LIFO stack, we ensure that B is inspected next,
before other in-flight commits we had known that we will need to
inspect, e.g. E.
When showing in --date-order, we would want to see commits ordered
by timestamps, i.e. show C, E, B and D in this order before showing
A, possibly mixing commits from two parallel histories together.
When "lifo" parameter is set to false, the function keeps the "work
to be done" set sorted in the date order to realize this semantics.
After inspecting C, we add B to the "work to be done" set, but the
next commit we inspect from the set is E which is newer than B.
The name "lifo", however, is too strongly tied to the way how the
function implements its behaviour, and does not describe what the
behaviour _means_.
Replace this field with an enum rev_sort_order, with two possible
values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE, and
update the existing code. The mechanical replacement rule is:
"lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
"lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When performing edge-based operations on the revision graph, it can be
useful to be able to identify the INTERESTING graph's connection(s) to
the bottom commit(s) specified by the user.
Conceptually when the user specifies "A..B" (== B ^A), they are asking
for the history from A to B. The first connection from A onto the
INTERESTING graph is part of that history, and should be considered. If
we consider only INTERESTING nodes and their connections, then we're
really only considering the history from A's immediate descendants to B.
This patch does not change behaviour, but adds a new BOTTOM flag to
indicate the bottom commits specified by the user, ready to be used by
following patches.
We immediately use the BOTTOM flag to return collect_bottom_commits() to
its original approach of examining the pending commit list rather than
the command line. This will ensure alignment of the definition of
"bottom" with future patches.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
History simplification previously always treated merges as TREESAME
if they were TREESAME to any parent.
While this was consistent with the default behaviour, this could be
extremely unhelpful when searching detailed history, and could not be
overridden. For example, if a merge had ignored a change, as if by "-s
ours", then:
git log -m -p --full-history -Schange file
would successfully locate "change"'s addition but would not locate the
merge that resolved against it.
Futher, simplify_merges could drop the actual parent that a commit
was TREESAME to, leaving it as a normal commit marked TREESAME that
isn't actually TREESAME to its remaining parent.
Now redefine a commit's TREESAME flag to be true only if a commit is
TREESAME to _all_ of its parents. This doesn't affect either the default
simplify_history behaviour (because partially TREESAME merges are turned
into normal commits), or full-history with parent rewriting (because all
merges are output). But it does affect other modes. The clearest
difference is that --full-history will show more merges - sufficient to
ensure that -m -p --full-history log searches can really explain every
change to the file, including those changes' ultimate fate in merges.
Also modify simplify_merges to recalculate TREESAME after removing
a parent. This is achieved by storing per-parent TREESAME flags on the
initial scan, so the combined flag can be easily recomputed.
This fixes some t6111 failures, but creates a couple of new ones -
we are now showing some merges that don't need to be shown.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>