Earlier we got rid of two function-scope static variables that kept
track of the states of helper functions by making them extra arguments
that are passed throughout the callchain. Now we have a convenient
place to store and pass them around in the form of "struct mailinfo",
change them into two fields in the struct.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This requires us to pass "struct mailinfo" to more functions
throughout the codepath that read input lines. Incidentally,
later steps are helped by this patch passing the struct to
more callchains.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These two are the only easy ones that do not require passing the
structure around to deep corners of the callchain.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this first step, move only 'email' and 'name' fields in there and
remove the corresponding globals. In subsequent patches, more
globals will be moved to this and the structure will be passed
around as a new parameter to more functions.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the previous steps, it becomes clear that the mailinfo()
function is the only one that wants the "line" to be directly
touchable. Move it to the function scope of this function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the previous two commits, we established that the local
variable "line" in handle_body() and handle_boundary() functions
always refer to the global "line" that is used as the common and
shared "current line from the input". They are the only callers of
the last function that refers to the global line directly, i.e.
find_boundary(). Pass "line" as a parameter to this leaf function
to complete the clean-up. Now the only function that directly refers
to the global "line" is the caller of handle_body() at the very
beginning of this whole callchain.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function has a single caller, and called with the global "line"
holding the multi-part boundary line the caller saw while processing
the e-mail body. The function then goes into a loop to process each
line of the input, and fills the same global "line" variable from
the input as it needs to read more lines to process the multi-part
headers.
Let the caller explicitly pass a pointer to this global "line"
variable as an argument, and have the function itself use that
strbuf throughout, instead of referring to the global "line" itself.
There still is a helper function that this function calls that still
touches the global directly; it will be updated as the series progresses.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function has a single caller, and called with the global "line"
holding the first line of the e-mail body after the caller finished
processing the e-mail headers. The function then goes into a loop
to process each line of the input, starting from what was given by
its caller, and fills the same global "line" variable from the input
as it needs to process more lines.
Let the caller explicitly pass a pointer to this global "line"
variable as an argument, and have the function itself use that
strbuf throughout, instead of referring to the global "line" itself.
There are helper functions that this function calls that still touch
the global directly; they will be updated as the series progresses.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Two helper functions use "static int" in their scope to keep track
of the state while repeatedly getting called once for each input
line. Move these state variables to their ultimate caller and pass
down pointers to them along the callchain, as a small step in
preparation for making this entire callchain more reentrant.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function wants to call find_boundary() and is called only from
one place without any recursing, so it becomes easier to read if it
appears after the called function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whether this loop is left via EOF/break or upon finding a
non-continuation line, the storage used for the contination line
handling is left behind.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This does not make a difference within the context of "git mailinfo"
that runs once and exits, as flushing and closing would happen upon
process termination. It however will matter when we eventually make
it callable as an API function.
Besides, cleaning after yourself once you are done is a good hygiene.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We pre-increment the pointer that we will use to store something at,
so the pointer is already beyond the end of the array if it points
at content[MAX_BOUNDARIES].
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In olden days we might have wanted to behave differently in
decode_header() if the header line was encoded with RFC2047, but we
apparently do not do so, hence this helper function can go, together
with its return value.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The called function checks if the second parameter is either a NULL
or an empty string at the very beginning and returns without doing
anything. Remove the useless call.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch puts the usage info strings that were not already in docopt-
like format into docopt-like format, which will be a litle easier for
end users and a lot easier for translators. Changes include:
- Placing angle brackets around fill-in-the-blank parameters
- Putting dashes in multiword parameter names
- Adding spaces to [-f|--foobar] to make [-f | --foobar]
- Replacing <foobar>* with [<foobar>...]
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option adds the content of the Message-Id header at the end of the
commit message prepared by git-mailinfo. This is useful in order to
associate commit messages automatically with mailing list discussions.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The just-released Apple Xcode 6.0.1 has -Wstring-plus-int enabled by
default which complains about pointer arithmetic applied to a string
literal:
builtin/mailinfo.c:303:24: warning:
adding 'long' to a string does not append to the string
return !memcmp(SAMPLE + (cp - line), cp, strlen(SAMPLE) ...
~~~~~~~^~~~~~~~~~~~~
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since commit 81c5cf7 (mailinfo: skip bogus UNIX From line inside
body, 2006-05-21), we have treated lines like ">From" in the body as
headers. This makes "git am" work for people who erroneously paste
the whole output from format-patch:
From 12345abcd...fedcba543210 Mon Sep 17 00:00:00 2001
From: them
Subject: [PATCH] whatever
into their email body (assuming that an mbox writer then quotes
"From" as ">From", as otherwise we would actually mailsplit on the
in-body line).
However, this has false positives if somebody actually has a commit
body that starts with "From "; in this case we erroneously remove
the line entirely from the commit message. We can make this check
more robust by making sure the line actually looks like a real mbox
"From" line.
Inspect the line that begins with ">From " a more carefully to only
skip lines that match the expected pattern (note that the datestamp
part of the format-patch output is designed to be kept constant to
help those who write magic(5) entries).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The array header is defined as:
static const char *header[MAX_HDR_PARSED] = {
"From","Subject","Date",
};
When looking for the index of a specfic string in that array, simply
use strcmp() instead of memcmp(). This avoids running over the end of
the string (e.g. with memcmp("Subject", "From", 7)) and gets rid of
magic string length constants.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Callers of reencode_string() that re-encodes a string from one
encoding to another all used ad-hoc way to bypass the case where the
input and the output encodings are the same. Some did strcmp(),
some did strcasecmp(), yet some others when converting to UTF-8 used
is_encoding_utf8().
Introduce same_encoding() helper function to make these callers use
the same logic. Notably, is_encoding_utf8() has a work-around for
common misconfiguration to use "utf8" to name UTF-8 encoding, which
does not match "UTF-8" hence strcasecmp() would not consider the
same. Make use of it in this helper function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently "git am" does insane things if the mbox it is given contains
attachments with a MIME type that aren't "text/*".
In particular, it will still decode them, and pass them "one line at a
time" to the mail body filter, but because it has determined that they
aren't text (without actually looking at the contents, just at the mime
type) the "line" will be the encoding line (eg 'base64') rather than a
line of *content*.
Which then will cause the text filtering to fail, because we won't
correctly notice when the attachment text switches from the commit message
to the actual patch. Resulting in a patch failure, even if patch may be a
perfectly well-formed attachment, it's just that the message type may be
(for example) "application/octet-stream" instead of "text/plain".
Just remove all the bogus games with the message_type. The only difference
that code creates is how the data is passed to the filter function
(chunked per-pred-code line or per post-decode line), and that difference
is *wrong*, since chunking things per pre-decode line can never be a
sensible operation, and cannot possibly matter for binary data anyway.
This code goes all the way back to March of 2007, in commit 87ab799234
("builtin-mailinfo.c infrastrcture changes"), and apparently Don used to
pass random mbox contents to git. However, the pre-decode vs post-decode
logic really shouldn't matter even for that case, and more importantly, "I
fed git am crap" is not a valid reason to break *real* patch attachments.
If somebody really cares, and determines that some attachment is binary
data (by looking at the data, not the MIME-type), the whole attachment
should be dismissed, rather than fed in random-sized chunks to
"handle_filter()".
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Don Zickus <dzickus@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"Content-type: text/plain; charset=UTF-8" header should not appear
twice in the input, but it is always better to gracefully deal with
such a case. The current code concatenates the value to the values
we have seen previously, producing nonsense such as "utf8UTF-8".
Instead of concatenating, forget the previous value and use the last
value we see.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already strip the more common Re: and re:, and we do not often
see RE: from saner MUA, but this prefix does exist and gets used
from time to time.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a line in the message is not a valid utf-8, "git mailinfo"
attempts to convert it to utf-8 assuming the input is latin1 (and
punt if it does not convert cleanly). Using the same heuristics in
"git commit" and "git commit-tree" lets the editor output be in
latin1 to make the overall system more consistent.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The logic for the -b mode, where [PATCH] is dropped but [foo] is not,
silently ate all spaces after the ].
Fix this by keeping the next isspace() character, if there is any.
Being more thorough is pointless, as the later cleanup_space() call
will normalize any sequence of whitespace to a single ' '.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without the "-k" option, mailinfo will convert a folded
subject header like:
Subject: this is a
subject that doesn't
fit on one line
into a single line. With "-k", however, we assumed that
these newlines were significant and represented something
that the sending side would want us to preserve.
For messages created by format-patch, this assumption was
broken by a1f6baa (format-patch: wrap long header lines,
2011-02-23). For messages sent by arbitrary MUAs, this was
probably never a good assumption to make, as they may have
been folding subjects in accordance with rfc822's line
length recommendations all along.
This patch now joins folded lines with a single whitespace
character. This treats header folding purely as a syntactic
feature of the transport mechanism, not as something that
format-patch is trying to tell us about the original
subject.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* builtin/commit.c: Replace block of code with a one-liner call to
logmsg_reencode().
* commit.c: new function for looking up a comit by name
* pretty.c: helper methods for getting output encodings
Add helpers get_log_output_encoding() and
get_commit_output_encoding() that eliminate some messy and duplicate
if-blocks.
Signed-off-by: Pat Notz <patnotz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX
5.1 fails to compile git.
enum style is inconsistent already, with some enums declared on one
line, some over 3 lines with the enum values all on the middle line,
sometimes with 1 enum value per line... and independently of that the
trailing comma is sometimes present and other times absent, often
mixing with/without trailing comma styles in a single file, and
sometimes in consecutive enum declarations.
Clearly, omitting the comma is the more portable style, and this patch
changes all enum declarations to use the portable omitted dangling
comma style consistently.
Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Handle perforations found “in the wild” more robustly by recognizing
“%<” as an alternative scissors mark.
This feature is only meant to support old habits. Discourage new use
of the percent-based version by only documenting the 8< symbol so new
users’ perforations can still be recognized by old versions of Git.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This shrinks the top-level directory a bit, and makes it much more
pleasant to use auto-completion on the thing. Instead of
[torvalds@nehalem git]$ em buil<tab>
Display all 180 possibilities? (y or n)
[torvalds@nehalem git]$ em builtin-sh
builtin-shortlog.c builtin-show-branch.c builtin-show-ref.c
builtin-shortlog.o builtin-show-branch.o builtin-show-ref.o
[torvalds@nehalem git]$ em builtin-shor<tab>
builtin-shortlog.c builtin-shortlog.o
[torvalds@nehalem git]$ em builtin-shortlog.c
you get
[torvalds@nehalem git]$ em buil<tab> [type]
builtin/ builtin.h
[torvalds@nehalem git]$ em builtin [auto-completes to]
[torvalds@nehalem git]$ em builtin/sh<tab> [type]
shortlog.c shortlog.o show-branch.c show-branch.o show-ref.c show-ref.o
[torvalds@nehalem git]$ em builtin/sho [auto-completes to]
[torvalds@nehalem git]$ em builtin/shor<tab> [type]
shortlog.c shortlog.o
[torvalds@nehalem git]$ em builtin/shortlog.c
which doesn't seem all that different, but not having that annoying
break in "Display all 180 possibilities?" is quite a relief.
NOTE! If you do this in a clean tree (no object files etc), or using an
editor that has auto-completion rules that ignores '*.o' files, you
won't see that annoying 'Display all 180 possibilities?' message - it
will just show the choices instead. I think bash has some cut-off
around 100 choices or something.
So the reason I see this is that I'm using an odd editory, and thus
don't have the rules to cut down on auto-completion. But you can
simulate that by using 'ls' instead, or something similar.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are rebasing we know that the header lines in the
patch are good and that we don't need to pick up any headers
from the body of the patch.
This makes it possible to rebase commits whose commit message
start with "From" or "Date".
Test vectors by Jeff King.
Signed-off-by: Lukas Sandström <luksan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recent "cut at scissors" implementation rewinds and truncates
the output file to store the message when it sees a scissors mark,
but it did not check if these library calls succeeded.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
[sp: Use fseek as rewind returns void]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This teaches mailinfo the scissors -- >8 -- mark; the command ignores
everything before it in the message body.
For lefties among us, we also support -- 8< -- ;-)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It fed two arguments to override the corresponding global variables,
but the caller always assigned the values to the global variables
first and then passed those global variables to this function.
Stop pretending to be a proper API to confuse people.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There should be no functional change. Just the necessary changes and
simplifications associated with calling strbuf_getwholeline() rather
than an internal function or fgets.
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, we remove leading [bracketed] [strings] from the Subject:
header when coming up with the summary of the patch. This is because
there are mailing lists etc that add their own headers to the subject, and
they know they can add things in brackets. The most obvious example is the
Linux kernel security list. Their emails look like
Subject: [Security] [patch] random: make get_random_int() more random
and other people mangle Subject: themselves in a similar way, e.g.:
Subject: [PATCH -rc] [BUGFIX] x86: fix kernel_trap_sp()
Subject: [BUGFIX][PATCH] fix bad page removal from LRU (Was Re: [RFC][PATCH] ..
even though "fix" is more than enough cue to mark it as a [BUGFIX].
Some projects however want to keep these bracketed strings. With this
option, we remove only [bracketed strings that contain word PATCH], so we
will turn things like these
[PATCH] [mailinfo] -b ...
[PATCH v2] [mailinfo] -b ...
[PATCH (v2) 1/4] [mailinfo] -b ...
into
[mailinfo] -b ...
This lacks tests and integration to the "git am" toolchain to be useful,
but it is a start.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 650d30d8a1.
Some mailing lists are configured add prefix "[listname] " to all their
messages, and also people hand-edit subject lines, be it an output from
format-patch or a patch generated by some other means.
We cannot stop people from mucking with the subject line, and with the
change, there always will be need for hand editing the subject when that
happens. People have depended on the leading [bracketed string] removal.
git-format-patch prepends patches with a [PATCH x/n] prefix, but
mailinfo used to remove any number of square-bracket pairs and
the content between them. This prevents one from using a commit
subject like this:
[ and ] must be allowed as input
Removing the square bracket pair from this rather clumsily
constructed subject line loses important information, so we must
take care not to.
This patch causes the subject stripping to stop after it has
encountered one pair of square brackets.
One possible downside of this patch is that the patch-handling
programs will now fail at removing author-added square-brackets
to be removed, such as
[RFC][PATCH x/n]
However, since format-patch only adds one set of square brackets,
this behaviour is quite easily undesrstood and defended while the
previous behaviour is not.
Signed-off-by: Andreas Ericsson <ae@op5.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some platforms do not understand the character encoding "latin1" which is
another name for "ISO8859-1". So use "ISO8859-1" instead which all tested
platforms understand.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>