Since some frontends may be working with source material where
the dates are only readily available as RFC 2822 strings, it is
more friendly if fast-import exposes Git's parse_date() function
to handle the conversion. This way the frontend doesn't need
to perform the parsing itself.

The new --date-format option to fast-import can be used by a
frontend to select which format it will supply date strings in.
The default is the standard `raw` Git format, which fast-import
has always supported. Format rfc2822 can be used to activate the
parse_date() function instead.

Because fast-import could also be useful for creating new, current
commits, the format `now` is also supported to generate the current
system timestamp. The implementation of `now` is a trivial call
to datestamp(), but is actually a whole whopping 3 lines so that
fast-import can verify the frontend really meant `now`.

As part of this change I have added validation of the `raw` date
format. Prior to this change fast-import would accept anything
in a `committer` command, even if it was seriously malformed.
Now fast-import requires the '> ' near the end of the string and
verifies the timestamp is formatted properly.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Frontends can use this file to validate imports after they
have been completed.
Performance
-----------
The design of gfi allows it to import large projects in a minimum
@@ -127,6 +132,78 @@ results, such as branch names or file names with leading or trailing
spaces in their name, or early termination of gfi when it encounters
unexpected input.
Date Formats
~~~~~~~~~~~~
The following date formats are supported. A frontend should select
the format it will use for this import by passing the format name
in the `--date-format=<fmt>` command line option.
`raw`::
This is the Git native format and is `<time> SP <tz>`.
It is also gfi's default format, if `--date-format` was
not specified.
+
The time of the event is specified by `<time>` as the number of
seconds since the UNIX epoch (midnight, Jan 1, 1970, UTC) and is
written as an ASCII decimal integer.
+
The timezone is specified by `<tz>` as a positive or negative offset
from UTC. For example EST (which is typically 5 hours behind GMT)
would be expressed in `<tz>` by ``-0500'' while GMT is ``+0000''.
+
If the timezone is not available in the source material, use
``+0000'', or the most common local timezone. For example many
organizations have a CVS repository which has only ever been accessed
by users who share a single location and timezone. In this case
that shared timezone can reasonably be assumed.
+
Unlike the `rfc2822` format, this format is very strict. Any
variation in formatting will cause gfi to reject the value.
`rfc2822`::
This is the standard email format as described by RFC 2822.
+
An example value is ``Tue Feb 6 11:22:18 2007 -0500''. The Git
parser is accurate, but a little on the lenient side. It's the
same parser used by gitlink:git-am[1] when applying patches
received from email.
+
Some malformed strings may be accepted as valid dates. In some of
these cases Git will still be able to obtain the correct date from
the malformed string. There are also some types of malformed
strings which Git will parse incorrectly, and yet consider valid.
Seriously malformed strings will be rejected.
+
If the source material is formatted in RFC 2822 style dates,
the frontend should let gfi handle the parsing and conversion
(rather than attempting to do it itself) as the Git parser has
been well tested in the wild.
+
Frontends should prefer the `raw` format if the source material
is already in UNIX-epoch format, or is easily convertible to
that format, as there is no ambiguity in parsing.
`now`::
Always use the current time and timezone. The literal
`now` must always be supplied for `<when>`.
+
This is a toy format. The current time and timezone of this system
are always copied into the identity string at the time it is being
created by gfi. There is no way to specify a different time or
timezone.
+
This particular format is supplied as it's short to implement and
may be useful to a process that wants to create a new commit
right now, without needing to use a working directory or
gitlink:git-update-index[1].
+
If separate `author` and `committer` commands are used in a `commit`,
the timestamps may not match, as the system clock will be polled
twice (once for each command). The only way to ensure that both
author and committer identity information has the same timestamp
is to omit `author` (thus copying from `committer`) or to use a
date format other than `now`.
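As an illustration of the `raw` format described above, the following
Python sketch shows one way a frontend might produce `<time> SP <tz>`
from a timezone-aware timestamp. The helper name `to_raw` is
hypothetical, and Python's `email.utils.parsedate_to_datetime` is used
only as a stand-in for an RFC 2822 parser; it is not Git's parse_date()
and may accept or reject different inputs than gfi does.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def to_raw(when: datetime) -> str:
    """Format a timezone-aware datetime as `<time> SP <tz>` (hypothetical helper)."""
    offset = when.utcoffset()
    if offset is None:
        raise ValueError("datetime must carry a timezone")
    # <time>: whole seconds since the UNIX epoch, as an ASCII decimal integer.
    seconds = int(when.timestamp())
    # <tz>: signed offset from UTC, written as +HHMM or -HHMM.
    total_minutes = int(offset.total_seconds()) // 60
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{seconds} {sign}{hours:02d}{minutes:02d}"


# A timestamp five hours behind UTC, built directly.
est = timezone(timedelta(hours=-5))
print(to_raw(datetime(2007, 2, 6, 11, 22, 18, tzinfo=est)))  # 1170778938 -0500

# The same instant, starting from an RFC 2822 style string.
print(to_raw(parsedate_to_datetime("Tue, 6 Feb 2007 11:22:18 -0500")))
```

A frontend that wants identical `author` and `committer` timestamps
could, under the same assumptions, call
`to_raw(datetime.now().astimezone())` once and reuse the resulting
string for both commands instead of relying on `now`.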
Commands
~~~~~~~~
gfi accepts several commands to update the current repository
@@ -168,8 +245,8 @@ change to the project.
@@ -222,12 +299,10 @@ the email address from the other fields in the line. Note that
`<name>` is free-form and may contain any sequence of bytes, except
`LT` and `LF`. It is typically UTF-8 encoded.
The time of the change is specified by `<time>` as the number of
seconds since the UNIX epoc (midnight, Jan 1, 1970, UTC) and is
written as an ASCII decimal integer. The committer's
timezone is specified by `<tz>` as a positive or negative offset
from UTC. For example EST (which is typically 5 hours behind GMT)
would be expressed in `<tz>` by ``-0500'' while GMT is ``+0000''.
The time of the change is specified by `<when>` using the date format
that was selected by the `--date-format=<fmt>` command line option.
See ``Date Formats'' above for the set of supported formats, and
their syntax.
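To make the strictness of the `raw` format concrete, here is a minimal
Python sketch of the kind of check a frontend could run on its own
output before handing it to gfi. It looks for the `> ` that ends the
email address and then requires `<when>` to be exactly a decimal
integer, one space, and a signed four-digit offset. The function name
`parse_raw_identity` and the exact pattern are assumptions made for
illustration; gfi's real validation is implemented in C and may differ
in detail.

```python
import re

# <time> SP <tz>: decimal seconds, one space, sign plus four digits.
RAW_WHEN = re.compile(r"([0-9]+) ([+-][0-9]{4})")


def parse_raw_identity(line):
    """Split an identity line into (identity, time, tz) or raise ValueError.

    Hypothetical helper: it mirrors the checks described in the text,
    not gfi's actual implementation.
    """
    # The date follows the last '> ' in the line.
    ident, sep, when = line.rstrip("\n").rpartition("> ")
    if not sep:
        raise ValueError("missing '> ' before the date")
    match = RAW_WHEN.fullmatch(when)
    if match is None:
        raise ValueError("malformed raw date: %r" % when)
    return ident + ">", int(match.group(1)), match.group(2)


ident, seconds, tz = parse_raw_identity(
    "committer A U Thor <author@example.com> 1170778938 -0500\n"
)
print(ident)
print(seconds, tz)
```

A line whose date is not in `raw` form, such as one ending in
``Tue Feb 6 11:22:18 2007 -0500'', makes this sketch raise
`ValueError`, which mirrors the rejection described for the `raw`
format.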
`from`
^^^^^^
@@ -394,7 +469,7 @@ lightweight (non-annotated) tags see the `reset` command below.