In order to work with preprocessed dts files more easily, dts will parse
line number information in the form emitted by cpp.
Anton Blanchard (using a fuzzer) reported that including a line number
directive with a nul character (a literal nul in the input file, not a \0
sequence) would cause dtc to SEGV. I spotted several more problems on
examining the code:
* It modified yytext in place which seems to work, but is ugly and I'm
not sure if it's safe on all lex/flex versions
* The regexp used in the lexer to recognize line number information
accepts strings with escape characters, but it won't process these
escapes.
- GNU cpp at least, will generate \ escapes in line number
information, at least with files containing " or \ in the name
This patch reworks the handling of line number information to address
these problems. \ escapes should now be handled directly. nuls in file
names (either with a literal nul in the input file, or with a \0 escape
sequence) are still not permitted, but will now result in a lexical error
rather than a SEGV.
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The code handling integer literals in dtc-lexer.l assumes that the flex
regexp means that strtoull() can't fail to interpret the string as a valid
integer (either decimal, octal, or hexadecimal). This is not true for
octals. For example '09' is accepted as a literal by the regexp,
strtoull() attempts to handle it as octal, but it has a bad digit.
This changes the code to give a more useful error in this case.
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The scanners of the latest version of dtc and
convert-dtsv0 are no longer use start condition
"INCLUDE". so we should delete it.
Signed-off-by: Wang Long <long.wanglong@huawei.com>
At present, the lexer token for references to a path doesn't permit a
reference to the root node &{/}. Fixing the lexer exposes another bug
handling this case.
This patch fixes both bugs and adds testcases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
To match the processing of integer literals, character literals are passed
as a string from lexer to parser then interpreted there. This is just as
awkward as it was for integer literals, without the excuse that we used to
need the information about the dts version to process them correctly.
So, move character literal processing back to the lexer as well, cleaning
things up.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
At the moment integer literals are passed from the lexer to the parser as
a string, where it's evaluated into an integer by eval_literal(). That
strange approach happened because we needed to know whether we were
processing dts-v0 or dts-v1 - only known at the parser level - to know
how to interpret the literal properly.
dts-v0 support has been gone for some time now, and the base and bits
parameters to eval_literal() are essentially useless.
So, clean things up by moving the literal interpretation back to the lexer.
This also introduces a new lexical_error() function to report malformed
literals and set the treesource_error flag so that they'll cause a parse
failure at the top level.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The isdigit(), isprint(), etc. functions take an int, whose value is
required to be in the range of an _unsigned_ char, or EOF. This, horribly,
means that systems which have a signed char by default need casts to pass
a char variable safely to these functions.
We can't do this more nicely by making the variables themselves 'unsigned
char *' because then we'll get warnings passing them to the strchr() etc.
functions.
At least the cygwin version of these functions, are designed to generate
warnings if this isn't done, as explained by this comment from ctype.h:
These macros are intentionally written in a manner that will trigger
a gcc -Wall warning if the user mistakenly passes a 'char' instead
of an int containing an 'unsigned char'.
Signed-off-by: Serge Lamikhov-Center <Serge.Lamikhov@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We already use the C99 bool type from stdbool.h in a few places. However
there are many other places we represent boolean values as plain ints.
This patch changes that.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Previously, the #line parsing regex ended with ({WS}+[0-9]+)?. The {WS}
could match line-break characters. If the #line directive did not contain
the optional flags field at the end, this could cause any integer data on
the next line to be consumed as part of the #line directive parsing. This
could cause syntax errors (i.e. #line parsing consuming the leading 0
from a hex literal 0x1234, leaving x1234 to be parsed as cell data,
which is a syntax error), or invalid compilation results (i.e. simply
consuming literal 1234 as part of the #line processing, thus removing it
from the cell data).
Fix this by replacing {WS} with [ \t] so that it can't match line-breaks.
Convert all instances of {WS}, even though the other instances should be
irrelevant for any well-formed #line directive. This is done for
consistency and ultimate safety.
Reported-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Line control directives of the following formats are supported:
#line LINE "FILE"
# LINE "FILE" [FLAGS]
This allows dtc to consume the output of pre-processors, and to provide
error messages that refer to the original filename, including taking
into account any #include directives that the pre-processor may have
performed.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
The device tree language as currently defined conflicts with the C pre-
processor in one aspect - when a property or node name begins with a #
character, a pre-processor would attempt to interpret it as a directive,
fail, and most likely error out.
This change allows a property/node name to be prefixed with \. This
prevents a pre-processor from seeing # as the first non-whitespace
character on the line, and hence prevents the conflict. \ was previously
an illegal character in property/node names, so this change is
backwards compatible. The \ is stripped from the name during parsing
by dtc.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
dtc currently allows the contents of properties to be changed, and the
contents of nodes to be added to. There are situations where removing
properties or nodes may be useful. This change implements the following
syntax to do that:
/ {
/delete-property/ propname;
/delete-node/ nodename;
};
or:
/delete-node/ &noderef;
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Written by David Gibson <david@gibson.dropbear.id.au>. Additions by me:
* Ported to ToT dtc.
* Renamed cell to integer throughout.
* Implemented value range checks.
* Allow U/L/UL/LL/ULL suffix on literals.
* Enabled the commented test.
Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
Elements of size 8, 16, 32, and 64 bits are supported. The new
/bits/ syntax was selected so as to not pollute the reserved
keyword space with uint8/uint16/... type names.
With this patch the following property assignment:
property = /bits/ 16 <0x1234 0x5678 0x0 0xffff>;
is equivalent to:
property = <0x12345678 0x0000ffff>;
It is now also possible to directly specify a 64 bit literal in a
cell list, also known as an array using:
property = /bits/ 64 <0xdeadbeef00000000>;
It is an error to attempt to store a literal into an element that is
too small to hold the literal, and the compiler will generate an
error when it detects this. For instance:
property = /bits/ 8 <256>;
Will fail to compile. It is also an error to attempt to place a
reference in a non 32-bit element.
The documentation has been changed to reflect that the cell list
is now an array of elements that can be of sizes other than the
default 32-bit cell size.
The sized_cells test tests the creation and access of 8, 16, 32,
and 64-bit sized elements. It also tests that the creation of two
properties, one with 16 bit elements and one with 32 bit elements
result in the same property contents.
Signed-off-by: Anton Staaf <robotboy@chromium.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
With this patch the following property assignment:
property = <0x12345678 'a' '\r' 100>;
is equivalent to:
property = <0x12345678 0x00000061 0x0000000D 0x00000064>
Signed-off-by: Anton Staaf <robotboy@chromium.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
When nodes are modified by merging device trees, nodes to be updated/merged can
be specified by a label. Specifying nodes by full path (instead of label)
doesn't quite work. This patch fixes that.
Signed-off-by: John Bonesio <bones@secretlab.ca>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
We are almost clean already with the -Wredundant-decls warning. The
only exception is a declaration for isatty() inside the flex-generated
code. This can be removed by using flex's "never-interactive" option,
which we probably should be using anyway, since we never parse
interactively in the sense that this option implies.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This mod allows successful build of dtc using both bison/flex and yacc/lex.
Signed-off-by: Lukasz Wojcik <zbr@semihalf.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Our YYLTYPE current carries around first and last line and first and
last column information. However, of these, on the first line
information is actually filled in properly.
Furthermore, filling in the line number information from yylineno is
kind of clunky: we have to copy its value to the srcfile stack and
back to handle include file positioning correctly.
This patch cleans this up. We turn off flex's yylineno option and
instead track the line and column number ourselves from
YY_USER_ACTION. The line and column number are stored directly inside
the srcfile_state structure, so it's automatically a per-file
quantity. We now also fill in all the yylloc from YY_USER_ACTION.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch cleans up our handling of input files, particularly dts
source files, but also (to an extent) other input files such as those
used by /incbin/ and those used in -I dtb and -I fs modes.
We eliminate the current clunky mechanism which combines search paths
(which we don't actually use at present) with the open relative to
current source file behaviour, which we do.
Instead there's a single srcfile_relative_open() entry point for
callers which opens a new input file relative to the current source
file (which the srcpos code tracks internally). It doesn't currently
do search paths, but we can add that later without messing with the
callers, by drawing the search path from a global (which makes sense
anyway, rather than shuffling it around the rest of the processing
code).
That suffices for non-dts input files. For the actual dts files,
srcfile_push() and srcfile_pop() wrappers open the file while also
keeping track of it as the current source file for future opens.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that all in-kernel-tree DTS files are properly /dts-v1/,
remove direct support for the older, un-numbered DTS
source file format.
Convert existing tests to /dts-v1/ and remove support
for the conversion tests themselves.
For now, though, the conversion tool still exists.
Signed-off-by: Jon Loeliger <jdl@freescale.com>
Current, every lexer rule starts with some boiler plate to update the
yylloc value for use by the parser. One of the rules, even mistakenly
has a redundant allocation to one of the members.
This patch uses the flex YY_USER_ACTION macro hook, which is executed
before every rule to avoid this duplication.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Many places in dtc use strdup(), but none of them actually check the
return value to see if the implied allocation succeeded. This is a
potential bug, which we fix in the patch below by replacing strdup()
with an xstrdup() which in analogy to xmalloc() will quit with a fatal
error if the allocation fails.
I felt the introduciton of util.[ch] was a better choice
for utility oriented code than directly using srcpos.c
for the new string function.
This patch is a re-factoring of Dave Gibson's similar patch.
Signed-off-by: Jon Loeliger <jdl@freescale.com>
dtc does not use the input() function in flex. Apparently on some gcc
versions the unused function will cause warnings. Therefore, this
patch removes the function by using the 'noinput' option to flex.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Currently we scan the /include/ directive as two tokens, the
"/include/" keyword itself, then the string giving the file name to
include. We use a special scanner state to keep the two linked
together, and use the scanner state stack to keep track of the
original state while we're parsing the two /include/ tokens.
This does mean that we need to enable the 'stack' option in flex,
which results in a not-easily-suppressed warning from the flex
boilerplate code. This is mildly irritating.
However, this two-token scanning of the /include/ directive also has
some extremely strange edge cases, because there are a variety of
tokens recognized in all scanner states, including INCLUDE. For
example the following strange dts file:
/include/ /dts-v1/;
/ {
/* ... */
};
Will be processed successfully with the /include/ being effectively
ignored: the '/dts-v1/' and ';' are recognized even in INCLUDE state,
then the ';' transitions us to PROPNODENAME state, throwing away
INCLUDE, and the previous state is never popped off the stack. Or
for another example this construct:
foo /include/ = "somefile.dts"
will be parsed as though it were:
foo = /include/ "somefile.dts"
Again, the '=' is scanned without leaving INCLUDE state, then the next
string triggers the include logic.
And finally, we use a different regexp for the string with the
included filename than the normal string regexpt, which is also
potentially weird.
This patch, therefore, cleans up the lexical handling of the /include/
directive. Instead of the INCLUDE state, we instead scan the whole
include directive, both keyword and filename as a single token. This
does mean a bit more complexity in extracting the filename out of
yytext, but I think it's worth it to avoid the strageness described
above. It also means it's no longer possible to put a comment between
the /include/ and the filename, but I'm really not very worried about
breaking files using such a strange construct.
On Wed, Jun 04, 2008 at 09:26:23AM -0500, Jon Loeliger wrote:
> David Gibson wrote:
>
>> But as I said that can be dealt with in the future without breaking
>> compatibility. Objection withdrawn.
>>
>
> And on that note, I officially implore Scott to
> re-submit his binary include patch!
Scott's original patch does still have some implementation details I
didn't like. So in the interests of saving time, I've addressed some
of those, added a testcase, and and now resubmitting my revised
version of Scott's patch.
dtc: Add support for binary includes.
A property's data can be populated with a file's contents
as follows:
node {
prop = /incbin/("path/to/data");
};
A subset of a file can be included by passing start and size parameters.
For example, to include bytes 8 through 23:
node {
prop = /incbin/("path/to/data", 8, 16);
};
As with /include/, non-absolute paths are looked for in the directory
of the source file that includes them.
Implementation revised, and a testcase added by David Gibson
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Scott Wood <scottwood@freescale.com>
Currently, dt_from_source() uses push_input_file() to set up the
initial input file for the lexer. That sounds sensible - put the
outermost input file at the bottom of the stack - until you realise
that what it *actually* does is pushes the current, uninitialized,
lexer input state onto the stack, then sets up the new lexer input.
That necessitates an extra check in pop_input_file(), rather than
signalling termination in the natural way when the include stack is
empty, it has to check when it pops the bogus uninitialized state off
the stack. Ick.
With that fixed, push_input_file(), pop_input_file() and
incl_file_stack itself become local to the lexer, so make them static.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All current callers of dtc_open_file() immediately die() if it returns
an error. In a non-interative tool like dtc, it's hard to see what
you could sensibly do to recover from a failure to open an input file
in any case.
Therefore, make dtc_open_file() itself die() if there's an error
opening the requested file. This removes the need for error checking
at the callsites, and ensures a consistent error message in all cases.
While we're at it, change the rror message from fstree.c when we fail
to open the input directory to match dtc_open_file()'s error message.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Error reporting in push_input_file() is a mess. One error results in
a message and exit(1), others result in a message and return 0 - which
is turned into an exit(1) at one callsite. The other callsite doesn't
check errors, but probably should. One of the error conditions gives
a message, but can only be the result of an internal programming
error, not a user error.
So. Clean that up by making push_input_file() a void function, using
die() to report errors and quit.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This allows /include/s to work when in non-default states,
such as PROPNODECHAR.
We may want to use state stacks to get rid of BEGIN_DEFAULT() altogether...
Signed-off-by: Scott Wood <scottwood@freescale.com>
Looking in the diretory dtc is invoked from is not very useful behavior.
As part of the code reorganization to implement this, I removed the
uniquifying of name storage -- it seemed a rather dubious optimization
given likely usage, and some aspects of it would have been mildly awkward
to integrate with the new code.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Commit 7c44c2f9cb broke backwards
compatibility more badly than I realised. Contrary to what I thought
there are in-kernel, in-use dts files which relied on
references-to-path with paths including a comma, which no longer
compile after that commit.
So, this patch reinstates full support for bare references-to-path in
dts-v0 input. This means there will be some rather surprising lexical
corner cases when using path-expanded references in v0 files. But,
since path-expanded references are new, v0 files shouldn't typically
be using them anyway. If the corner cases cause a problem, you can
always convert to dts-v1 which handles the lexical issues here more
nicely.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch applies a couple of tiny cleanups to the lexer. The
not-very-useful 'WS' named pattern is removed, and the debugging
printf() for single character tokens is moved to the top of the
action, which results in less confusing output when LEXDEBUG is
switched on (because it goes before the printf()s for possible
resulting lexer state changes).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The recent change to the lexer to only recognize property and node
names in the appropriate context removed a number of lexical warts in
our language that would have gotten ugly as we add expression support
and so forth.
But there's one nasty one remaining: references can contain a full
path, including the various problematic node name characters (',', '+'
and '-', for example). This would cause trouble with expressions, and
it also causes trouble with the patch I'm working on to allow
expanding references to paths rather than phandles. This patch
therefore reworks the lexer to mitigate these problems.
- References to labels cause no problems. These are now
recognized separately from references to full paths. No syntax change
here.
- References to full paths, including problematic characters
are allowed by "quoting" the path with braces
e.g. &{/pci@10000/somedevice@3,8000}. The braces protect any internal
problematic characters from being confused with operators or whatever.
- For compatibility with existing dts files, in v0 dts files
we allow bare references to paths as before &/foo/bar/whatever - but
*only* if the path contains no troublesome characters. Specifically
only [a-zA-Z0-9_@/] are allowed.
This is an incompatible change to the dts-v1 format, but since AFAIK
no-one has yet switched to dts-v1 files, I think we can get away with
it. Better to make the transition when people to convert to v1, and
get rid of the problematic old syntax.
Strictly speaking, it's also an incompatible change to the v0 format,
since some path references that were allowed before are no longer
allowed. I suspect no-one has been using the no-longer-supported
forms (certainly none of the kernel dts files will cause trouble).
We might need to think about this harder, though.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
dtc: Switch dtc to C-style literals
This patch introduces a new version of dts file, distinguished from
older files by starting with the special token /dts-v1/. dts files in
the new version take C-style literals instead of the old bare hex or
OF-style base notation. In addition, the "range" for of memreserve entries
(/memreserve/ f0000-fffff) is no longer recognized in the new format.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Jon Loeliger <jdl@freescale.com>
The current scheme of having CELLDATA and MEMRESERVE states to
recognize hex literals instead of node or property names is
arse-backwards. The patch switches things around so that literals are
lexed in normal states, and property/node names are only recognized in
the special PROPNODENAME state, which is only entered after a { or a
;, and is left as soon as we scan a property/node name or a keyword.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Jon Loeliger <jdl@freescale.com>
dtc supports the use of C-style escapes (\n, \t and so forth) in
string property definitions via the data_copy_escape_string()
function. However, while it supports the most common escape
characters, it doesn't support the full set that C does, which is a
potential gotcha.
Worse, a bug in the lexer means that while data_copy_escape_string()
can handle the \" escape, a string with such an escape won't lex
correctly.
This patch fixes both problems, extending data_copy_escape_string() to
support the missing escapes, and fixing the regex for strings in the
lexer to handle internal escaped quotes.
This also adds a testcase for string escape functionality.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This large patch removes all trailing whitespace from dtc (including
libfdt, the testsuite and documentation). It also removes a handful
of redundant blank lines (at the end of functions, or when there are
two blank lines together for no particular reason).
As well as anything else, this means that quilt won't whinge when I go
to convert the whole of libfdt into a patch to apply to the kernel.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Change the lexer to recognise a label in any context. Place
before other celldata and bytestrings to avoid the initial
characters being stolen by other matches.
A label is a character sequence starting with an alphabetic
or underscore optinally followed by the same plus digits and
terminating in a colon.
The included terminating colon will prevent matching hex numbers.
Signed-off-by: Milton Miller <miltonm@bga.com>
At present, the lexer in dtc recognizes only space, tab and newline as
whitespace characters. This is broken; in particular this means that
dtc will get syntax errors on files with DOS-style (CR-LF) newlines.
This patch fixes the problem, using flex's built-int [:space:]
character class.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
- Change include syntax to: /include/ "filename"
- Move private functions directly into dtc-lexer.l
- Define YYID for some older parser templates
Also fix a #include ordering problem around YYLTPE.
Signed-off-by; Jon Loeliger <jdl@freescale.com>
Acked-by: Haiying Wang <Haiying.Wang@freescale.com>
Keeps track of open files in a stack, and assigns
a filenum to source positions for each lexical token.
Modified error reporting to show source file as well.
No policy on file directory basis has been decided.
Still handles stdin.
Tested on all arch/powerpc/boot/dts DTS files
Signed-off-by: Jon Loeliger <jdl@freescale.com>
New syntax d#, b#, o# and h# allow for an explicit prefix
on cell values to specify their base. Eg: <d# 123>
Signed-off-by: Jon Loeliger <jdl@freescale.com>
At present each property definition in a dts file must give as the
value either a string ("abc..."), a bytestring ([12abcd...]) or a cell
list (<1 2 3 ...>). This patch allows a property value to be given as
several of these, comma-separated. The final property value is just
the components appended together. So a property could have a list of
cells followed by a string, or a bytestring followed by some cells.
Cells are always aligned, so if cells are given following a string or
bytestring which is not a multiple of 4 bytes long, zero bytes are
inserted to align the following cells.
The primary motivation for this feature, however, is to allow defining
a property as a list of several strings. This is what's needed for
defining OF 'compatible' properties, and is less ugly and fiddly than
using embedded \0s in the strings.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Jon Loeliger <jdl@freescale.com>