/*
* (C) Copyright David Gibson <dwg@au1.ibm.com>, IBM Corporation. 2005.
*
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
* USA
*/
%option noyywrap nounput noinput never-interactive
%x INCLUDE
%x BYTESTRING
%x PROPNODENAME
%s V1
PROPNODECHAR [a-zA-Z0-9,._+*#?@-]
PATHCHAR ({PROPNODECHAR}|[/])
LABEL [a-zA-Z_][a-zA-Z0-9_]*
STRING \"([^\\"]|\\.)*\"
CHAR_LITERAL '([^']|\\')*'
WS [[:space:]]
COMMENT "/*"([^*]|\*+[^*/])*\*+"/"
LINECOMMENT "//".*\n
%{
#include "dtc.h"
#include "srcpos.h"
#include "dtc-parser.tab.h"
YYLTYPE yylloc;
extern bool treesource_error;
/* CAUTION: this will stop working if we ever use yyless() or yyunput() */
#define YY_USER_ACTION \
{ \
srcpos_update(&yylloc, yytext, yyleng); \
}
/*#define LEXDEBUG 1*/
#ifdef LEXDEBUG
#define DPRINT(fmt, ...) fprintf(stderr, fmt, ##__VA_ARGS__)
#else
#define DPRINT(fmt, ...) do { } while (0)
#endif
static int dts_version = 1;
#define BEGIN_DEFAULT() DPRINT("<V1>\n"); \
BEGIN(V1);
static void push_input_file(const char *filename);
static bool pop_input_file(void);
static void lexical_error(const char *fmt, ...);
%}
%%
<*>"/include/"{WS}*{STRING} {
char *name = strchr(yytext, '\"') + 1;
yytext[yyleng-1] = '\0';
push_input_file(name);
}
<*>^"#"(line)?[ \t]+[0-9]+[ \t]+{STRING}([ \t]+[0-9]+)? {
char *line, *tmp, *fn;
/* skip text before line # */
line = yytext;
while (!isdigit((unsigned char)*line))
line++;
/* skip digits in line # */
tmp = line;
while (!isspace((unsigned char)*tmp))
tmp++;
/* NUL-terminate line # */
*tmp = '\0';
/* start of filename */
fn = strchr(tmp + 1, '"') + 1;
/* strip trailing " from filename */
tmp = strchr(fn, '"');
*tmp = 0;
/* -1 since #line is the number of the next line */
srcpos_set_line(xstrdup(fn), atoi(line) - 1);
}
<*><<EOF>> {
if (!pop_input_file()) {
yyterminate();
}
}
<*>{STRING} {
DPRINT("String: %s\n", yytext);
yylval.data = data_copy_escape_string(yytext+1,
yyleng-2);
return DT_STRING;
}
<*>"/dts-v1/" {
DPRINT("Keyword: /dts-v1/\n");
dts_version = 1;
BEGIN_DEFAULT();
return DT_V1;
}
<*>"/memreserve/" {
DPRINT("Keyword: /memreserve/\n");
BEGIN_DEFAULT();
return DT_MEMRESERVE;
}
<*>"/bits/" {
DPRINT("Keyword: /bits/\n");
BEGIN_DEFAULT();
return DT_BITS;
}
<*>"/delete-property/" {
DPRINT("Keyword: /delete-property/\n");
DPRINT("<PROPNODENAME>\n");
BEGIN(PROPNODENAME);
return DT_DEL_PROP;
}
<*>"/delete-node/" {
DPRINT("Keyword: /delete-node/\n");
DPRINT("<PROPNODENAME>\n");
BEGIN(PROPNODENAME);
return DT_DEL_NODE;
}
<*>{LABEL}: {
DPRINT("Label: %s\n", yytext);
yylval.labelref = xstrdup(yytext);
yylval.labelref[yyleng-1] = '\0';
return DT_LABEL;
}
<V1>([0-9]+|0[xX][0-9a-fA-F]+)(U|L|UL|LL|ULL)? {
char *e;
DPRINT("Integer Literal: '%s'\n", yytext);
errno = 0;
yylval.integer = strtoull(yytext, &e, 0);
assert(!(*e) || !e[strspn(e, "UL")]);
if (errno == ERANGE)
lexical_error("Integer literal '%s' out of range",
yytext);
else
/* ERANGE is the only strtoull error triggerable
* by strings matching the pattern */
assert(errno == 0);
return DT_LITERAL;
}
<*>{CHAR_LITERAL} {
struct data d;
DPRINT("Character literal: %s\n", yytext);
d = data_copy_escape_string(yytext+1, yyleng-2);
/* d.len counts the terminating NUL, so 1 means no characters */
if (d.len == 1) {
lexical_error("Empty character literal");
yylval.integer = 0;
return DT_CHAR_LITERAL;
}
yylval.integer = (unsigned char)d.val[0];
if (d.len > 2)
lexical_error("Character literal has %d"
" characters instead of 1",
d.len - 1);
return DT_CHAR_LITERAL;
}
<*>\&{LABEL} { /* label reference */
DPRINT("Ref: %s\n", yytext+1);
yylval.labelref = xstrdup(yytext+1);
return DT_REF;
}
<*>"&{/"{PATHCHAR}*\} { /* new-style path reference */
yytext[yyleng-1] = '\0';
DPRINT("Ref: %s\n", yytext+2);
yylval.labelref = xstrdup(yytext+2);
return DT_REF;
}
<BYTESTRING>[0-9a-fA-F]{2} {
yylval.byte = strtol(yytext, NULL, 16);
DPRINT("Byte: %02x\n", (int)yylval.byte);
return DT_BYTE;
}
<BYTESTRING>"]" {
DPRINT("/BYTESTRING\n");
BEGIN_DEFAULT();
return ']';
}
<PROPNODENAME>\\?{PROPNODECHAR}+ {
DPRINT("PropNodeName: %s\n", yytext);
yylval.propnodename = xstrdup((yytext[0] == '\\') ?
yytext + 1 : yytext);
BEGIN_DEFAULT();
return DT_PROPNODENAME;
}
"/incbin/" {
DPRINT("Binary Include\n");
return DT_INCBIN;
}
<*>{WS}+ /* eat whitespace */
<*>{COMMENT}+ /* eat C-style comments */
<*>{LINECOMMENT}+ /* eat C++-style comments */
<*>"<<" { return DT_LSHIFT; };
<*>">>" { return DT_RSHIFT; };
<*>"<=" { return DT_LE; };
<*>">=" { return DT_GE; };
<*>"==" { return DT_EQ; };
<*>"!=" { return DT_NE; };
<*>"&&" { return DT_AND; };
<*>"||" { return DT_OR; };
<*>. {
DPRINT("Char: %c (\\x%02x)\n", yytext[0],
(unsigned)yytext[0]);
if (yytext[0] == '[') {
DPRINT("<BYTESTRING>\n");
BEGIN(BYTESTRING);
}
if ((yytext[0] == '{')
|| (yytext[0] == ';')) {
DPRINT("<PROPNODENAME>\n");
BEGIN(PROPNODENAME);
}
return yytext[0];
}
%%
static void push_input_file(const char *filename)
{
assert(filename);
srcfile_push(filename);
yyin = current_srcfile->f;
yypush_buffer_state(yy_create_buffer(yyin, YY_BUF_SIZE));
}
static bool pop_input_file(void)
{
if (srcfile_pop() == 0)
return false;
yypop_buffer_state();
yyin = current_srcfile->f;
return true;
}
static void lexical_error(const char *fmt, ...)
{
va_list ap;
va_start(ap, fmt);
srcpos_verror(&yylloc, "Lexical error", fmt, ap);
va_end(ap);
treesource_error = true;
}