Fix Issue 23999 - literal suffixes dont mix well with template instan… #15339
Thanks for your pull request and interest in making D better, @ntrel! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment. Bugzilla references
Testing this PR locally: If you don't have a local development environment set up, you can use Digger to test this PR: dub run digger -- build "master + dmd#15339"
This is a special case that's impossible to express in the lexical grammar.
@dkorpel Then can it just be a dmd diagnostic error rather than a language error?
Is the current dangling else error part of the grammar? Edit: I see that's actually a warning, not an error.
3bf4767 to e12bd3d
It could be expressed in the lexical grammar by allowing arbitrary identifiers as a suffix and checking it later:
The lexer or another part of the compiler could later verify that the suffix is supported.
@tim-dlang That might work for strings, but for an integer literal and float literal to use too (to also solve the comment in bugzilla), that would make it ambiguous for a digit character followed by an IdentifierStart token, from the grammar POV. That said, we already have 2 special case rules, which are both for floating point literals which don't obey maximal munch (see end of this section): Those rules actually change the meaning of the tokens! So we could add a special rule saying:
And that rule would only forbid certain patterns rather than redefining them. I would much rather do that and make it an error, as gcc and clang do. That's because people probably often don't use the More importantly, we can never add any new literal suffixes without breaking code if we don't have an error. |
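For context, the two existing special cases mentioned above can be seen in ordinary D code. This is a minimal sketch based on my reading of the FloatLiteral notes in the lexical spec; the array and values are made up for illustration.

```d
import std.conv : to;

void main()
{
    int[] a = [10, 20, 30, 40];

    // `1..3` is lexed as `1`, `..`, `3` rather than as the floats `1.` and `.3`.
    assert(a[1..3] == [20, 30]);

    // `4.to` is lexed as `4`, `.`, `to` rather than as the float `4.` followed by `to`.
    assert(4.to!string == "4");
}
```

Both cases redefine what a maximal-munch lexer would otherwise produce, whereas the rule proposed in this comment would only reject input.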
@ntrel I don't think it would be ambiguous for integer or float literals. The lexer would accept as many identifier characters as possible, but later error on invalid characters. For comparison, C/C++ use multiple phases in the compiler. First a number is lexed as a pp-number, which allows an arbitrary suffix. A later phase distinguishes between integer-literal and floating-literal, which is more strict for the suffix, but cannot split the token. But I also think it would be best to just add a special case and make it an error.
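The two-phase idea described here could look roughly like the following. This is a simplified sketch, not dmd's lexer: `RawNumber`, `scanNumber` and `isValidIntegerSuffix` are hypothetical names, and only decimal integer literals are handled (no hex, binary or floating-point forms).

```d
import std.ascii : isAlphaNum, isDigit;

/// Result of phase 1: the digits plus whatever identifier characters follow them.
struct RawNumber
{
    string digits;
    string suffix;
}

/// Phase 1: take the digits (underscores allowed after the first one), then
/// greedily take every identifier character as the candidate suffix.
/// Assumes `src` starts with a decimal digit.
RawNumber scanNumber(string src)
{
    size_t i = 0;
    while (i < src.length && (isDigit(src[i]) || (i > 0 && src[i] == '_')))
        i++;
    size_t j = i;
    while (j < src.length && (isAlphaNum(src[j]) || src[j] == '_'))
        j++;
    return RawNumber(src[0 .. i], src[i .. j]);
}

/// Phase 2: accept only the integer suffixes the D spec lists.
bool isValidIntegerSuffix(string suffix)
{
    switch (suffix)
    {
        case "", "L", "u", "U", "Lu", "LU", "uL", "UL":
            return true;
        default:
            return false;
    }
}

void main()
{
    auto ok = scanNumber("5Lu;");
    assert(ok.digits == "5" && ok.suffix == "Lu");
    assert(isValidIntegerSuffix(ok.suffix));

    // `Foo!5Luvar` from the patch comments: the whole run "Luvar" is the
    // candidate suffix, so it is rejected instead of being split into two tokens.
    auto bad = scanNumber("5Luvar;");
    assert(bad.suffix == "Luvar");
    assert(!isValidIntegerSuffix(bad.suffix));
}
```

Because the token is never split, supporting a new suffix later would only mean extending the list in phase 2.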
That's pretty smart!
Well, you would be breaking code now without adding new literal suffixes. That being said, I'd much rather add an error than a warning. Warnings are bad.
This is for dlang/dmd#15339. I have ignored the ImaginarySuffix FloatLiteral variants, as they are deprecated.
@tim-dlang The grammar you linked for floating-literal seems to require either a That grammar change doesn't try to disallow a hex literal from having an identifier immediately following it, because:
So if the spec pull is OK, I need to update this pull to remove the hex literal warning and make the others errors again.
compiler/src/dmd/lexer.d
Outdated
@@ -1972,6 +1972,13 @@ class Lexer
case 'd':
t.postfix = *p;
p++;
// diagnose e.g. `@r"_"dtype var;`
if (!Ccompile && (isidchar(*p) || *p & 0x80)) |
Checking for *p & 0x80 will also produce a warning for Unicode line and paragraph separators. I don't know if anybody uses them, but they are currently allowed: https://dlang.org/spec/lex.html#end_of_line
Foo!q{foo}c
c;
Foo!q{foo}b
b;
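To make the false positive concrete, here is a small sketch of that check applied to the byte that follows a string postfix when the line ends in U+2028. The helper name `flaggedByFirstRevision` is hypothetical; it mirrors the condition in the diff, on the assumption that `isidchar` accepts ASCII letters, digits and underscore.

```d
import std.ascii : isAlphaNum;

/// The condition from the first revision of the patch, applied to one
/// UTF-8 code unit (hypothetical helper; the patch inlines this test).
bool flaggedByFirstRevision(char c)
{
    return isAlphaNum(c) || c == '_' || (c & 0x80) != 0;
}

void main()
{
    // A postfixed token string, then U+2028 (line separator), then the next line.
    string src = "q{foo}c\u2028c;";
    char next = src[7]; // first code unit after the `c` postfix

    assert(next == 0xE2);                 // U+2028 is encoded as E2 80 A8 in UTF-8
    assert(flaggedByFirstRevision(next)); // flagged, although this is an end of line
    assert(!flaggedByFirstRevision('\n')); // an ASCII newline is not flagged
}
```

Any non-ASCII character has the high bit set in its first UTF-8 byte, so the test cannot tell a Unicode identifier character from a Unicode line ending without decoding.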
Yes,
I think it would be more consistent if hex literals behave the same. Consider the following example:
Depending on the font, you may not easily see if it is |
e12bd3d to 0a2e8d0
@tim-dlang I'm going to focus just on suffixed literals for this pull. Also, good point about the Unicode whitespace characters; I think I'll just drop the Unicode detection.
0a2e8d0 to 7a8459c
Also allow digit after string postfix or numeric suffix.
This could cause a false positive for unicode line endings.
9681e65 to ce69adf
This is ready to go now.
I'll ask what Walter thinks of this.
I note that the way this is implemented, we can never add any additional suffix characters. I suggest instead that the check should be for any suffixes that are not valid suffixes.
@@ -1973,6 +1973,13 @@ class Lexer
case 'd':
t.postfix = *p;
p++;
// disallow e.g. `@r"_"dtype var;`
if (!Ccompile && isalpha(*p)) |
This code is already in non-Ccompile land.
if (!Ccompile && isalpha(*p))
{
const loc = loc();
error(loc, "identifier character cannot follow string `%c` postfix without whitespace", |
I'd say "invalid suffix character %c", because other syntax is not an issue.
break;
default:
// disallow e.g. `Foo!5Luvar;`
if (!Ccompile && flags >= FLAGS.unsigned && isalpha(*p)) |
Don't need Ccompile check or flags check
continue;
default:
break;
break LIntegerSuffix; |
this loop seems more complicated than necessary
gotSuffix = true;
}
// disallow e.g. `Foo!5fvar;`
if (!Ccompile && gotSuffix && isalpha(*p)) |
don't think it would be a problem if Ccompile was true
don't think gotSuffix is needed, just check for invalid suffix alpha
This code is fairly out of sync with the current lexer. Please rebase.
…tiations