Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variables should resolve before keywords. #31

Open
widavies opened this issue Aug 24, 2020 · 2 comments
Open

Variables should resolve before keywords. #31

widavies opened this issue Aug 24, 2020 · 2 comments
Labels
bug Indicates an unexpected problem or unintended behavior

Comments

@widavies
Copy link
Contributor

widavies commented Aug 24, 2020

Great library, one small issue I'm having though. It seems like you resolve built-in keywords/functions before the variables (from the ContextModel) are resolved. This yields a couple of problems.

If I bind a variable name like score, the library seems to be trying to resolve e and subsequently doesn't properly resolve score (as I presume its then trying to resolve scor?). This happens whenever any variable name contains a built-in keyword. Other examples of invalid variable names:

variable
ceiling
resin

Basically anything that contains e or a built-in function like sin, ceil, etc.

Also, it should be noted in my use case that variables can be named anything (my users define variables).

So for example, a variable name could be something like qw%d. This is more of a special case, and not really needed, but it would be nice if your code was smart enough to tell that this is probably a variable name and not a wd and d separated by a modulus. Although, if you're resolving variables first, this actually should be a fairly trivial case to handle.

Thanks! I can also take a look at fixing this if you point me in a right direction / have any pointers.

@fkleon fkleon added the bug Indicates an unexpected problem or unintended behavior label Aug 24, 2020
@fkleon
Copy link
Owner

fkleon commented Aug 24, 2020

Thanks for the report, you're right that this is unintentional. It's is an issue with the parser and somewhat related to #10.

For example, score is currently parsed as e(scor):

$ dart bin/interpreter.dart
[..]
score
> Tokens: [(VAR: scor), (EFUNC: e)]
> RPN: [(VAR: scor), (EFUNC: e)]
> Expression: e(scor)

The issue is in the Lexer#tokenize function in lib/src/parser.dart which processes the input as individual characters first, and prioritises known keywords before considering longer variable or function names. This leads to situations where parts of word tokens are parsed incorrectly.

I think this could be improved fairly easily by simply avoiding splitting up coherent word tokens (e.g. potential variable/function names). The qw%d case is a bit trickier as that contains the modulus operator. I'd be inclined to restrict legal variable names to alphanumeric characters only, and probably also requiring the first character to be alphabetic to avoid confusion with number literals.

Feel free to have a stab at this, and let me know if you have any further questions.

On a side note, I've started rewriting the parser to use a grammar-based approach as described in this comment. The WIP code is on the feature-parser-rewrite branch but it's still missing support for parsing functions, and I'm not sure when I'll get around to finishing that off. I'd be happy to accept a contribution to either the existing or new WIP parser code.

@widavies
Copy link
Contributor Author

For now, I may just use a workaround of using a regex replace for all variables instead of using the ContextModel to bind variables. I may try to work on a temporary fix, but I'm rather pressed for time and may just wait for your feature-parser-rewrite. No rush. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

No branches or pull requests

2 participants