Feat/compiler #483

Open
wants to merge 89 commits into
base: main

happytomatoe
Contributor

@happytomatoe happytomatoe commented Oct 3, 2024

Improved Jack compiler. Under the hood it uses ANTLR.
List of improvements:

  • Added more error messages
  • Closes "Jack compiler is lagging with big files". Of course, we can speed it up further by calling the compiler only once per keystroke. As the screenshot shows, one keystroke currently calls the compiler twice.
    Screenshot from 2024-10-03 17-39-47
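
One possible way to avoid the duplicate compiler call is to remember the last input and skip recompilation when nothing changed. This is only a minimal sketch: `Files`, `CompileFn`, `compileOnce`, and `countingCompile` are hypothetical names made up for illustration, not code from this PR.

```typescript
// Hypothetical types: a map of file name -> Jack source, and a compile function.
type Files = Record<string, string>;
type CompileFn = (files: Files) => Record<string, string>;

// Wraps a compile function so that repeated calls with identical input
// reuse the previous result instead of compiling again.
function compileOnce(compile: CompileFn): CompileFn {
  let lastKey: string | undefined;
  let lastResult: Record<string, string> = {};
  return (files) => {
    // Sort the file names so the key does not depend on insertion order.
    const key = JSON.stringify(
      Object.keys(files).sort().map((name) => [name, files[name]]),
    );
    if (key !== lastKey) {
      lastKey = key;
      lastResult = compile(files);
    }
    return lastResult;
  };
}

// Stand-in compiler that only counts how often it is invoked.
let calls = 0;
const countingCompile: CompileFn = (files) => {
  calls += 1;
  const out: Record<string, string> = {};
  for (const name of Object.keys(files)) out[name] = "// vm code";
  return out;
};

const compile = compileOnce(countingCompile);
compile({ Main: "class Main {}" });
compile({ Main: "class Main {}" }); // same input: cached result is reused
compile({ Main: "class Main { }" }); // changed input: compiled again
console.log(calls);
```

Serializing every file on each call has its own cost for large projects, so a real fix would more likely remove the second call at its source.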

Cons:

  • I've removed the interfaces for built-in classes. However, we don't have a validation rule that checks whether you are supplying the right arguments (types) when calling OS classes.
  • If we want to use advanced ANTLR features, it would be better to switch to the JS target.

Don't be scared by the 13k LOC. Most of it is tests and test files. The src changes are around 1k LOC. This doesn't include the generated files.

You can check it out on https://happytomatoe.github.io/web-ide/compiler

TODO:

  • Test lexer errors
  • Test parser
  • Test validations
  • Fix error message if class name doesn't match file name
  • Validation doesn't work for empty function without return
  • Fix bug connected to an error when playing Pong

@happytomatoe
Contributor Author

happytomatoe commented Oct 11, 2024

> @happytomatoe I hope you're able to keep making progress on this, but I wanted to let you know that I'm unavailable for the coming week. I'll be back October 20th to review any changes.

Have a nice holiday if it's a holiday)

@happytomatoe
Contributor Author

Small update - I was fed up with working with the official antlr4 TypeScript target, so I migrated to antlr4ng.

@DavidSouther
Collaborator

> Small update - I was fed up with working with the official antlr4 TypeScript target, so I migrated to antlr4ng.

Have you looked at Tree-sitter at all? (A question, not a suggestion, but I hear great things about it, and it might be where I'd start if I were starting from scratch today.)

Collaborator

@DavidSouther DavidSouther left a comment

Looking great, these are mostly optional thoughts at this point, up to you to incorporate them now or hold off a bit.

) {
state.files[name] = content;
state.isCompiled = false;
this.compile(state);
},
Collaborator

Ok. Please do your best in the meantime to keep this as tidy as you can.

@@ -11,6 +11,7 @@ export interface SubroutineInfo {
type: SubroutineType;
localVarsCount?: number;
}

export type GlobalSymbolTable = Record<string, GenericSymbol>;
Collaborator

I see in several places this is used specifically as a map, rather than an ad-hoc object. Should this be Map<string, GenericSymbol>? That would enforce checking the possibly undefined .get result, and would also make it easier to check .size instead of Object.keys(...).length.

(Edit: this is the current interface with the compiler, so that would also need to change. Agreed to do that later.)
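
To make the suggestion concrete, here is a minimal sketch comparing the two shapes. The `GenericSymbol` interface below is a simplified stand-in, not the type from this PR, and the `filename` field is an assumption made for illustration.

```typescript
// Simplified stand-in for the PR's GenericSymbol; the real type may differ.
interface GenericSymbol {
  filename?: string;
}

const asRecord: Record<string, GenericSymbol> = {
  Main: { filename: "Main.jack" },
};
const asMap = new Map<string, GenericSymbol>([
  ["Main", { filename: "Main.jack" }],
]);

// Record: the size has to be derived from the keys, and a lookup is typed as
// GenericSymbol even when the key is absent (unless noUncheckedIndexedAccess
// is enabled in tsconfig).
const recordSize = Object.keys(asRecord).length;

// Map: .size is built in, and .get is typed as GenericSymbol | undefined,
// so the missing-key case has to be handled before the value is used.
const mapSize = asMap.size;
const missing = asMap.get("Screen");
const label =
  missing === undefined ? "unknown class" : (missing.filename ?? "built-in");

console.log(recordSize, mapSize, label);
```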

@@ -22,13 +22,14 @@
"@nand2tetris/projects": "^1.0.0",
"@nand2tetris/runner": "^1.0.0",
"@types/node": "^20.14.2",
"antlr4": "^4.13.2",
"antlr4ng": "^3.0.7",
Collaborator

Does this change need a change in the README?

Contributor Author

@happytomatoe happytomatoe Oct 22, 2024

I've already changed the README. Both of these libraries implement the same API/interfaces.

Contributor Author

This library already has a ton of comments in its TypeScript sources, so I don't know if it makes sense to add anything else in terms of documentation.

Comment on lines +44 to +55
files: Record<string, string>,
cmd: Command,
): Record<string, string | CompilationError> {
if (files.type == "LexerOrParserError") {
throw new Error("Expected tree but got a lexer or parser error");
}
const result: Record<string, string | CompilationError> = {};
for (const name of Object.keys(files)) {
result[name] = "";
}
const trees: Record<string, ProgramContext> = {};
const errors: Record<string, CompilationError> = {};
Collaborator

For most of these - are these Objects or Maps? (It's OK if they're records, but please leave a comment if that's a specific design choice, and why Record is preferred to Map)

Contributor Author

These are maps. I will change them.

Contributor Author

@DavidSouther Why choose Map in these cases?

Contributor Author

@happytomatoe happytomatoe Oct 22, 2024

After doing a bit of research on this topic, I don't see any substantial difference other than a slightly nicer API.

return errors;
}
const validateTree = treeOrErrors as ProgramContext;
const vmWriter = new VMWriter(
Collaborator

Ahh, I added a comment earlier about Record vs Map, but VMWriter wants a Record. Hmm, maybe that should have been a Map too. Up to you to change to using Maps, or acknowledging the comment and moving on.
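
For anyone weighing the Record vs Map choice later, one behavioral difference that may matter when keys come from user-written identifiers is inherited properties. The following is a general JavaScript illustration under that assumption, not code from this PR:

```typescript
const record: Record<string, string> = {};
const map = new Map<string, string>();

// A plain object inherits from Object.prototype, so some keys appear to
// exist even though nothing was ever stored under them.
const inRecord = "toString" in record; // true
const leaked = typeof record["toString"]; // "function", although typed as string

// A Map only reports keys that were explicitly set.
const inMap = map.has("toString"); // false

// With a Record, an own-property check avoids the false positive.
const ownInRecord = Object.prototype.hasOwnProperty.call(record, "toString"); // false

console.log(inRecord, leaked, inMap, ownInRecord);
```

Creating the object with `Object.create(null)` is another way to avoid the inherited keys while keeping the Record shape.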

simulator/src/jack/antlr.compiler.test.ts (outdated, resolved)
@happytomatoe
Contributor Author

happytomatoe commented Oct 22, 2024

> Small update - I was fed up with working with the official antlr4 TypeScript target, so I migrated to antlr4ng.
>
> Have you looked at Tree-sitter at all? (A question, not a suggestion, but I hear great things about it, and it might be where I'd start if I were starting from scratch today.)

What have you heard specifically? It's a pure C library with language bindings. I haven't worked with that combination. It probably has its own pain points, but I don't know.
I came across this tool when I was building a Prettier plugin for Jack, since it can build a syntax tree (a tree that keeps whitespace and comments).

If you were building this from scratch, then based on my experience the go-to options are:

  • ANTLR - because it's old and mature. There are commercial companies that use this tool, e.g. https://strumenta.com/. There is a lot of documentation and other resources that can help with niche problems. Judging by the main website, it looks like Python used this library before switching to its own implementation.
  • Chevrotain - I've seen the Java Prettier plugin, and probably other Prettier plugins, use Chevrotain. Compared with the previous option it is probably faster (according to their own benchmark), though we probably don't need a faster lexer/parser. I don't know how mature this library is. I've seen a PR where they are implementing a feature from ANTLR. I also don't know whether any commercial companies use it.

@happytomatoe
Contributor Author

From my subjective point of view, testing is done.

simulator/src/jack/antlr.compiler.test.ts (outdated, resolved)
simulator/src/jack/symbol.ts (resolved)
Development

Successfully merging this pull request may close these issues.

[improvement]: Jack compiler is lagging with big files
2 participants