Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(parser): the 1st version of a new combinator style parser #16876

Merged
merged 39 commits into from
May 24, 2024

Conversation

TennyZhuang
Copy link
Contributor

@TennyZhuang TennyZhuang commented May 21, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Introducing a new parser with winnow.

This is the first version of the new parser, with only the datatype parsing. However, it's also the most complicated part of our Parser. Due to our wrong design previously, we use < and > to define struct type, then introduce a necessary state angle_number during parsing. Fortunately, the new framework handle it perfectly.

winnow-rs is a parser-combinator crate, and it's forked by me for a simple patch winnow-rs/winnow#525.

The new framework can progressively replace the original parser, see Parser::parse_v2 for detail.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@TennyZhuang TennyZhuang changed the title feat(parser): the 1st version of a new parser-combinator style parser feat(parser): the 1st version of a new combinator style parser May 21, 2024
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
@TennyZhuang TennyZhuang marked this pull request as ready for review May 23, 2024 08:48
@TennyZhuang TennyZhuang requested a review from a team as a code owner May 23, 2024 08:48
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
@TennyZhuang
Copy link
Contributor Author

parser_test and planner_test passed

@TennyZhuang TennyZhuang requested a review from xxchan May 23, 2024 09:52
@wangrunji0408
Copy link
Contributor

wangrunji0408 commented May 23, 2024

Can we have a better error display with this new parser framework? if it provides span in some way. 🤔

screenshot.png

taken from #15091

@TennyZhuang
Copy link
Contributor Author

Can we have a better error display with this new parser framework? if it provides span in some way. 🤔

It's possible, but must after we finish the whole refactor.

Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Copy link
Member

@BugenZhao BugenZhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attempt in this PR is truly exciting. The combinator style makes the code more readable and maintainable (although cut_err could be a bit subtle 🤯). Looking forward to seeing the user-facing benefits (like better error reporting) after the refactoring.

src/sqlparser/src/parser_v2/mod.rs Show resolved Hide resolved
src/sqlparser/src/parser_v2/mod.rs Outdated Show resolved Hide resolved
src/sqlparser/src/parser_v2/number.rs Outdated Show resolved Hide resolved

delimited(
Token::LParen,
cut_err(literal_uint.try_map(move |v| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need cut_err here, or shall we add cut_err to all callers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

float is a valid data type, while float( not. So we must cut here.

src/sqlparser/src/parser_v2/impl_.rs Show resolved Hide resolved
src/sqlparser/src/parser_v2/data_type.rs Outdated Show resolved Hide resolved
src/sqlparser/src/parser_v2/data_type.rs Outdated Show resolved Hide resolved
Comment on lines +201 to +216
let msg = if let Some(e) = e.into_inner()
&& let Some(cause) = e.cause()
{
format!(": {}", cause)
} else {
"".to_string()
};
ParserError::ParserError(format!(
"Unexpected {}{}",
if self.index + token_stream.location() >= self.tokens.len() {
&"EOF" as &dyn std::fmt::Display
} else {
&self.tokens[self.index + token_stream.location()] as &dyn std::fmt::Display
},
msg
))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use ParseError here for better error reporting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which ParserError? If you mean v2, I'd prefer give user a consistent user experience now.

TennyZhuang and others added 5 commits May 24, 2024 14:16
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
Signed-off-by: TennyZhuang <[email protected]>
@xxchan
Copy link
Member

xxchan commented May 24, 2024

The combinator style makes the code more readable and maintainable

Why do you think the combinator is more maintainable? I think now everyone can work with the parser without learning, but not afterwards.

@TennyZhuang
Copy link
Contributor Author

Why do you think the combinator is more maintainable? I think now everyone can work with the parser without learning, but not afterwards.

Let's use Golang to rewrite RisingWave 😄

Signed-off-by: TennyZhuang <[email protected]>
Copy link
Member

@xxchan xxchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rubber stamp

@TennyZhuang TennyZhuang added this pull request to the merge queue May 24, 2024
Merged via the queue into main with commit 1aa8579 May 24, 2024
27 of 28 checks passed
@TennyZhuang TennyZhuang deleted the parser_v2_take_2 branch May 24, 2024 11:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants