parsing: option to [end] to terminate parsing even if there is further input #684

Open
wezm opened this issue Jun 5, 2024 · 1 comment
Labels
A-parsing Area: parsing C-feature-request Category: a new feature (not already implemented)

Comments

wezm commented Jun 5, 2024

I use time in my rsspls project (thanks!). It's a tool that uses CSS selectors to extract parts of web pages and build an RSS feed from them. time is used for parsing dates that will become the published date of the RSS item. In wezm/rsspls#46 the element in the HTML that contains the date actually has two dates in it like this:

<td><td tabindex="0" role="cell" class="periodo-pubblicazione date">31/05/2024<br>  15/06/2024</td>

Which is "31/05/2024 15/06/2024" when extracted. We'd like to be able to parse just the first date. This is similar in nature to #471, but my idea is to add a modifier to the end component that allows it to match even when not all of the input has been consumed. This would allow using a format description like: [day padding:zero]/[month padding:zero]/[year][end eof:false]

I'd be open to implementing this if it seems reasonable.

@jhpratt jhpratt added C-feature-request Category: a new feature (not already implemented) A-parsing Area: parsing labels Jun 18, 2024
Member

jhpratt commented Jun 18, 2024

So…I definitely get where you're coming from. Any implementation of this would necessarily be a new method rather than a modifier on [end]. The reason for this is a tad involved, but I'll try to simplify as much as possible. After some layers of indirection for ergonomics, calls to parse end up calling Sealed::parse (in parsable.rs). This is ultimately where the behavior you want to change lives: the value is parsed successfully, but the call fails because there is remaining input. Any [end] modifier is long gone by the time this situation is encountered.
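To make the layering concrete, here is a minimal std-only sketch of the idea. It is not time's actual code. `SimpleDate`, `digits`, `literal`, `parse_prefix` and `parse_all` are all hypothetical names invented for illustration. It only models a lenient layer that returns the unparsed remainder and a strict wrapper that rejects it, and it does no range validation.

```rust
/// Hypothetical stand-in for a parsed date; not a type from `time`.
#[derive(Debug, PartialEq)]
struct SimpleDate {
    day: u8,
    month: u8,
    year: u16,
}

/// Consume exactly `n` ASCII digits from the front of `input`.
fn digits(input: &[u8], n: usize) -> Option<(u32, &[u8])> {
    if input.len() < n || !input[..n].iter().all(u8::is_ascii_digit) {
        return None;
    }
    let value = input[..n]
        .iter()
        .fold(0u32, |acc, b| acc * 10 + u32::from(*b - b'0'));
    Some((value, &input[n..]))
}

/// Consume a single literal byte from the front of `input`.
fn literal(input: &[u8], byte: u8) -> Option<&[u8]> {
    match input.split_first() {
        Some((&first, rest)) if first == byte => Some(rest),
        _ => None,
    }
}

/// Lenient layer: parse `dd/mm/yyyy` from the front of `input` and
/// hand back whatever was not consumed. No calendar validation is done.
fn parse_prefix(input: &[u8]) -> Option<(SimpleDate, &[u8])> {
    let (day, rest) = digits(input, 2)?;
    let rest = literal(rest, b'/')?;
    let (month, rest) = digits(rest, 2)?;
    let rest = literal(rest, b'/')?;
    let (year, rest) = digits(rest, 4)?;
    let date = SimpleDate {
        day: day as u8,
        month: month as u8,
        year: year as u16,
    };
    Some((date, rest))
}

/// Strict layer: same parse, but any leftover input is an error.
/// The check happens here, after the format items have already been applied.
fn parse_all(input: &[u8]) -> Option<SimpleDate> {
    let (date, rest) = parse_prefix(input)?;
    if rest.is_empty() {
        Some(date)
    } else {
        None
    }
}
```

With the string from the issue, `parse_prefix(b"31/05/2024 15/06/2024")` yields the first date along with the remainder `" 15/06/2024"`, while `parse_all` on the same input returns `None`. In this model the trailing-input check sits outside the item-by-item parsing, which is why a flag on an individual item would not reach it.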

Right now, the only way to approach this is to go through the Parsed struct directly. For example,

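use time::{macros::format_description, parsing::Parsed, Date};

let mut parsed = Parsed::new();
// `input` is assumed to be the extracted text (a &str); `parse_items` returns the unparsed remainder.
let remaining = parsed.parse_items(input.as_bytes(), format_description!("[day]/[month]/[year]"))?;
let value: Date = parsed.try_into()?;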

This is typed off hand, and naturally relies on some assumptions. You're experienced with Rust, so I trust you're able to figure that much out. Even within time, this is the approach that would need to be taken. I'm not necessarily opposed to having something more ergonomic, but I don't think it's trivial either (a new method isn't ideal).
