Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quoted first header in a stream with a UTF-8 BOM is not handled nicely #14

Open
airbreather opened this issue Jun 19, 2019 · 1 comment

Comments

@airbreather
Copy link
Owner

The visitor optionally lets us ignore a UTF-8 BOM on the first header field if present, however, if that field starts with a double-quote, then the tokenizer will fail to treat it as quoted.

The good news is that with the new model I'm doing for #12, all the CsvInput implementations could be the ones that optionally ignore a UTF-8 BOM if present.

airbreather added a commit that referenced this issue Jun 22, 2019
Related to #14 although it doesn't quite resolve the issue.
@airbreather
Copy link
Owner Author

At the time of this comment, the top result for a Google search for the term "FEFF", on its own, is a blog post of someone having this exact same problem with Ruby's CSV parser. Cursively is in good company.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant