Quoted first header in a stream with a UTF-8 BOM is not handled nicely #14

airbreather · 2019-06-19T13:52:32Z

The visitor optionally lets us ignore a UTF-8 BOM on the first header field, if present. However, if that field starts with a double quote, the tokenizer fails to treat it as quoted, because the BOM bytes come before the quote and the quote is no longer the first thing in the field.

The good news is that with the new model I'm doing for #12, all the CsvInput implementations could be the ones that optionally ignore a UTF-8 BOM if present.
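To make the failure mode concrete, here is a minimal sketch in Python rather than Cursively's actual code. The names `first_field_is_quoted` and `skip_leading_bom` are hypothetical, and the "tokenizer" is a toy that only assumes the rule described above: a field counts as quoted only when its very first byte is a double quote.

```python
UTF8_BOM = b"\xef\xbb\xbf"  # the UTF-8 encoding of U+FEFF


def first_field_is_quoted(data: bytes) -> bool:
    """Toy stand-in for the tokenizer rule (an assumption, not Cursively's code):
    the first field is quoted only if its first byte is a double quote."""
    return data[:1] == b'"'


def skip_leading_bom(data: bytes) -> bytes:
    """Hypothetical input-level fix: drop a leading UTF-8 BOM if present,
    otherwise return the data unchanged."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


raw = UTF8_BOM + b'"name","value"\r\n'

# With the BOM still in place, the quote is the fourth byte, so the toy
# tokenizer does not see a quoted field.
quoted_with_bom = first_field_is_quoted(raw)

# Stripping the BOM before tokenizing puts the quote back at the start.
quoted_after_skip = first_field_is_quoted(skip_leading_bom(raw))
```

Doing the skip in the input layer, before any bytes reach the tokenizer, is the idea behind moving the option into the CsvInput implementations; stripping the BOM later, at the visitor, is too late because the quoting decision has already been made.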

airbreather · 2019-06-22T12:38:31Z

At the time of this comment, the top result for a Google search for the term "FEFF", on its own, is a blog post of someone having this exact same problem with Ruby's CSV parser. Cursively is in good company.

airbreather added a commit that referenced this issue Jun 22, 2019

Inputs can all optionally ignore leading BOM.

c4d0f79

Related to #14 although it doesn't quite resolve the issue.

airbreather added the "binary breaking change", "source breaking change", and "bug" labels Jul 22, 2019

airbreather added this to the 2.0.0 milestone Jul 22, 2019

airbreather mentioned this issue Jul 22, 2019

Wishlist for breaking changes #13

Closed
