Merge: Simplify CSV
This PR rewrites the CSV library. It should now be easier to use, and it should no longer fail because `\r\n` was the default end of line: the default is now a single `\n` character.

The `CsvFormat` class has been removed because it added more complexity than it was worth. The separator, eol and delimiter can now be set independently after creating a `CsvDocument`, a `CsvReader` or a `CsvWriter`.

Concerning performance, the new parser is much faster than the old one.
On a simple 4 MiB file, parsing used to take 2.401s.
With the new parser, the measured user time is 0.179s, an improvement by a factor of ~13 (2.401 / 0.179 ≈ 13.4).

Old code sample
~~~nit
import csv

var fl = new FileReader.open(args[0])
var rd = new CsvReader.with_format(fl, new CsvFormat('"', ',', "\r"))

var lns = new Array[Array[String]]
for i in rd do lns.add i
~~~

New code sample
~~~nit
import csv

var rd = new CsvReader.from_string(args[0].to_path.read_all)
rd.eol = "\r"
rd.read_all
~~~
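
The independent settings described above can be sketched as follows. This is a minimal illustration, not code taken from the PR: it assumes the attributes are named `separator`, `delimiter` and `eol` as in the description, that the first two are `Char` and the last a `String` (as the old `CsvFormat` arguments suggest), and the inline input is made up for the example.

~~~nit
import csv

# Hypothetical input: semicolon-separated fields, one record per line.
var rd = new CsvReader.from_string("a;b;c\n1;2;3\n")

# Each setting is changed on its own, without building a format object.
# Attribute names are assumed from the PR description.
rd.separator = ';'
rd.delimiter = '"'
rd.eol = "\n"

rd.read_all
~~~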

Pull-Request: #2048
Reviewed-by: Alexis Laferrière <alexis.laf@xymus.net>
Reviewed-by: Jean Privat <jean@pryen.org>
privat committed May 3, 2016
2 parents 6578e32 + d357b77 commit 31ed461
Showing 16 changed files with 307 additions and 496 deletions.
3 changes: 3 additions & 0 deletions lib/core/stream.nit
@@ -417,6 +417,9 @@ abstract class Writer
# Write a single byte
fun write_byte(value: Byte) is abstract

+# Writes a single char
+fun write_char(c: Char) do write(c.to_s)

# Can the stream be used to write
fun is_writable: Bool is abstract
end
2 changes: 1 addition & 1 deletion lib/core/text/abstract_text.nit
@@ -1756,7 +1756,7 @@ redef class Char
end

# Length of `self` in a UTF-8 String
-private fun u8char_len: Int do
+fun u8char_len: Int do
var c = self.code_point
if c < 0x80 then return 1
if c <= 0x7FF then return 2