Issue with CSV: Unterminated quoted field at end of CSV line #176

Closed
tobiasschweizer opened this issue Jul 8, 2022 · 8 comments

@tobiasschweizer

tobiasschweizer commented Jul 8, 2022

Hi there,

I am trying to map the following CSV file: https://data.snf.ch/Exportcsv/OutputdataScientificPublication.csv

I am using rmlmapper-6.0.0-r363-all.jar (CLI).

Mapping:

@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix schema: <http://schema.org/>.
@prefix wgs84_pos: <http://www.w3.org/2003/01/geo/wgs84_pos#lat>.
@prefix gn: <http://www.geonames.org/ontology#>.
@prefix carml: <http://carml.taxonic.com/carml/> .
@prefix fnml: <http://semweb.mmlab.be/ns/fnml#> .
@prefix grel: <http://users.ugent.be/~bjdmeest/function/grel.ttl#> .
@prefix fno: <https://w3id.org/function/ontology#> .
@prefix crml: <http://semweb.mmlab.be/ns/rml/condition#> .
@base <http://example.com/ns#>.



<#LogicalSourcePublication> a rml:BaseSource ;
  rml:source <#CSVW_sourcePublication> ;
  rml:referenceFormulation ql:CSV .

<#CSVW_sourcePublication> a csvw:Table;
   csvw:url "OutputdataScientificPublication.csv" ;
   csvw:dialect [ a csvw:Dialect;
       csvw:delimiter ";"
   ] .

### Publications

<#PublicationMapping> a rr:TriplesMap;
  rml:logicalSource <#LogicalSourcePublication> ;

  rr:subjectMap [
    rr:template "http://snf.ch/publication/{ScientificPublicationId}" ;
    rr:class schema:ScholarlyArticle
  ] .
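For readers less familiar with RML, the mapping above amounts to roughly this: read the file with `;` as the delimiter and emit one `rdf:type schema:ScholarlyArticle` triple per row, with the subject IRI built from the `ScientificPublicationId` column. The following is a minimal Python sketch of that idea, not how rmlmapper is implemented. The function name `subject_triples` and the sample rows are made up for illustration, and the percent-encoding only approximates the IRI-safe encoding that R2RML templates use.

```python
import csv
import io
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHOLARLY_ARTICLE = "http://schema.org/ScholarlyArticle"
TEMPLATE = "http://snf.ch/publication/{ScientificPublicationId}"


def subject_triples(csv_file):
    """Yield one N-Triples line per CSV row (hypothetical helper)."""
    # The dialect in the mapping sets ";" as the delimiter.
    reader = csv.DictReader(csv_file, delimiter=";")
    for row in reader:
        pub_id = row.get("ScientificPublicationId")
        if not pub_id:
            # No value for the template column: no subject, so no triple.
            continue
        # Assumption: percent-encode everything except unreserved characters.
        subject = TEMPLATE.replace(
            "{ScientificPublicationId}", quote(pub_id, safe="")
        )
        yield f"<{subject}> <{RDF_TYPE}> <{SCHOLARLY_ARTICLE}> ."


# Made-up sample data; only the column name comes from the mapping.
sample = io.StringIO(
    "ScientificPublicationId;Title\n"
    '101;"A title; with a semicolon"\n'
    "102;Another title\n"
)
triples = list(subject_triples(sample))
for triple in triples:
    print(triple)
```

The quoted title in the first sample row contains a `;`, which is exactly the kind of value that depends on the quotes being read correctly.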

I get the following error message:

17:14:36.117 [main] ERROR be.ugent.rml.cli.Main .main(404) - Unterminated quoted field at end of CSV line. Beginning of lost text: [ Rutter, G. A. ;;;4182;PubMed;;;4194;;0;Peer-reviewed;;The Journal of clinical investigation;;Pub...]

This is the last line of the CSV.

The strange thing is that I can copy this exact line to a small test file and then it works: OutputdataScientificPublication_test.csv

Could this be an OpenCSV issue? See https://stackoverflow.com/questions/70976734/csvmalformedlineexception-unterminated-quoted-field-at-end-of-csv-line and https://stackoverflow.com/questions/70347745/unterminated-quoted-field-at-end-of-csv-line-beginning-of-lost-text. However, the quotes seem OK (every opening quote has a closing one).
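One way to check the quoting over the whole file, not only the last line, is to look for lines with an odd number of quote characters. CSV parsers typically keep reading after an unclosed quote until they hit the end of the file, so the error can be reported at the last line even when the stray quote sits much earlier. The sketch below is a heuristic only: the function name `lines_with_odd_quotes` and the sample rows are made up, and a legitimate quoted field that spans several lines would also be flagged.

```python
def lines_with_odd_quotes(lines, quotechar='"'):
    """Return 1-based numbers of lines with an odd count of quote characters.

    Doubled quotes ("") inside a quoted field add two to the count, so a
    well-formed single-line record always has an even count.
    """
    return [
        number
        for number, line in enumerate(lines, start=1)
        if line.count(quotechar) % 2 == 1
    ]


# Made-up sample: the third line opens a quote that is never closed.
sample = [
    "Id;Title;Authors\n",
    '1;"A ""quoted"" title";Smith\n',
    '2;"Broken title;Jones\n',
    '3;"Fine";Rutter, G. A.\n',
]
suspicious = lines_with_odd_quotes(sample)
print(suspicious)
```

To run it on a real file, pass the open file object instead of the list, for example `lines_with_odd_quotes(open("OutputdataScientificPublication.csv", encoding="utf-8"))`; the encoding here is an assumption.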

Maybe that is also related to #173 (quoted values).

Thanks a lot for your help.

@tobiasschweizer
Author

@DylanVanAssche Do you think this could be an issue in the OpenCSV library? Could the size of the CSV file play a role?

@AronBuzogany
Contributor

@tobiasschweizer I get an error when opening the link you provided: https://data.snf.ch/Exportcsv/OutputdataScientificPublication.csv

Do you still have the failing example?

@tobiasschweizer
Author

Hi @AronBuzogany,

Thanks for looking into this. Unfortunately, I do not have the original data anymore. In any case, I think the CSV was fine since it worked with CARML.

Here is the link (they changed the portal): https://data.snf.ch/exportcsv/OutputdataScientificPublication.csv

@AronBuzogany
Contributor

Thanks for your help, @tobiasschweizer. I have just run your current data through the current development version of rmlmapper and everything seems to work fine. In any case, the error your issue reports did not occur with this data.

@tobiasschweizer
Author

Back then, I had the impression that it could be related to memory. Have you updated the OpenCSV version since I filed the issue?

@AronBuzogany
Contributor

Yes, I tested your issue on the development branch. There we no longer use OpenCSV, but rather a library that uses less memory and is much faster. So this issue will probably be fixed in the new release.

@tobiasschweizer
Author

That's good news, great!

@DylanVanAssche
Contributor

> So this issue will probably be fixed in the new release.

We have already released it, so I will close this issue. If you encounter more problems, feel free to create more issues. Thanks!
