fastText format for text classification #9
Comments
Hi! I've started working on this one. I'm also going to use the label metadata in order to get the label names. Would that be all right?
I agree with you. Where and how should the label metadata be passed in?
I have a couple of ideas, but here's what comes to mind. Personally, as a user, I would prefer to use a class method on each dataset. Instead of

dataset = read_jsonl(filepath='example.jsonl', dataset=NERDataset, encoding='utf-8')

I would suggest using the class directly:

dataset = NERDataset.from_jsonl(filepath='example.jsonl', encoding='utf-8')

and, when it comes to text classification:

dataset = TextClassificationDataset.from_jsonl(annotations_filepath='example.jsonl', labels_filepath='project_1_labels.jsonl', encoding='utf-8')

Here labels_filepath would be optional, because without the label metadata file the annotations could still be converted with the label id appended (plus a warning for information), like that:

If you decide to stay with the current implementation, the labels path could be passed either as

Let me know what you think.
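To make the proposal above concrete, here is a minimal sketch of how such a class method could behave. It is not the project's actual implementation. The record shapes are assumptions: each annotation line is taken to look like {"text": ..., "labels": [<label id>, ...]} and each label metadata line like {"id": ..., "text": ...}. The to_fasttext method name is hypothetical as well. The only part taken from fastText itself is its default `__label__` prefix.

```python
import json
import warnings
from typing import Dict, Iterator, List, Optional


class TextClassificationDataset:
    """Hypothetical sketch of the proposed API, not the library's real class."""

    def __init__(self, examples: List[dict],
                 label_names: Optional[Dict[int, str]] = None):
        self.examples = examples
        self.label_names = label_names

    @staticmethod
    def _read_jsonl(filepath: str, encoding: str) -> List[dict]:
        # One JSON object per non-empty line.
        with open(filepath, encoding=encoding) as f:
            return [json.loads(line) for line in f if line.strip()]

    @classmethod
    def from_jsonl(cls, annotations_filepath: str,
                   labels_filepath: Optional[str] = None,
                   encoding: str = "utf-8") -> "TextClassificationDataset":
        examples = cls._read_jsonl(annotations_filepath, encoding)
        label_names = None
        if labels_filepath is None:
            # Without label metadata we can still convert, using raw ids.
            warnings.warn("No label metadata given; falling back to label ids.")
        else:
            # Assumed metadata shape: {"id": 1, "text": "positive"}.
            rows = cls._read_jsonl(labels_filepath, encoding)
            label_names = {row["id"]: row["text"] for row in rows}
        return cls(examples, label_names)

    def to_fasttext(self) -> Iterator[str]:
        # fastText expects one example per line: "__label__<name> ... <text>".
        for example in self.examples:
            tags = []
            for label_id in example["labels"]:
                name = label_id
                if self.label_names is not None:
                    name = self.label_names.get(label_id, label_id)
                # Labels must not contain whitespace in fastText format.
                tags.append("__label__" + str(name).replace(" ", "_"))
            # Collapse newlines and repeated spaces so each example is one line.
            text = " ".join(example["text"].split())
            yield " ".join(tags + [text])
```

With a labels file, an annotation carrying label id 1 would come out tagged with that label's name. Without it, the same annotation would be tagged with the id 1 and a warning would be emitted, which matches the optional behaviour suggested above.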
Example: