tokenizer.model_max_length is incorrect #51

Open

mh-northlander opened this issue Jan 31, 2023 · 1 comment

Collaborator

mh-northlander commented Jan 31, 2023

The model_max_length parameter of the chiTra tokenizer seems far too large:

> sudachitra.sudachitra.BertSudachipyTokenizer.from_pretrained("chiTra-1.0").model_max_length
1000000000000000019884624838656
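
For context, the reported number is exactly int(1e30). As far as I know, that is the placeholder Hugging Face transformers falls back to when no model_max_length is configured. This is an assumption about the library's internals and is not stated in this issue. The helper name `is_unset_max_length` below is hypothetical and only illustrates how one might detect the unset case:

```python
# The value reported by the tokenizer in this issue.
reported = 1000000000000000019884624838656

# Assumption: transformers uses int(1e30) as its "no limit configured" default.
sentinel = int(1e30)


def is_unset_max_length(value: int) -> bool:
    """Return True when value looks like the 'not configured' placeholder."""
    return value >= sentinel


# Compare the reported value against the assumed placeholder.
print(reported == sentinel)
print(is_unset_max_length(reported))
```

If the comparison holds, the tokenizer is most likely not reporting a real limit but the absence of one.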

Collaborator Author

mh-northlander commented Mar 6, 2023

Adding "model_max_length": 512 to the tokenizer_config.json will solve this problem.
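
A minimal sketch of applying that fix by hand to a locally downloaded model directory. The function name `set_model_max_length`, the placeholder key `example_key`, and the temporary file standing in for the real tokenizer_config.json are all assumptions for illustration; only the "model_max_length": 512 entry comes from this thread:

```python
import json
import tempfile
from pathlib import Path


def set_model_max_length(config_path, max_length=512):
    """Add or overwrite "model_max_length" in a tokenizer_config.json file."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["model_max_length"] = max_length
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return config


# Demo on a throwaway file that stands in for the real tokenizer_config.json.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "tokenizer_config.json"
    # "example_key" is a hypothetical existing setting, kept to show it survives.
    demo_path.write_text(json.dumps({"example_key": True}), encoding="utf-8")
    updated = set_model_max_length(demo_path)
    reloaded = json.loads(demo_path.read_text(encoding="utf-8"))

print(reloaded)
```

Pointing `set_model_max_length` at the tokenizer_config.json inside the model directory should leave the other settings untouched and only add the missing entry.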