A tokenizer is in charge of preparing the inputs for a model.
This tokenizer handles mixed Chinese-English text and runs on Linux.
This project mainly solves some Chinese character encoding problems.
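
To illustrate the kind of encoding problem involved, here is a minimal sketch of splitting mixed Chinese-English text, assuming the input is UTF-8. It is not this project's actual implementation: the names `tokenize_mixed`, `utf8_len`, and `is_ascii_word_char` are hypothetical, and the sketch uses only the C++ standard library rather than Boost. ASCII letters and digits are grouped into words, other ASCII bytes act as separators, and each multi-byte character (such as a Chinese ideograph) becomes its own token.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: byte length of the UTF-8 sequence that starts with
// `lead`. Invalid lead bytes are treated as single-byte sequences.
static std::size_t utf8_len(unsigned char lead) {
    if (lead < 0x80) return 1;          // 0xxxxxxx: ASCII
    if ((lead >> 5) == 0x6) return 2;   // 110xxxxx
    if ((lead >> 4) == 0xE) return 3;   // 1110xxxx: most Chinese characters
    if ((lead >> 3) == 0x1E) return 4;  // 11110xxx
    return 1;                           // stray continuation byte
}

// Hypothetical helper: true for ASCII letters and digits.
static bool is_ascii_word_char(unsigned char c) {
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

// Splits UTF-8 text into tokens: runs of ASCII letters/digits form one token,
// every multi-byte character forms its own token, and all other ASCII bytes
// (spaces, punctuation) are dropped as separators.
std::vector<std::string> tokenize_mixed(const std::string& text) {
    std::vector<std::string> tokens;
    std::string word;
    std::size_t i = 0;
    while (i < text.size()) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if (is_ascii_word_char(c)) {
            word.push_back(text[i]);
            ++i;
            continue;
        }
        // Any other byte ends the current ASCII word.
        if (!word.empty()) {
            tokens.push_back(word);
            word.clear();
        }
        if (c < 0x80) {  // ASCII separator
            ++i;
            continue;
        }
        std::size_t n = utf8_len(c);
        if (i + n > text.size()) n = text.size() - i;  // truncated input
        tokens.push_back(text.substr(i, n));
        i += n;
    }
    if (!word.empty()) tokens.push_back(word);
    return tokens;
}
```

Working on bytes and decoding the sequence length from the lead byte avoids cutting a Chinese character in the middle, which is a common source of garbled output when text is processed one `char` at a time.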
Requirements
- Boost