## Introduction
MeloTTS is a **high-quality multi-lingual** text-to-speech library by [MIT](https://www.mit.edu/) and [MyShell.ai](https://myshell.ai). Some other features include:
- The Chinese speaker supports mixed Chinese and English.
- Fast enough for CPU real-time inference.
- Many thanks to @fakerybakery for adding the Web UI and CLI part.
**Authors**:
- Wenliang Zhao at Tsinghua University
- Xumin Yu at Tsinghua University
- Zengyi Qin at MIT and MyShell

Supported languages include:
| Language | Example |
| --- | --- |
| English (American) | Link |
| English (British) | Link |
| English (Indian) | Link |
| English (Australian) | Link |
| English (Default) | Link |
| Spanish | Link |
| French | Link |
| Chinese (mix EN) | Link |
| Japanese | Link |
| Korean | Link |
The Python API and model cards can be found in this repo or on HuggingFace.
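As a quick reference, here is a minimal usage sketch of the Python API as documented in the upstream MeloTTS repo; the language code `EN`, speaker id `EN-US`, sample text, and output path are illustrative:

```python
from melo.api import TTS

# 'auto' picks the GPU when available, otherwise the CPU.
model = TTS(language='EN', device='auto')
speaker_ids = model.hps.data.spk2id

text = "Did you ever hear a folk tale about a giant turtle?"
# Synthesize speech and write it to a wav file.
model.tts_to_file(text, speaker_ids['EN-US'], 'en-us.wav', speed=1.0)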
## Discord

Join our Discord community and select the Developer role upon joining to gain exclusive access to our developer-only channel! Don't miss out on valuable discussions and collaboration opportunities.
## Contributing

If you find this work useful, please consider contributing to this repo.
## Citation

```bibtex
@software{zhao2024melo,
  author = {Zhao, Wenliang and Yu, Xumin and Qin, Zengyi},
  title = {MeloTTS: High-quality Multi-lingual Multi-accent Text-to-Speech},
  url = {https://github.com/myshell-ai/MeloTTS},
  year = {2023}
}
```
## License

This library is under the MIT License, which means it is free for both commercial and non-commercial use.
## Acknowledgements

This implementation is based on TTS, VITS, VITS2 and Bert-VITS2. We appreciate their awesome work.
## OpenVINO Acceleration

- The MeloTTS model supports using OpenVINO to accelerate the inference process. Currently this is only verified on Linux.
- The TTS and BERT models support int8 quantization (a sketch of the quantization flow follows the setup commands below).
```bash
# Install dependencies, OpenVINO, and NNCF, then the package itself
pip install -r requirements.txt
pip install openvino nncf
python setup.py develop  # or pip install -e .
python -m unidic download

# Run the TTS test script
python3 test_tts.py
```
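For reference, int8 post-training quantization with NNCF generally follows the pattern below. This is only a sketch: the IR path `tts_model.xml` and the `collect_calibration_samples` helper are hypothetical, and the real calibration inputs depend on how the model was exported.

```python
import openvino as ov
import nncf

core = ov.Core()
model = core.read_model("tts_model.xml")  # hypothetical path to the exported OpenVINO IR

# Hypothetical helper that returns a small list of input dicts
# (input name -> numpy array) captured from representative TTS runs.
calibration_samples = collect_calibration_samples()
calibration_dataset = nncf.Dataset(calibration_samples)

# Post-training int8 quantization of the OpenVINO model.
quantized_model = nncf.quantize(model, calibration_dataset)
ov.save_model(quantized_model, "tts_model_int8.xml")
```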
- Currently the input text is split and the pieces are processed serially. This can be optimized with OpenVINO asynchronous inference, for example:
```python
...
# Compile the exported TTS model once and create two infer requests,
# so two text chunks can be in flight at the same time.
self.tts_model = self.core.read_model(Path(ov_model_path))
self.tts_compiled_model = self.core.compile_model(self.tts_model, 'CPU')
self.tts_request_0 = self.tts_compiled_model.create_infer_request()
self.tts_request_1 = self.tts_compiled_model.create_infer_request()
...
for index, t in enumerate(texts):
    ...
    # Dispatch each chunk to its own request without blocking.
    if index == 0:
        self.tts_request_0.start_async(inputs_dict, share_inputs=True)
    elif index == 1:
        self.tts_request_1.start_async(inputs_dict, share_inputs=True)
    ...
# Wait for both in-flight requests before collecting the audio.
self.tts_request_0.wait()
self.tts_request_1.wait()
...
```
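For more than two chunks, OpenVINO's `AsyncInferQueue` offers a more general way to express the same idea: it pools several infer requests and schedules each chunk onto whichever request is free. The sketch below assumes a compiled model like the one above; the IR path and `chunk_inputs` (a list of input dicts, one per text chunk) are illustrative.

```python
import openvino as ov

core = ov.Core()
compiled = core.compile_model(core.read_model("tts_model.xml"), "CPU")  # illustrative IR path

# Pool of four infer requests; chunks run on whichever request is idle.
infer_queue = ov.AsyncInferQueue(compiled, 4)
results = {}

def on_done(request, index):
    # Copy out the first output tensor for the chunk identified by `index`.
    results[index] = request.get_output_tensor(0).data.copy()

infer_queue.set_callback(on_done)

for index, inputs_dict in enumerate(chunk_inputs):  # chunk_inputs is illustrative
    infer_queue.start_async(inputs_dict, userdata=index)

infer_queue.wait_all()
audio_chunks = [results[i] for i in range(len(chunk_inputs))]
```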