Voice KKuTu

Voice KKuTu - Team Project for Embedded Software Lecture

Getting Started

rye is required for run this project easily.

Required:

# 1. Fetch KKuTu Repository
git module init
git module update

# 2. Install dependencies
rye sync

# 3. Convert word database from KKuTu DB*
rye run convert

# 4. Download and convert whisper model**
rye run make-model

* <Convert word data from KKuTu DB> Section.
** <Download and convert Whisper model or just download from Hugging Face> Section.

(Optional) When you clone for developing:

# Install pre commit script for linting
rye run pre-commit

Details

Commands

If below commands are not executed sucessfully, Try rye sync first.

rye run app
rye run convert
rye run clean
rye run black
rye run pre-commit
rye run make-model
rye run stt

Convert word data from KKuTu DB*

This repository contains KKuTu repository as submodule.

Word data for KKuTu server can be migrated for this project with converting script.

git submodule init
git submodule update
rye run convert

Download and convert Whisper model or just download from Hugging Face**

rye run make-model
rye run make-model --model-size=model

There are converted model at Hugging Face. You can run application just downloading models from there.

git lfs install
git clone https://huggingface.co/ShapeLayer/embedded-voice-kkutu-models models

Note

If you use a model other than the default model, you must pass parameters while running the app.

rye run app --model=model

Model	`model` value
Tiny	`tiny`
(en) Tiny	`tiny.en`
Base	`base`
(en) Base	`base.en`
Small	`small`
(en) Small	`small.en`
Medium	`medium`
(en) Medium	`medium.en`
Large-v1	`large-v1`
Large-v2	`large-v2`
Large-v3	`large-v3`
Large-v3-turbo	`large-v3-turbo`
Turbo	`turbo`
Distil-large-v2	`distil-large-v2`
Distil-large-v3	`distil-large-v3`
(en) Distil-medium	`distil-medium.en`
(en) Distil-small	`distil-small.en`

Troubleshooting

fatal error: 'portaudio.h' file not found

[stderr]
src/pyaudio/device_api.c:9:10: fatal error: 'portaudio.h' file not found
    9 | #include "portaudio.h"
      |          ^~~~~~~~~~~~~
1 error generated.
error: command '/usr/bin/clang' failed with exit code 1

hint: This error likely indicates that you need to install a library that provides "portaudio.h" for `pyaudio@0.2.14`

Install portaudio

sudo apt-get install portaudio19-dev

Warning: When Run in Raspberry Pi

To install PyTorch on a Raspberry Pi, you need to install the CPU-only version. You can do so using the following command:

pip install torch torchvision torchaudio --index-url <https://download.pytorch.org/whl/cpu>

Activate audio

# Enable auido (loads snd_bcm2835)
dtparam=audio=on

Remove comments from boot/config.txt

sudo /etc/init.d/alsa-utils reset
sudo reboot

Set audio output device using `raspi-config`

sudo raspi-config

System Options > Audio > [choose the audio output]

Test audio output is enabled

Run TTS

sudo apt-get install espeak
espeak "hello"

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Voice KKuTu

Getting Started

Details

Commands

Convert word data from KKuTu DB*

Download and convert Whisper model or just download from Hugging Face**

Troubleshooting

fatal error: 'portaudio.h' file not found

Warning: When Run in Raspberry Pi

Activate audio

Set audio output device using `raspi-config`

Test audio output is enabled

Files

README.md

Latest commit

History

README.md

File metadata and controls

Voice KKuTu

Getting Started

Details

Commands

Convert word data from KKuTu DB*

Download and convert Whisper model or just download from Hugging Face**

Troubleshooting

fatal error: 'portaudio.h' file not found

Warning: When Run in Raspberry Pi

Activate audio

Set audio output device using raspi-config

Test audio output is enabled

Set audio output device using `raspi-config`