Skip to content

Commit

Permalink
[Example] Add piper example (#145)
Browse files Browse the repository at this point in the history
* Add piper example

Signed-off-by: PeterD1524 <[email protected]>

* update dependencies

Signed-off-by: PeterD1524 <[email protected]>

* fix typo

Signed-off-by: PeterD1524 <[email protected]>

* Add a GitHub workflow to ensure that the example can run successfully

Signed-off-by: PeterD1524 <[email protected]>

* simplify layout

Signed-off-by: PeterD1524 <[email protected]>

* Provide a simple explanation of the related dependency installation

Signed-off-by: PeterD1524 <[email protected]>

* add config description

Signed-off-by: PeterD1524 <[email protected]>

* use -DWASMEDGE_USE_LLVM=OFF to disable all AOT-related components

Signed-off-by: PeterD1524 <[email protected]>

* ask users to download and install the onnx runtime

Signed-off-by: PeterD1524 <[email protected]>

* add sudo for mv

Signed-off-by: PeterD1524 <[email protected]>

* ldconfig for installing onnxruntime

Signed-off-by: PeterD1524 <[email protected]>

* use the install script from the WasmEdge repo to install ONNX Runtime

Signed-off-by: PeterD1524 <[email protected]>

---------

Signed-off-by: PeterD1524 <[email protected]>
  • Loading branch information
PeterD1524 authored Aug 6, 2024
1 parent c6312e8 commit de9c3e6
Show file tree
Hide file tree
Showing 8 changed files with 429 additions and 0 deletions.
68 changes: 68 additions & 0 deletions .github/workflows/piper.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
name: Piper Example

on:
schedule:
- cron: "0 0 * * *"
push:
paths:
- ".github/workflows/piper.yml"
- "wasmedge-piper/**"
pull_request:
paths:
- ".github/workflows/piper.yml"
- "wasmedge-piper/**"

jobs:
build:
runs-on: ubuntu-22.04
steps:
- name: Install Dependencies for building WasmEdge
run: |
sudo apt-get update
sudo apt-get install ninja-build
- name: Checkout WasmEdge
uses: actions/checkout@v4
with:
repository: WasmEdge/WasmEdge
path: WasmEdge

- name: Install ONNX Runtime
run: sudo bash utils/wasi-nn/install-onnxruntime.sh
working-directory: WasmEdge

- name: Build WasmEdge with WASI-NN Piper plugin
run: |
cmake -GNinja -Bbuild -DCMAKE_BUILD_TYPE=Release -DWASMEDGE_USE_LLVM=OFF -DWASMEDGE_PLUGIN_WASI_NN_BACKEND=Piper
cmake --build build
working-directory: WasmEdge

- name: Install Rust target for wasm
run: rustup target add wasm32-wasi

- name: Checkout WasmEdge-WASINN-examples
uses: actions/checkout@v4
with:
path: WasmEdge-WASINN-examples

- name: Build wasm
run: cargo build --target wasm32-wasi --release
working-directory: WasmEdge-WASINN-examples/wasmedge-piper

- name: Download model
run: curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx

- name: Download config
run: curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json

- name: Download espeak-ng-data
run: |
curl -LO https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_linux_x86_64.tar.gz
tar -xzf piper_linux_x86_64.tar.gz piper/espeak-ng-data --strip-components=1
rm piper_linux_x86_64.tar.gz
- name: Execute
run: WASMEDGE_PLUGIN_PATH=WasmEdge/build/plugins/wasi_nn WasmEdge/build/tools/wasmedge/wasmedge --dir .:. WasmEdge-WASINN-examples/wasmedge-piper/target/wasm32-wasi/release/wasmedge-piper.wasm

- name: Verify output
run: test "$(file --brief welcome.wav)" == 'RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 22050 Hz'
10 changes: 10 additions & 0 deletions wasmedge-piper/Cargo.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
[package]
name = "wasmedge-piper"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
serde_json = "1.0.120"
wasmedge-wasi-nn = "0.8.0"
126 changes: 126 additions & 0 deletions wasmedge-piper/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,126 @@
# Text to speech example with WasmEdge WASI-NN Piper plugin

This example demonstrates how to use WasmEdge WASI-NN Piper plugin to perform TTS.

## Build WasmEdge with WASI-NN Piper plugin

Overview of WASI-NN Piper plugin dependencies:

![d2 --layout elk dependencies.d2 dependencies.svg](dependencies.svg)

- [piper](https://github.com/rhasspy/piper): A fast, local neural text to speech system.
- [piper-phonemize](https://github.com/rhasspy/piper-phonemize): C++ library for converting text to phonemes for Piper.
- [espeak-ng](https://github.com/rhasspy/espeak-ng): An open source speech synthesizer that supports more than hundred languages and accents. Piper uses it for text to phoneme translation.
- [onnxruntime](https://github.com/microsoft/onnxruntime): A cross-platform inference and training machine-learning accelerator. [ONNX](https://onnx.ai/) is an open format built to represent machine learning models. Piper uses ONNX Runtime as an inference backend for its ONNX models to convert phoneme ids to WAV audio.

The WasmEdge WASI-NN Piper plugin relies on the ONNX Runtime C++ API. For installation instructions, please refer to the installation table on the [official website](https://onnxruntime.ai/getting-started).

Example of installing ONNX Runtime 1.14.1 on Ubuntu:

```bash
curl -LO https://github.com/microsoft/onnxruntime/releases/download/v1.14.1/onnxruntime-linux-x64-1.14.1.tgz
tar zxf onnxruntime-linux-x64-1.14.1.tgz
mv onnxruntime-linux-x64-1.14.1/include/* /usr/local/include/
mv onnxruntime-linux-x64-1.14.1/lib/* /usr/local/lib/
rm -rf onnxruntime-linux-x64-1.14.1.tgz onnxruntime-linux-x64-1.14.1
ldconfig
```

For other dependencies, WasmEdge will download and build them automatically.

Build WasmEdge from source:

```bash
cd /path/to/wasmedge/source/folder

cmake -GNinja -Bbuild -DCMAKE_BUILD_TYPE=Release -DWASMEDGE_USE_LLVM=OFF -DWASMEDGE_PLUGIN_WASI_NN_BACKEND=Piper
cmake --build build
```

Then you will have an executable `wasmedge` runtime at `build/tools/wasmedge/wasmedge` and the WASI-NN with Piper backend plug-in at `build/plugins/wasi_nn/libwasmedgePluginWasiNN.so`.

## Model Download Link

In this example, we will use the [en_US-lessac-medium](https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/lessac/medium) model.

[MODEL CARD](https://huggingface.co/rhasspy/piper-voices/blob/main/en/en_US/lessac/medium/MODEL_CARD):

```
# Model card for lessac (medium)
* Language: en_US (English, United States)
* Speakers: 1
* Quality: medium
* Samplerate: 22,050Hz
## Dataset
* URL: https://www.cstr.ed.ac.uk/projects/blizzard/2013/lessac_blizzard2013/
* License: https://www.cstr.ed.ac.uk/projects/blizzard/2013/lessac_blizzard2013/license.html
## Training
Trained from scratch.
```

It has a model file [en_US-lessac-medium.onnx](https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx) and a config file [en_US-lessac-medium.onnx.json](https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json).

```bash
# Download model
curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx
# Download config
curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json
```

This model uses [eSpeak NG](https://github.com/rhasspy/espeak-ng) to convert text to phonemes, so we also need to download the required espeak-ng-data.

This will download and extract the espeak-ng-data directory to the current working directory:

```bash
curl -LO https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_linux_x86_64.tar.gz
tar -xzf piper_linux_x86_64.tar.gz piper/espeak-ng-data --strip-components=1
```

## Build wasm

Run the following command to build wasm, the output WASM file will be at `target/wasm32-wasi/release/`

```bash
cargo build --target wasm32-wasi --release
```

## Execute

Execute the WASM with the `wasmedge`.

```bash
WASMEDGE_PLUGIN_PATH=/path/to/parent/directory/of/libwasmedgePluginWasiNN.so /path/to/wasmedge --dir .:. /path/to/wasm
```

Example layout:

```
.
├── en_US-lessac-medium.onnx
├── en_US-lessac-medium.onnx.json
├── espeak-ng-data/
├── WasmEdge/build/
│ ├── plugins/wasi_nn/libwasmedgePluginWasiNN.so
│ └── tools/wasmedge/wasmedge
└── WasmEdge-WASINN-examples/wasmedge-piper/target/wasm32-wasi/release/wasmedge-piper.wasm
```

Then the command will be:

```bash
WASMEDGE_PLUGIN_PATH=WasmEdge/build/plugins/wasi_nn WasmEdge/build/tools/wasmedge/wasmedge --dir .:. WasmEdge-WASINN-examples/wasmedge-piper/target/wasm32-wasi/release/wasmedge-piper.wasm
```

The output `welcome.wav` is the synthesized audio.

## Config options

The JSON config options passed to WasmEdge WASI-NN Piper plugin via `bytes_array` in `wasmedge_wasi_nn::GraphBuilder::build_from_bytes` is similar to the Piper command-line program options.

See [config.schema.json](config.schema.json) for available options and [json_input.schema.json](json_input.schema.json) for JSON input.
68 changes: 68 additions & 0 deletions wasmedge-piper/config.schema.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"properties": {
"model": {
"description": "Path to .onnx voice file",
"type": "string"
},
"config": {
"description": "Path to JSON voice config file, default is model path + .json",
"type": "string"
},
"output_type": {
"default": "wav",
"description": "Type of output to produce",
"enum": [
"raw",
"wav"
]
},
"speaker": {
"default": 0,
"description": "Numerical id of the default speaker (multi-speaker voices)",
"type": "number"
},
"noise_scale": {
"default": 0.667,
"description": "Amount of noise to add during audio generation, default value can be overridden by the value in voice model config",
"type": "number"
},
"length_scale": {
"default": 1.0,
"description": "Speed of speaking (1 = normal, < 1 is faster, > 1 is slower), default value can be overridden by the value in voice model config",
"type": "number"
},
"noise_w": {
"default": 0.8,
"description": "Variation in phoneme lengths, default value can be overridden by the value in voice model config",
"type": "number"
},
"sentence_silence": {
"default": 0.2,
"description": "Seconds of silence to add after each sentence",
"type": "number"
},
"espeak_data": {
"description": "Path to espeak-ng data directory, required for espeak phonemes",
"type": "string"
},
"tashkeel_model": {
"description": "Path to libtashkeel ort model (https://github.com/mush42/libtashkeel), required for Arabic",
"type": "string"
},
"json_input": {
"default": false,
"description": "input is JSON instead of text",
"type": "boolean"
},
"phoneme_silence": {
"additionalProperties": {
"type": "number"
},
"description": "Seconds of extra silence to insert after a single phoneme, this is a mapping from single codepoints to seconds"
}
},
"required": [
"model"
]
}
7 changes: 7 additions & 0 deletions wasmedge-piper/dependencies.d2
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
direction: right
WasmEdge WASI-NN Piper plugin -> piper
piper -> piper-phonemize
piper -> espeak-ng
piper -> onnxruntime
piper-phonemize -> espeak-ng
piper-phonemize -> onnxruntime
Loading

0 comments on commit de9c3e6

Please sign in to comment.