Vosk ASR offline engine transcript APIs for NodeJs developers. With a simple HTTP transcript server.

believeitcode/voskJs

VoskJs

VoskJs is a toolkit for NodeJs developers who want to use the Vosk offline speech recognition engine. It gives you:

  • simple sentence-based transcript APIs
  • command line utility voskjs
  • demo HTTP transcript server voskjshttp.

VoskJs can be used for speech recognition processing in different scenarios:

  • Single-user/standalone programs (e.g. perfect for single-user embedded systems)
  • Multi-user/multi-core server architectures

What's Vosk?

Vosk is an open source embedded (offline, on-premise) speech-to-text engine that can run in real time even on small devices. It's based on Kaldi, but Nickolay V. Shmyrev's Vosk offers a smart, simplified and performant interface!

Documentation:

What's VoskJs?

The goal of the project is to:

  1. Create a simple function API layer on top of the existing Vosk nodejs binding, supplying the main sentence-based speech-to-text functionalities:

    • const model = loadModel(modelDirectory)

      Loads a specific Vosk engine model from a model directory into RAM. This is done once.

    • transcriptFromFile(fileName, model, options)

    • transcriptFromBuffer(buffer, model, options)

      At run-time, transcribes a speech file or buffer (in WAV/PCM format) through the Vosk engine Recognizer. It supplies detailed speech-to-text transcript info.

    • freeModel(model)

    Using this simple transcript interface, you can build your own standalone application with async functions suited to an ordinary single-threaded nodejs program.

  2. voskjs

    command line program to test Vosk transcript with specific models (some tests and command line usage here).

  3. voskjshttp

    a simple demo HTTP server to transcribe speech files.

  4. Build your own server. Some usage examples here.
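The load / transcript / free sequence from point 1 can be sketched as below. This is a minimal sketch, not the project's reference usage: `transcribeOnce` is a hypothetical helper name, and the API object is passed in as a parameter so the same code works with the real module or with a stub. It assumes that `loadModel` and `freeModel` are synchronous, that `transcriptFromFile` returns a promise, and that an empty `options` object is accepted. The shape of the returned transcript is not assumed.

```javascript
// Sketch: load a model, transcribe one file, and always release the model.
// `api` is expected to expose loadModel, transcriptFromFile and freeModel,
// the function names listed above. `transcribeOnce` itself is hypothetical.
async function transcribeOnce(api, modelDirectory, fileName, options = {}) {
  // Assumption: loadModel is synchronous and returns a model handle.
  const model = api.loadModel(modelDirectory)
  try {
    // Assumption: transcriptFromFile returns a promise for the transcript info.
    return await api.transcriptFromFile(fileName, model, options)
  } finally {
    // Release the model even if the transcript fails.
    api.freeModel(model)
  }
}
```

Assuming the installed package exports these functions from its root, a call might look like `transcribeOnce(require('@solyarisoftware/voskjs'), 'models/vosk-model-small-en-us-0.15', 'audio/sample.wav')`, where `audio/sample.wav` is a placeholder file name.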

🛍 Install

1. Install Vosk engine and this nodejs module

  • Install vosk-api engine

    pip3 install vosk 

    See also: https://alphacephei.com/vosk/install

  • Install this module, as a global package if you want to use the voskjs CLI command

    npm install -g @solyarisoftware/voskjs

2. Install/Download Vosk models

mkdir -p your/path/models && cd your/path/models

# English large model
wget https://alphacephei.com/vosk/models/vosk-model-en-us-aspire-0.2.zip
unzip vosk-model-en-us-aspire-0.2.zip

# English small model
wget http://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip

# Italian model
wget https://alphacephei.com/vosk/models/vosk-model-small-it-0.4.zip
unzip vosk-model-small-it-0.4.zip

More about available Vosk models here: https://alphacephei.com/vosk/models

3. Demo audio files

The audio directory contains some English language speech audio files, taken from a Mozilla DeepSpeech repo. Source: Mozilla DeepSpeech audio samples. These files are used for some tests and comparisons.

Usage

Some transcript usage examples here
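As a rough illustration of reusing one loaded model across several files, the sketch below transcribes every `.wav` file in a directory. `transcribeDirectory` is a hypothetical helper, not part of VoskJs. The API object is injected as a parameter, and the code assumes `transcriptFromFile` returns a promise and accepts an empty `options` object. Files are processed one at a time to keep the example simple.

```javascript
const fs = require('fs')
const path = require('path')

// Sketch: transcribe all .wav files in a directory with a single loaded model.
// `api` must provide loadModel, transcriptFromFile and freeModel.
// `transcribeDirectory` is a hypothetical helper name.
async function transcribeDirectory(api, modelDirectory, audioDirectory, options = {}) {
  const wavFiles = fs.readdirSync(audioDirectory)
    .filter((name) => name.toLowerCase().endsWith('.wav'))
    .sort()

  // The model is loaded once and shared by all transcript requests.
  const model = api.loadModel(modelDirectory)
  const results = []
  try {
    for (const name of wavFiles) {
      const fileName = path.join(audioDirectory, name)
      const result = await api.transcriptFromFile(fileName, model, options)
      results.push({ fileName, result })
    }
  } finally {
    // Free the model whether or not every transcript succeeded.
    api.freeModel(model)
  }
  return results
}
```

The function resolves to an array of `{ fileName, result }` pairs, where `result` is whatever `transcriptFromFile` returned for that file.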

🛠 Tests

Some tests / notes here:

To do

  • 💣 Important open issue to be solved: solyarisoftware#3, with a temporary workaround: alphacep/vosk-api#516 (comment)

  • Implement a simplified interface for all Vosk-api functions

  • Deepen grammar usage with examples

  • Review stress and performance tests (especially for the HTTP server)

  • To reduce latency, rethink the transcript interface, maybe with an initialization phase that includes creation of the Model and the Recognizer(s)

✋ How to contribute

Any contribution is welcome.

  • Discussions. Please open a new discussion (a public chat on GitHub) for any specific open topic, clarification, change request proposal, etc.
  • Issues. Please submit issues for bugs, etc.
  • E-mail. You can contact me privately via email.

🙏 Credits

Thanks to Nickolay V. Shmyrev, author of the Vosk project, for his help with the nodeJs API bindings for multi-threading management.

See also:

License

MIT (c) Giorgio Robino

