Add browser-based demo
Demonstrates Moonshine models running in the browser using onnxruntime-web
evmaki committed Nov 13, 2024
1 parent 1f6f763 commit b94849b
Showing 13 changed files with 1,453 additions and 2 deletions.
11 changes: 9 additions & 2 deletions README.md
@@ -51,8 +51,9 @@ This repo hosts inference code and demos for Moonshine.
- [2. Install the Moonshine package](#2-install-the-moonshine-package)
- [3. Try it out](#3-try-it-out)
- [Examples](#examples)
- [Onnx standalone](#onnx-standalone)
- [Onnx Standalone](#onnx-standalone)
- [Live Captions](#live-captions)
- [Running in the Browser](#running-in-the-browser)
- [CTranslate2](#ctranslate2)
- [HuggingFace Transformers](#huggingface-transformers)
- [TODO](#todo)
@@ -126,14 +127,18 @@ Use the `moonshine.transcribe_with_onnx` function to use the ONNX runtime for in

The Moonshine models can be used with a variety of different runtimes and applications, so we've included code samples showing how to use them in different situations. The [`moonshine/demo`](/moonshine/demo/) folder in this repository also has more information on many of them.

### Onnx standalone
### Onnx Standalone

The latest versions of the Onnx Moonshine models are available on HuggingFace at [huggingface.co/UsefulSensors/moonshine/tree/main/onnx](https://huggingface.co/UsefulSensors/moonshine/tree/main/onnx). You can find [an example Python script](/moonshine/demo/onnx_standalone.py) and more information about running them [in the demo folder](/moonshine/demo/README.md#demo-standalone-file-transcription-with-onnx).

### Live Captions

You can try the Moonshine models with live input from a microphone on many platforms with the [live captions demo](/moonshine/demo/README.md#demo-live-captioning-from-microphone-input).

### Running in the Browser

You can try out the Moonshine models on your device in a web browser with our [HuggingFace space](https://huggingface.co/spaces/UsefulSensors/moonshine-web). We've included the [source for this demo](/moonshine/demo/moonshine-web/) in this repository; this is a great starting place for those wishing to build web-based applications with Moonshine.

### CTranslate2

The files for the CTranslate2 versions of Moonshine are available at [huggingface.co/UsefulSensors/moonshine/tree/main/ctranslate2](https://huggingface.co/UsefulSensors/moonshine/tree/main/ctranslate2), but they require [a pull request to be merged](https://github.com/OpenNMT/CTranslate2/pull/1808) before they can be used with the mainline version of the framework. Until then, you should be able to try them with [our branch](https://github.com/njeffrie/CTranslate2/tree/master), with [this example script](https://github.com/OpenNMT/CTranslate2/pull/1808#issuecomment-2439725339).
@@ -167,6 +172,8 @@ print(tokenizer.decode(tokens[0], skip_special_tokens=True))

* [x] HF transformers support

* [x] Demo Moonshine running in the browser

* [ ] CTranslate2 support (complete but [awaiting a merge](https://github.com/OpenNMT/CTranslate2/pull/1808))

* [ ] MLX support
5 changes: 5 additions & 0 deletions moonshine/demo/README.md
@@ -4,6 +4,7 @@ This directory contains various scripts to demonstrate the capabilities of the
Moonshine ASR models.

- [Moonshine Demos](#moonshine-demos)
- [Demo: Moonshine running in the browser with ONNX](#demo-moonshine-running-in-the-browser-with-onnx)
- [Demo: Standalone file transcription with ONNX](#demo-standalone-file-transcription-with-onnx)
- [Demo: Live captioning from microphone input](#demo-live-captioning-from-microphone-input)
- [Installation.](#installation)
@@ -16,6 +17,10 @@ Moonshine ASR models.
- [Metrics](#metrics)
- [Citation](#citation)

# Demo: Moonshine running in the browser with ONNX

The Node.js project in [`moonshine-web`](/moonshine/demo/moonshine-web/) demonstrates how to run the
Moonshine models in the web browser using `onnxruntime-web`. You can try this demo on your own device via our [HuggingFace space](https://huggingface.co/spaces/UsefulSensors/moonshine-web) without running the project from source. Notably, the [`moonshine.js`](/moonshine/demo/moonshine-web/src/moonshine.js) script contains everything you need to run inference with the Moonshine ONNX models in the browser. If you would like to build on the web demo, follow the instructions in the demo directory to get started.
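As a rough illustration of what running the models with `onnxruntime-web` involves, here is a minimal sketch of loading the four Moonshine ONNX graphs in the browser. It assumes the global `ort` object from the onnxruntime-web script bundle and the `public/moonshine/<size>/` file layout used by the demo's download script; the helper names are our own, not the demo's API.

```javascript
// Sketch only: assumes the global `ort` from the onnxruntime-web bundle
// and the public/moonshine/<size>/<graph> layout used by the demo.

// The Moonshine ONNX release is split into four graphs per model size.
const GRAPHS = ["preprocess.onnx", "encode.onnx", "uncached_decode.onnx", "cached_decode.onnx"];

// Build the URL of one ONNX graph for a given model size ("tiny" or "base").
function graphUrl(size, graph) {
    return `/moonshine/${size}/${graph}`;
}

// Fetch and compile all four graphs for one model size.
async function loadModel(size) {
    // ort.InferenceSession.create fetches and compiles each graph
    // (the WASM execution provider is the browser default).
    const sessions = {};
    for (const graph of GRAPHS) {
        sessions[graph] = await ort.InferenceSession.create(graphUrl(size, graph));
    }
    return sessions;
}
```

See `moonshine.js` itself for how the demo actually wires the four graphs together into a transcription pipeline.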

# Demo: Standalone file transcription with ONNX

3 changes: 3 additions & 0 deletions moonshine/demo/moonshine-web/.gitignore
@@ -0,0 +1,3 @@
node_modules
public/moonshine/base/*
public/moonshine/tiny/*
32 changes: 32 additions & 0 deletions moonshine/demo/moonshine-web/README.md
@@ -0,0 +1,32 @@
# moonshine-web

This directory is a self-contained demo of the Moonshine models running directly on a user's device in a web browser using `onnxruntime-web`. You can try this demo out in our [HuggingFace space](https://huggingface.co/spaces/UsefulSensors/moonshine-web) or, alternatively, install and run it on your own device by following the instructions below. If you want to run Moonshine in the browser in your own projects, `src/moonshine.js` provides a bare-bones implementation of inference using the ONNX models.
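For readers building their own browser integration, the sketch below shows the general shape of a single `onnxruntime-web` inference call on mono audio samples. It assumes the global `ort` object from the onnxruntime-web bundle; the tensor shape and helpers are illustrative assumptions, not the API of `src/moonshine.js` — consult that file for the real pipeline.

```javascript
// Illustrative sketch, not the demo's actual API: `ort` is assumed to be
// the global from the onnxruntime-web script bundle, and the input/output
// names are read from the session itself rather than hard-coded.

// Downmix the channels of a decoded AudioBuffer into one Float32Array.
function toMono(channels) {
    const length = channels[0].length;
    const mono = new Float32Array(length);
    for (const ch of channels) {
        for (let i = 0; i < length; i++) mono[i] += ch[i] / channels.length;
    }
    return mono;
}

// Run one loaded Moonshine graph on a clip of audio samples.
async function runGraph(session, samples) {
    // Shape [1, n]: a batch containing a single clip (assumed layout).
    const input = new ort.Tensor("float32", samples, [1, samples.length]);
    const outputs = await session.run({ [session.inputNames[0]]: input });
    return outputs[session.outputNames[0]];
}
```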

## Installation

You need Node.js (or another JavaScript toolkit like [Bun](https://bun.sh/)) to get started; install [Node.js](https://nodejs.org/en) if you don't have it already.

Once you have your JavaScript toolkit installed, clone the `moonshine` repo and navigate to this directory:

```shell
git clone git@github.com:usefulsensors/moonshine.git
cd moonshine/moonshine/demo/moonshine-web
```

Then install the project's dependencies:

```shell
npm install
```

The demo expects the Moonshine Tiny and Base ONNX models to be available in `public/moonshine/tiny` and `public/moonshine/base`, respectively. To save space, they are not included in the repository, but we've included a helper script that downloads them from HuggingFace:

```shell
npm run get-models
```

This project uses Vite for bundling and development. Run the following to start a development server and open the demo in your web browser:

```shell
npm run dev
```
36 changes: 36 additions & 0 deletions moonshine/demo/moonshine-web/downloader.js
@@ -0,0 +1,36 @@
// Helper script for downloading Moonshine ONNX models from HuggingFace for local development.
import * as fs from 'fs';
import * as hub from "@huggingface/hub";

const repo = { type: "model", name: "UsefulSensors/moonshine" };

const models = [
    "tiny",
    "base"
];

const layers = [
    "preprocess.onnx",
    "encode.onnx",
    "uncached_decode.onnx",
    "cached_decode.onnx"
];

console.log("Downloading Moonshine ONNX models from HuggingFace...");

models.forEach(model => {
    const dir = "public/moonshine/" + model;
    if (!fs.existsSync(dir)) {
        fs.mkdirSync(dir, { recursive: true });
    }
    layers.forEach(layer => {
        hub.downloadFile({ repo, path: "onnx/" + model + "/" + layer }).then((file) => {
            file.arrayBuffer().then((buffer) => {
                fs.writeFile(dir + "/" + layer, Buffer.from(buffer), () => {
                    console.log("\tDownloaded " + model + "/" + layer + " successfully.");
                });
            });
        });
    });
});
54 changes: 54 additions & 0 deletions moonshine/demo/moonshine-web/index.html
@@ -0,0 +1,54 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="icon" href="/favicon.png" />
<link rel="stylesheet" href="/index.css">
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Moonshine – lightweight ASR by Useful Sensors</title>
</head>
<body>
<div id="root">
<div class="container">
<div class="logo-container">
<svg version="1.1" id="logo" xmlns="http://www.w3.org/2000/svg" x="0px" y="0px" viewBox="0 0 800 800" >
<path class="st0" d="M409,760C205.6,760,40,594.4,40,390.9C40,228.5,144.4,88.5,294.8,40C236.6,100.9,203,182.9,203,268.3
c0,181.7,147.3,329.5,328.3,329.5c85.8,0,168.1-34.2,228.8-93.4C711.9,655.3,571.8,760,409,760L409,760z"/>
<line class="st1" id="l1" x1="310.1" y1="293.8" x2="310.1" y2="325.8"/>
<line class="st1" id="l2" x1="729.8" y1="293.8" x2="729.8" y2="325.8"/>
<line class="st1" id="l3" x1="370" y1="220" x2="370" y2="399.6"/>
<line class="st1" id="l4" x1="430" y1="245.9" x2="430" y2="373.7"/>
<line class="st1" id="l5" x1="489.9" y1="293.8" x2="489.9" y2="325.8"/>
<line class="st1" id="l6" x1="548.1" y1="278.2" x2="548.1" y2="342.1"/>
<line class="st1" id="l7" x1="609.9" y1="220.4" x2="609.9" y2="400"/>
<line class="st1" id="l8" x1="669.8" y1="245.9" x2="669.8" y2="373.7"/>
</svg>
<h1>Moonshine</h1>
fast, accurate, and lightweight speech-to-text models running in your browser
</div>
<div class="justify-center">
<select name="models" id="models">
<option value="tiny" selected>moonshine/tiny</option>
<option value="base">moonshine/base</option>
</select>
<input type="file" id="upload" style="display: none;" />
<button id="browse" onclick="document.getElementById('upload').click();">Browse</button>
<button id="startRecord">Record</button>
<button id="stopRecord" style="display: none;">Stop</button>
</div>
<div id="audioPanel">
<div class="justify-center">
<audio id="sound" type="audio/wav" controls></audio>
</div>
<div class="justify-center">
<button id="transcribe">Transcribe</button>
</div>
</div>
<div class="justify-center">
<span id="transcription"></span>
</div>
</div>
</div>
<script type="module" src="/src/index.js"></script>
</body>
</html>
