feat:regex corrections #7

Merged
merged 7 commits into from
Dec 15, 2024
Changes from 2 commits
97 changes: 88 additions & 9 deletions README.md
@@ -1,27 +1,106 @@
# Utterance Corrections plugin

- "secret speech", map some random utterance to something else so you can furtively give orders to your assistant
- shortcuts, map shorter utterances or slang to utterances you know trigger the correct intent
- manually correct bad STT transcriptions you experimentally determined to be common for you
This plugin provides tools to correct or adjust speech-to-text (STT) outputs for better intent matching or improved user experience.

This plugin checks a user defined json for utterance fixes `~/.local/share/mycroft/corrections.json`
### Key Features:
1. **"Secret Speech"**: Map random utterances to something else so you can furtively give orders to your assistant.
2. **Shortcuts**: Map shorter utterances or slang to known utterances that trigger the correct intent.
3. **Manual STT Fixes**: Correct common STT transcription errors you experimentally determined.

fuzzy matching is used to determine if an utterance matches the transcription
if similarity is >= 0.85 (85%) then the replacement is returned instead of the original transcription
---

## 1. Full Utterance Corrections

This plugin checks a user-defined JSON file for **utterance fixes** at `~/.local/share/mycroft/corrections.json`.

**Fuzzy matching** is used to determine if an utterance matches a transcription. If similarity is greater than or equal to **85%**, the replacement is returned instead of the original transcription.

### Example: `corrections.json`
```json
{
"I hate open source": "I love open source",
"do the thing": "trigger protocol 404"
}
```

you can also define unconditional replacements at word level `~/.local/share/mycroft/word_corrections.json`
**Input**:
`"I hat open source"`

**Output**:
`"I love open source"`
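
As a rough illustration of the threshold check described above, the following sketch uses `difflib.SequenceMatcher` from the Python standard library in place of the plugin's own matcher, so the exact scores will differ from what the plugin computes. The function name `correct_utterance` and its parameters are hypothetical and not part of the plugin.

```python
from difflib import SequenceMatcher


def correct_utterance(utterance, corrections, threshold=0.85):
    """Return the replacement for the closest known utterance, if it is close enough.

    Assumption: difflib's ratio stands in for the plugin's fuzzy matcher.
    """
    best_key, best_score = None, 0.0
    for known in corrections:
        score = SequenceMatcher(None, utterance.lower(), known.lower()).ratio()
        if score > best_score:
            best_key, best_score = known, score
    if best_key is not None and best_score >= threshold:
        return corrections[best_key]
    # Below the threshold: keep the original transcription
    return utterance


corrections = {
    "I hate open source": "I love open source",
    "do the thing": "trigger protocol 404",
}
print(correct_utterance("I hat open source", corrections))  # I love open source
print(correct_utterance("what time is it", corrections))    # what time is it
```

With this stand-in metric, `"I hat open source"` scores about 0.97 against `"I hate open source"`, which is above the 0.85 threshold, so the replacement is returned.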

---

## 2. Word-Level Corrections

You can also define unconditional word-level replacements in `~/.local/share/mycroft/word_corrections.json`.

This is particularly useful when STT models repeatedly transcribe specific names or words incorrectly.

for example whisper STT often gets artist names wrong, this allows you to correct them
### Example: `word_corrections.json`
```json
{
"Jimmy Hendricks": "Jimi Hendrix",
"Eric Klapptern": "Eric Clapton",
"Eric Klappton": "Eric Clapton"
}
```

**Input**:
`"I love Jimmy Hendricks"`

**Output**:
`"I love Jimi Hendrix"`


> **Use case**: Whisper STT often makes this mistake in its transcriptions.
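
A minimal sketch of what an unconditional word-level replacement could look like. It assumes plain substring replacement with `str.replace`; the plugin's actual replacement logic is not shown here, and `apply_word_corrections` is a hypothetical helper name.

```python
def apply_word_corrections(utterance, word_corrections):
    """Replace every occurrence of each known bad transcription.

    Assumption: simple, case-sensitive substring replacement.
    """
    for wrong, right in word_corrections.items():
        utterance = utterance.replace(wrong, right)
    return utterance


word_corrections = {
    "Jimmy Hendricks": "Jimi Hendrix",
    "Eric Klapptern": "Eric Clapton",
    "Eric Klappton": "Eric Clapton",
}
print(apply_word_corrections("I love Jimmy Hendricks", word_corrections))
# I love Jimi Hendrix
```

Unlike the full-utterance step, no similarity threshold is involved: every listed string is replaced wherever it appears.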

---

## 3. Regex-Based Corrections

For more complex corrections, you can use **regular expressions** in `~/.local/share/mycroft/regex_corrections.json`.

This is useful for fixing consistent patterns in STT errors, such as replacing incorrect trigraphs.

### Example: `regex_corrections.json`
```json
{
"\\bsh(\\w*)": "sch\\1"
}
```

### Explanation:
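- **`\\bsh(\\w*)`**: Matches words starting with `sh` at a word boundary and captures the rest of the word. The backslashes are doubled because of JSON string escaping; the regex itself is `\bsh(\w*)`.
- **`sch\\1`**: Replaces `sh` with `sch` and appends the captured rest of the word (`\1`).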

### Example Usage:
**Input**:
`"shalter is a switch"`

**Output**:
`"schalter is a switch"`

> **Use case**: The Citrinet German model often makes this mistake in its transcriptions.
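
The sketch below shows how the JSON escaping plays out once the file is parsed, and how each pattern can be applied with `re.sub`. It is an illustration only: `RAW_JSON` and `apply_regex_corrections` are hypothetical names, and invalid patterns are simply skipped here.

```python
import json
import re

# Same content as the example regex_corrections.json above
RAW_JSON = r'{"\\bsh(\\w*)": "sch\\1"}'


def apply_regex_corrections(utterance, regex_corrections):
    """Apply every pattern/replacement pair in order, skipping invalid patterns."""
    for pattern, replacement in regex_corrections.items():
        try:
            utterance = re.sub(pattern, replacement, utterance)
        except re.error:
            # A malformed pattern should not break the other corrections
            continue
    return utterance


rules = json.loads(RAW_JSON)  # {'\\bsh(\\w*)': 'sch\\1'}, i.e. the regex \bsh(\w*)
print(apply_regex_corrections("shalter is a switch", rules))
# schalter is a switch
```

Note that `switch` is left untouched because `sh` does not occur at the start of a word there.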

---

## Configuration Paths

| File | Purpose |
|---------------------------|---------------------------------------|
| `corrections.json` | Full utterance replacements. |
| `word_corrections.json` | Word-level replacements. |
| `regex_corrections.json` | Regex-based pattern replacements. |

All correction files are stored under:
`~/.local/share/mycroft/`
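
If you prefer to generate these files from a script, a sketch along the following lines may help. It hardcodes the default directory documented above, whereas the plugin resolves the location through XDG helpers, so the real path can differ on your system. `write_corrections` and its `base_dir` parameter are hypothetical.

```python
import json
from pathlib import Path


def write_corrections(filename, mapping, base_dir=None):
    """Write a corrections mapping as JSON and return the resulting path.

    Assumption: the default directory is ~/.local/share/mycroft/.
    """
    base = Path(base_dir) if base_dir else Path.home() / ".local" / "share" / "mycroft"
    base.mkdir(parents=True, exist_ok=True)
    path = base / filename
    path.write_text(json.dumps(mapping, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


# Example (writes into the default directory):
# write_corrections("word_corrections.json", {"Jimmy Hendricks": "Jimi Hendrix"})
```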

---

### Usage Scenarios
- **Improve Intent Matching**: Ensure consistent STT output for accurate intent triggers.
- **Fix Model-Specific Errors**: Handle recurring transcription mistakes in certain STT engines.
- **Shortcut Commands**: Simplify complex commands with shorter phrases or slang.

Let us know how you're using this plugin, and feel free to contribute regex examples to this README or new use cases! 🚀
25 changes: 20 additions & 5 deletions ovos_utterance_corrections_transformer/__init__.py
@@ -1,9 +1,10 @@
+import re
 from typing import List, Optional

 from json_database import JsonStorage
 from ovos_config.meta import get_xdg_base
 from ovos_plugin_manager.templates.transformers import UtteranceTransformer
-from ovos_utils.parse import match_one
+from ovos_utils.parse import match_one, MatchStrategy
 from ovos_utils.xdg_utils import xdg_data_home


@@ -13,17 +14,31 @@ def __init__(self, name="ovos-utterance-corrections", priority=1):
         super().__init__(name, priority)
         self.db = JsonStorage(path=f"{xdg_data_home()}/{get_xdg_base()}/corrections.json")
         self.words_db = JsonStorage(path=f"{xdg_data_home()}/{get_xdg_base()}/word_corrections.json")
+        self.regex_db = JsonStorage(path=f"{xdg_data_home()}/{get_xdg_base()}/regex_corrections.json")
+        self.confidence_threshold = 0.85  # Default threshold, configurable
+        self.match_strategy = MatchStrategy.DAMERAU_LEVENSHTEIN_SIMILARITY

     def transform(self, utterances: List[str], context: Optional[dict] = None) -> (list, dict):
         context = context or {}

-        # replace full utterance
+        # Step 1: Replace full utterance
         if utterances and self.db:
-            replacement, conf = match_one(utterances[0], self.db)  # TODO - match strategy from conf
-            if conf >= 0.85:  # TODO make configurable
+            replacement, conf = match_one(
+                utterances[0], self.db, strategy=self.match_strategy
+            )
+            if conf >= self.confidence_threshold:
                 return [replacement], context

-        # replace individual words
+        # Step 2: Apply regex replacements
+        if utterances and self.regex_db:
+            for idx in range(len(utterances)):
+                for pattern, replacement in self.regex_db.items():
+                    try:
+                        utterances[idx] = re.sub(pattern, replacement, utterances[idx])
+                    except re.error as e:
+                        print(f"Invalid regex pattern: {pattern} -> {e}")
+
+        # Step 3: Replace individual words
         if utterances and self.words_db:
             for idx in range(len(utterances)):
                 for w, r in self.words_db.items():
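
The ordering of the three steps in `transform` can be sketched without the OVOS dependencies. This is a simplified stand-in, not the plugin code: `difflib.SequenceMatcher` replaces `match_one` and its `MatchStrategy`, plain dicts replace `JsonStorage`, and the word-level step is assumed to be a `str.replace` call because the diff is cut off before that line. `apply_corrections` is a hypothetical name.

```python
import re
from difflib import SequenceMatcher


def apply_corrections(utterances, full, regex, words, threshold=0.85):
    """Run the three correction steps in the same order as the diff above."""
    utterances = list(utterances)  # avoid mutating the caller's list

    # Step 1: a close enough full-utterance match short-circuits everything else
    if utterances and full:
        def score(key):
            return SequenceMatcher(None, utterances[0], key).ratio()

        best = max(full, key=score)
        if score(best) >= threshold:
            return [full[best]]

    # Step 2: regex replacements; invalid patterns are reported and skipped
    for idx in range(len(utterances)):
        for pattern, replacement in regex.items():
            try:
                utterances[idx] = re.sub(pattern, replacement, utterances[idx])
            except re.error as err:
                print(f"Invalid regex pattern: {pattern} -> {err}")

    # Step 3: word-level replacements (assumed to be plain substring replacement)
    for idx in range(len(utterances)):
        for wrong, right in words.items():
            utterances[idx] = utterances[idx].replace(wrong, right)

    return utterances
```

Because step 1 returns early, regex and word-level corrections never run on an utterance that was already replaced as a whole, and regex rules are applied before word-level ones.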