Hello, I'm the maintainer of the Montreal Forced Aligner (MFA) and am currently working on a new Japanese model for speech-to-text alignment. My current prototype uses sudachipy to generate morphemes, post-processes these to create phonological words (e.g., "し ちゃっ て" -> "しちゃって"), and then runs the rest of the forced alignment pipeline as if this generated transcript were ground-truth accurate (i.e., it generates utterance FSTs for phone sequences from pronunciation dictionary lookup).
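For concreteness, here is a minimal sketch of that first step (assuming SudachiPy 0.6+ with sudachidict_core installed; the merge heuristic below is a simplified placeholder, not my prototype's actual post-processing rules):

```python
from sudachipy import Dictionary, SplitMode  # assumes sudachidict_core is installed

tokenizer = Dictionary().create()

def phonological_words(text):
    """Merge SudachiPy morphemes into rough phonological words.

    The merge rule (glue auxiliary verbs and particles onto the preceding
    morpheme) is a simplified placeholder; the real post-processing is more
    involved.
    """
    words = []
    for m in tokenizer.tokenize(text, SplitMode.A):
        pos = m.part_of_speech()[0]
        if words and pos in ("助動詞", "助詞"):  # auxiliary verb, particle
            words[-1] += m.surface()
        else:
            words.append(m.surface())
    return words

# Exact grouping depends on the dictionary's POS scheme, but the idea is that
# morpheme sequences like "し ちゃっ て" come back merged into "しちゃって".
print(phonological_words("連絡しちゃってごめん"))
```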
Given that a morphological parser has its own lattice from which the best path is extracted, it'd be nice to use that lattice as the starting point, compose it with an FST that does the post-processing into phonological words, and then compose that with the pronunciation dictionary. The latest versions of sudachipy don't return lattices or expose any internal methods to Python, so I'm still looking for a permanent solution.
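To illustrate what I mean, here's a toy pynini sketch of that composition chain. The symbol tables, the single-path "lattice", and the one-entry lexicon are all made up for illustration; a real lattice would be a weighted acceptor with alternative segmentations.

```python
import pynini

# Toy word-level and phone-level symbol tables.
words = pynini.SymbolTable()
for tok in ("<eps>", "し", "ちゃっ", "て", "しちゃって"):
    words.add_symbol(tok)

phones = pynini.SymbolTable()
for p in ("<eps>", "sh", "i", "ch", "a", "Q", "t", "e"):
    phones.add_symbol(p)

# "Lattice": here just the single best path from the parser.
lattice = pynini.accep("し ちゃっ て", token_type=words)

# Post-processing FST that rewrites the morpheme sequence as one
# phonological word.
merge = pynini.cross(
    pynini.accep("し ちゃっ て", token_type=words),
    pynini.accep("しちゃって", token_type=words),
)

# Toy pronunciation dictionary entry: phonological word -> phone sequence.
lexicon = pynini.cross(
    pynini.accep("しちゃって", token_type=words),
    pynini.accep("sh i ch a Q t e", token_type=phones),
)

# lattice ∘ merge ∘ lexicon gives the utterance FST over phones.
utterance = lattice @ merge @ lexicon
for phone_string in utterance.paths(
    input_token_type=words, output_token_type=phones
).ostrings():
    print(phone_string)  # "sh i ch a Q t e"
```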
For all of its FSTs, MFA uses pynini, the Python bindings for OpenFst (like here). I saw that janome has a pure Python implementation of FSTs, and I was curious whether there's interest in adding or migrating to a pynini implementation, which should simplify it considerably and allow MFA to directly use any lattices.
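As a rough, hypothetical illustration of what dictionary matching looks like when expressed as pynini composition (toy entries and a made-up surface/POS output format, not janome's real dictionary format or API): every path through the composition is one candidate segmentation, i.e., exactly the kind of lattice MFA would want to consume.

```python
import pynini

# Hypothetical toy entries, just to illustrate the shape of the problem.
entries = {"すもも": "NOUN", "もも": "NOUN", "も": "PARTICLE"}

# One transducer per entry (surface -> "surface/POS "), unioned and closed
# so that any concatenation of entries is accepted.
lexicon = pynini.union(
    *(pynini.cross(surface, f"{surface}/{tag} ") for surface, tag in entries.items())
).closure()

sentence = pynini.accep("すもももももも")  # byte-level acceptor over the input

# Each path through the composition is one candidate segmentation; a
# tokenizer would keep the whole set (the lattice) and score the paths.
for segmentation in pynini.compose(sentence, lexicon).paths().ostrings():
    print(segmentation)
```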
If there is interest, I'm happy to put together an initial PR for it!
Hi @mmcauliffe,
Sorry for the late reply. I've been too busy in recent days to be involved in this issue.
I'm not very familiar with the speech-to-text domain, but it sounds exciting!
Janome has a "no-dependencies" policy for flexibility and ease of future maintenance. I'm just curious - is it possible to re-implement Pynini in Janome? Or do you think it'd be better to have a fork of Janome for MFA (a variant that integrates Pynini as the string-matching engine)?