Hi!
Many thanks for your amazing, easy-to-use STT product!
I have yet to learn how to use your text models, but STT seems to work really well out of the box.
My language is Russian, and you may know that it features a great deal of obscene words that people commonly use in some contexts.
In our use case we have to recognize these words as well as ordinary words.
It looks like your language model on top of the acoustic model does not know them.
We could add our own language model, but in that case we would need the raw acoustic model outputs.
Is this somehow possible with the current API?
It looks like pywit is just a requests wrapper, and 99% of the work is done server-side.
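For reference, a minimal sketch of the equivalent raw HTTP call with requests; the token and API version date are placeholders, and the response contains only the final transcript, not raw acoustic scores:

```python
import requests

WIT_TOKEN = "YOUR_SERVER_ACCESS_TOKEN"  # placeholder

# Stream a WAV file to the Wit speech endpoint, same as pywit does internally.
with open("sample.wav", "rb") as f:
    resp = requests.post(
        "https://api.wit.ai/speech",
        params={"v": "20200513"},  # example API version date
        headers={
            "Authorization": f"Bearer {WIT_TOKEN}",
            "Content-Type": "audio/wav",
        },
        data=f,
    )
print(resp.text)  # transcript JSON only; no per-frame acoustic outputs
```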
Personalized language models are something we want to support down the road. I'll share your input with the team. In the meantime, you can use the voice inbox to correct the transcripts.
Turns out there are much simpler ways to check data at scale:
Check via calculating WER against another source of annotation;
Check the number of words / number of letters vs. the duration of the clips - there should be a direct correlation; if there is none, the STT quality is low;
Drop clips that have fewer than 2 words or 10 symbols;
Drop clips that contain special symbols, Latin symbols, etc.;
A combination of these basically lets you build fast heuristics to keep only the most relevant texts; see the sketch below.
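A minimal Python sketch of these heuristics, assuming plain-text transcripts and clip durations in seconds. The Cyrillic-only regex, the characters-per-second band, and the threshold values are illustrative assumptions to tune on your own data, not values from this thread:

```python
import re

def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between the first i ref words and first j hyp words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / max(len(r), 1)

# Anything outside Cyrillic, whitespace, and basic punctuation counts as
# a "special symbol" (this also catches Latin letters).
SPECIAL = re.compile(r"[^а-яё\s.,!?\-]", re.IGNORECASE)

def keep_clip(text: str, duration_s: float,
              min_words: int = 2, min_chars: int = 10,
              min_cps: float = 4.0, max_cps: float = 25.0) -> bool:
    """Fast filter: drop clips that are too short, contain special or
    Latin symbols, or whose text length does not track clip duration
    (characters per second outside a plausible speaking-rate band)."""
    if len(text.split()) < min_words or len(text) < min_chars:
        return False
    if SPECIAL.search(text):
        return False
    cps = len(text) / max(duration_s, 1e-6)
    return min_cps <= cps <= max_cps

if __name__ == "__main__":
    print(wer("мама мыла раму", "мама мыла рамы"))         # ~0.33
    print(keep_clip("привет, как дела?", duration_s=1.5))  # True
```

Running keep_clip over a manifest plus wer against a second annotation source gives a cheap per-clip quality score without listening to any audio.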