I tried building ASR systems on a very common standard task (LibriSpeech-100h) using the torchaudio CTC decoder, which uses the flashlight/text library as its decoding backend. While my subword (BPE) based setups worked fine, the phoneme-based ones did not.
The standard LibriSpeech lexicon includes, for example, these 7 words, which all map to the same phone sequence in ARPAbet notation:
BAE B AY#
BAI B AY#
BI B AY#
BUY B AY#
BY B AY#
BY' B AY#
BYE B AY#
As a result, the word BY, for example, was no longer recognized.
In the log I get the message:
[Trie] Trie label number reached limit: 6
The message correctly reports that the limit was applied, but I would like to point out that this limit is very low and cannot be changed without recompiling. The message also did not look like a serious issue to me at first.
Reproduction Steps
Use the torchaudio ctc_decoder with a phoneme-based lexicon in which more than 6 words share the same phone sequence.
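To check whether a given lexicon is affected before decoding, the sketch below groups words by pronunciation and reports every phone sequence shared by more than 6 words. It is only an illustration: the function name `find_overfull_homophones` and the constant `TRIE_LABEL_LIMIT` are made up here, the value 6 is taken from the log message above, and the input is assumed to be whitespace-separated `WORD PHONE PHONE ...` lines. It does not call torchaudio or flashlight/text.

```python
from collections import defaultdict

# Assumption: 6 is the limit reported in the "[Trie]" log message.
TRIE_LABEL_LIMIT = 6


def find_overfull_homophones(lexicon_lines, limit=TRIE_LABEL_LIMIT):
    """Return {pronunciation: [words]} for pronunciations shared by more than `limit` words."""
    words_by_pron = defaultdict(list)
    for line in lexicon_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines and entries without a pronunciation
        word, pron = parts[0], " ".join(parts[1:])
        if word not in words_by_pron[pron]:
            words_by_pron[pron].append(word)
    return {pron: words for pron, words in words_by_pron.items() if len(words) > limit}


# The seven entries quoted in this issue.
lexicon = """\
BAE B AY#
BAI B AY#
BI B AY#
BUY B AY#
BY B AY#
BY' B AY#
BYE B AY#
"""

overfull = find_overfull_homophones(lexicon.splitlines())
for pron, words in overfull.items():
    print(f"{pron}: {len(words)} words: {' '.join(words)}")
```

For a full lexicon file, the same function can be applied to the open file object, for example `find_overfull_homophones(open(path, encoding="utf-8"))`, where `path` is wherever the lexicon is stored.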
Hello, is there still interest in discussing this or getting it fixed? With the proposed fix, the decoder compares really well to our own decoder implementation, and given how simple it is to use, I would like to use it for a scientific publication. Currently I provide a patch file with the setup / container image, which works, but I would prefer to have this fixed directly in the repository.
If there is interest, I can open the PR, but first I want to clarify whether there is a reason for this limit that I am not aware of.