-
Notifications
You must be signed in to change notification settings - Fork 5.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extending 'lattice-compose.cc' to compose with ark of fsts, #4692
Extending 'lattice-compose.cc' to compose with ark of fsts, #4692
Conversation
- This is a follow-up of kaldi-asr#4571 - Refactoring 'lattice-compose.cc' to support composition with ark of fsts, so that it is done as Dan suggested before: I am thinking this can be done with a string arg called e.g. "--compose-with-fst", defaulting to "auto" which is the old behavior, meaning: rspecifier=lats, rxfilename=FST; and true/True or false/False is FST or lattice respectively. - I added there possibility of rho-composition, which is useful for biasing lattices with word-sequences. Thanks to rho-composition, the biasing graph does not need to contain all words from lexicon. - Would you be interested in an example how to use this? (i.e. create graphs from text file with python script using openfst as library, but that would need to change build of openfst to enable python extensions) - Also which 'egs' recipe would be convenient to use it with?
src/latbin/lattice-compose.cc
Outdated
@@ -121,7 +164,23 @@ int main(int argc, char *argv[]) { | |||
} | |||
} | |||
delete fst2; | |||
} else { | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Would you mind formatting the else in a more standard way that won't upset auto-formatters?
E.g.
} else {
/* comment here */
}
.. I see why you did what you did, but let's be consistent with other code.
Are you confident that this won't break existing code?
src/latbin/lattice-compose.cc
Outdated
@@ -39,22 +39,34 @@ int main(int argc, char *argv[]) { | |||
"or lattices with FSTs (rspecifiers are assumed to be lattices, and\n" | |||
"rxfilenames are assumed to be FSTs, which have their weights interpreted\n" | |||
"as \"graph weights\" when converted into the Lattice format.\n" | |||
"Or, rspecifier can be ark of biasing FSTs, see --compose-with-fst=true.\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole message is not very clear, especially with the modification. Can write:
Depending on the command-line arguments, either composes lattices with lattices,
or lattices with a single FST or multiple FSTs (whose weights are interpreted as "graph weights").
Then change the lines below to:
Usage: lattice-compose [options] <lattice-rspecifier1> "
"<lattice-rspecifier2|fst-rxfilename2|fst-rspecifier2> <lattice-wspecifier>\n"
If the 2nd arg is an rspecifier, it is interpreted by default as a table of lattices,
or as a table of FSTs if you specify --compose-with-fst=true.
src/latbin/lattice-compose.cc
Outdated
{ // convert 'compose_with_fst' to lowercase to support: true, True, TRUE | ||
std::string tmp_lc(compose_with_fst); | ||
std::transform(compose_with_fst.begin(), compose_with_fst.end(), | ||
tmp_lc.begin(), ::tolower); // lc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tmp_lc.begin(), ::tolower); // lc | |
compose_with_fst.begin(), std::tolower); // lc |
It can be an in-place operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see a strange behavior, the std::tolower
version produces error:
/usr/local/include/c++/9.4.0/bits/stl_algo.h:4369:5: note: template argument deduction/substitution failed:
lattice-compose.cc:87:71: note: candidate expects 5 arguments, 4 provided
87 | std::transform(str.begin(), str.end(), str.begin(), std::tolower); // lc
i found that i can go-around it by wrapping it into lambda:
[](unsigned char c){ return std::tolower(c);}
but that looks quite complicated compared to ::tolower
which compiles well,
and i saw it used in: util/parse-options.cc
and nnet/nnet-component.cc
...
src/latbin/lattice-compose.cc
Outdated
* arg2 is rspecifier that contains a table of lattices | ||
* - composing arg1 lattices with arg2 lattices | ||
*/ | ||
else if (not arg2_is_rxfilename && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else if (not arg2_is_rxfilename && | |
else if (!arg2_is_rxfilename && |
was there a message why the 4-param std::transform cannot be used? I wonder
if there should be push_back_iterator(tmp_lc.begin())?
y.
…On Thu, Jan 27, 2022 at 6:34 PM Karel Vesely ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/latbin/lattice-compose.cc
<#4692 (comment)>:
> @@ -63,14 +75,30 @@ int main(int argc, char *argv[]) {
}
KALDI_ASSERT(phi_label > 0 || phi_label == fst::kNoLabel); // e.g. 0 not allowed.
+ KALDI_ASSERT(rho_label > 0 || rho_label == fst::kNoLabel); // e.g. 0 not allowed.
+ if (phi_label > 0 && rho_label > 0) {
+ KALDI_ERR << "You cannot set both 'phi_label' and 'rho_label' at the same time.";
+ }
+
+ { // convert 'compose_with_fst' to lowercase to support: true, True, TRUE
+ std::string tmp_lc(compose_with_fst);
+ std::transform(compose_with_fst.begin(), compose_with_fst.end(),
+ tmp_lc.begin(), ::tolower); // lc
I see a strange behavior, the std::tolower version produces error:
/usr/local/include/c++/9.4.0/bits/stl_algo.h:4369:5: note: template argument deduction/substitution failed:
lattice-compose.cc:87:71: note: candidate expects 5 arguments, 4 provided
87 | std::transform(str.begin(), str.end(), str.begin(), std::tolower); // lc
i found that i can go-around it by wrapping it into lambda:
[](unsigned char c){ return std::tolower(c);}
but that looks quite complicated compared to ::tolower which compiles
well,
and i saw it used in: util/parse-options.cc and nnet/nnet-component.cc ...
—
Reply to this email directly, view it on GitHub
<#4692 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACUKYXZH7LLKQPUXGTLGEFDUYF6Y5ANCNFSM5MZFUJEA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because your review was requested.Message ID:
***@***.***>
|
it seems like the symbol |
the messages are:
and
|
okay, i'll put there the lambda, i did not find any better solution... |
interesting, this one works too: |
…of overloaded function
For the testing, i did test the variant: For the other combinations So, out of these 9 combinations, should i test everything? |
also, normally, in lattices there is normally no phi or rho symbol, but the PR code allows it... |
I did some of the tests. I did the variants:
In all these cases it produces lattice ark of about the same size as the input. Otherwise, all the previous remarks to the code sholud be resolved now. |
src/latbin/lattice-compose.cc
Outdated
po.Register("write-compact", &write_compact, "If true, write in normal (compact) form."); | ||
po.Register("phi-label", &phi_label, "If >0, the label on backoff arcs of the LM"); | ||
po.Register("rho-label", &rho_label, | ||
"If >0, the label to forward fst1 paths not present in biasing graph fst2 " | ||
"(rho is input and output symbol on special arc in biasing graph)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps could specify that rho is like phi but the label is rewritten to the specific symbol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, done
OK, I'm going to merge this so that we have some forward movement. |
This is a follow-up of: On the fly contextual adaptation, #4571
Refactoring 'lattice-compose.cc' to support composition with ark
of fsts, so that it is done as Dan suggested before:
I added there possibility of rho-composition, which is useful for
biasing lattices with word-sequences. Thanks to rho-composition,
the biasing graph does not need to contain all words from lexicon.
Would you be interested in an example how to use this?
(i.e. create graphs from text file with python script
using openfst as library, but that would need to change
build of openfst to enable python extensions)
Also which 'egs' recipe would be convenient to use it with?