Hi Sanxing, thank you for sharing this script!
I ran your `preprocess.py` (which cleans empty lines; I did not run the whole `prepare.sh`) and then used `fast_align` to learn an alignment model on the parallel corpus.
I found that the perplexity of the alignments given by the alignment model is higher than for the parallel corpus preprocessed by another script, wmt.py.
I guess this is because that script merges the blank lines.
So could you possibly add this `merge blank lines` function to your script in the future? Thanks a lot!