OpenSeq2Seq

OpenSeq2Seq: toolkit for distributed and mixed precision training of sequence-to-sequence models

OpenSeq2Seq's main goal is to allow researchers to explore sequence-to-sequence models as effectively as possible. The efficiency is achieved by fully supporting distributed and mixed-precision training. OpenSeq2Seq is built using TensorFlow and provides all the necessary building blocks for training encoder-decoder models for neural machine translation, automatic speech recognition, speech synthesis, and language modeling.

Documentation and installation instructions

https://nvidia.github.io/OpenSeq2Seq/

Instructions for running inference from this forked repository

Step 1: Docker setup

First, follow the steps from this link to run the container required by the OpenSeq2Seq toolkit. If you are using a VM instance, make sure it is a GPU instance with a P100 or V100.

Step 2: Keep the Docker container running as your development environment

Step 3: Installing OpenSeq2Seq for Speech Recognition

Install requirements:

git clone https://github.com/swapnil3597/OpenSeq2Seq/
cd OpenSeq2Seq
pip install -r requirements.txt

Install CTC decoders:

bash scripts/install_decoders.sh
python scripts/ctc_decoders_test.py

All of the above instructions are also available here

Step 4: Downloading the Acoustic Model (Jasper) and Language Model

Find links to the latest acoustic model checkpoint and config file here

To download from a Google Drive link, run:

pip3 install gdown
gdown "https://drive.google.com/uc?id=12CQvNrTvf0cjTsKjbaWWvdaZb7RxWI6X&export=download"  # Example only; use the latest Drive link for the Jasper checkpoint
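Equivalently, you can download from Python with gdown's API. A minimal sketch; the output filename here is an assumption, and the file ID is the same example as above:

import gdown

# Same example file ID as above; substitute the latest Jasper checkpoint link.
url = "https://drive.google.com/uc?id=12CQvNrTvf0cjTsKjbaWWvdaZb7RxWI6X"
# The output filename is an assumption; name it whatever suits your setup.
gdown.download(url, output="jasper_checkpoint", quiet=False)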

To download the language model, run:

bash scripts/install_kenlm.sh
bash scripts/download_lm.sh

After running these commands, a language_model/ directory will be created containing the binary file for the 4-gram ARPA language model.
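To sanity-check the download, here is a minimal sketch; the exact filename of the binary inside language_model/ is an assumption, so adjust it to whatever download_lm.sh produces:

import glob
import os

# download_lm.sh should leave a KenLM .binary file in language_model/
# (the exact filename is an assumption; list the directory to confirm).
candidates = glob.glob(os.path.join("language_model", "*.binary"))
assert candidates, "No .binary language model found in language_model/"
print("Language model binary:", candidates[0])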

Step 5: Running Inference

First, make sure the run_inference.sh script provides the correct --config and --logdir paths for the acoustic model (Jasper).

There are two ways to run inference:

1. With Greedy Decoder: Make sure that the "decoder_params" section of the config file contains the line 'infer_logits_to_pickle': False, and that the "dataset_files" field of the "infer_params" section points to the target CSV file (see the config sketch after this list). Then run:

bash run_inference.sh  # The decoded output is written to model_output.pickle

2. With Language Model Rescoring: In run_decoding.sh, provide the correct language model binary path in --lm. Make sure that the "decoder_params" section of the config file contains the line 'infer_logits_to_pickle': True, and that the "dataset_files" field of the "infer_params" section points to the target CSV file. Then run:

bash run_inference.sh  # The acoustic model logits are written to model_output.pickle
# To decode the logits, run:
bash run_decoding.sh
# With --mode 'infer', the output is written to --infer_output_file 'inference_output_lm.csv'
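For reference, here is a minimal sketch of the two config variants and of loading the outputs. The exact nesting of the config fields, the internal structure of model_output.pickle, and the CSV column layout are assumptions based on the step descriptions above, so inspect the actual files before relying on them:

# In the Jasper config file (a Python module), per the steps above:
#   "decoder_params": { ..., 'infer_logits_to_pickle': False }  # greedy decoding
#   "decoder_params": { ..., 'infer_logits_to_pickle': True }   # dump logits for LM rescoring
#   "infer_params":   { ..., "dataset_files": ["path/to/target.csv"] }  # exact nesting may differ

import csv
import pickle

# Load whatever run_inference.sh wrote out
# (the pickle's internal structure is an assumption; inspect it first).
with open("model_output.pickle", "rb") as f:
    output = pickle.load(f)
print(type(output))

# After run_decoding.sh, read the LM-rescored transcripts
# (column layout is an assumption; check the header row).
with open("inference_output_lm.csv") as f:
    for row in csv.reader(f):
        print(row)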

Features

  1. Models for:
    1. Neural Machine Translation
    2. Automatic Speech Recognition
    3. Speech Synthesis
    4. Language Modeling
    5. NLP tasks (sentiment analysis)
  2. Data-parallel distributed training
    1. Multi-GPU
    2. Multi-node
  3. Mixed precision training for NVIDIA Volta/Turing GPUs

Software Requirements

  1. Python >= 3.5
  2. TensorFlow >= 1.10
  3. CUDA >= 9.0, cuDNN >= 7.0
  4. Horovod >= 0.13 (using Horovod is not required, but is highly recommended for multi-GPU setup)
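A quick sketch to verify the environment meets these requirements; the TF 1.x API is assumed, per the version range above:

import sys
import tensorflow as tf

# Python >= 3.5 and TensorFlow >= 1.10, per the requirements above.
assert sys.version_info >= (3, 5), "Python >= 3.5 is required"
print("TensorFlow version:", tf.__version__)

# In TF 1.x this also exercises the CUDA/cuDNN installation.
print("GPU available:", tf.test.is_gpu_available())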

Acknowledgments

The speech-to-text workflow uses some parts of the Mozilla DeepSpeech project.

The beam search decoder with language model re-scoring (in decoders/) is based on Baidu DeepSpeech.

The text-to-text workflow uses some functions from Tensor2Tensor and the Neural Machine Translation (seq2seq) Tutorial.

Disclaimer

This is a research project, not an official NVIDIA product.

Related resources

Paper

If you use OpenSeq2Seq, please cite the following paper:

@misc{openseq2seq,
    title={Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq},
    author={Oleksii Kuchaiev and Boris Ginsburg and Igor Gitman and Vitaly Lavrukhin and Jason Li and Huyen Nguyen and Carl Case and Paulius Micikevicius},
    year={2018},
    eprint={1805.10387},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
