Listen, Attend and Spell - PyTorch Implementation

My first speech recognition project. This is a PyTorch implementation of Listen, Attend and Spell (LAS), based on Alexander-H-Liu's repository.

Requirements

Chinese Mandarin corpus

Pretrained models (not supported)

Setup

Download the four datasets and preprocess them

├── audio_data
│   ├── data_thchs30
│   │   ├── data
│   │   ├── train
│   │   │   ├── ...
│   ├── data_aishell
│   │   ├── transcript
│   │   ├── wav
│   │   │   ├── ...
│   ├── primewords_md_2018_set1
│   │   ├── audio_files
│   │   ├── set1_transcript.json
│   ├── ST-CMDS-20170001_1-OS
│   │   │   ├── ...
│   ├── ...
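
Before preprocessing, it can help to confirm that the four corpus folders sit where the layout above expects them. The following is a minimal sketch, not part of this repository: the helper name `missing_dataset_dirs` and the default root `audio_data` are assumptions taken from the tree above, and only the top-level folders are checked.

```python
from pathlib import Path

# Top-level corpus folders, copied from the directory tree above.
EXPECTED_DIRS = [
    "data_thchs30",
    "data_aishell",
    "primewords_md_2018_set1",
    "ST-CMDS-20170001_1-OS",
]


def missing_dataset_dirs(root="audio_data"):
    """Return the expected corpus folders that are absent under `root`.

    Hypothetical helper: the repository's scripts may locate the data differently.
    """
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All four dataset folders found.")
```

An empty list means all four folders exist; the contents of each folder are not validated.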

First, run the util/dict_zh_words.py script to generate the Chinese dictionary. Next, run the util/preprocess_all_datasets.py script, which reads all of the datasets and creates four pickle files. Finally, run the util/load_datasets.py script.

 $ python util/dict_zh_words.py
 $ python util/preprocess_all_datasets.py 
 $ python util/load_datasets.py 
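
To give a rough idea of what a dictionary step like this typically produces, here is a minimal sketch of a character-level vocabulary for Mandarin transcripts. It is an illustration under assumptions, not the code in util/dict_zh_words.py: the function names, the special tokens, and the frequency-based ordering are all hypothetical.

```python
from collections import Counter

# Assumed special tokens; the repository's actual dictionary may use others.
SPECIAL_TOKENS = ["<pad>", "<sos>", "<eos>", "<unk>"]


def build_char_dict(transcripts):
    """Map each character seen in the transcripts to an integer id."""
    counts = Counter(
        ch for line in transcripts for ch in line if not ch.isspace()
    )
    # Most frequent characters first; ties broken by code point for determinism.
    chars = sorted(counts, key=lambda c: (-counts[c], c))
    return {token: idx for idx, token in enumerate(SPECIAL_TOKENS + chars)}


def encode(text, char_to_id):
    """Turn a transcript into ids, wrapped in <sos> ... <eos>."""
    unk = char_to_id["<unk>"]
    body = [char_to_id.get(ch, unk) for ch in text if not ch.isspace()]
    return [char_to_id["<sos>"]] + body + [char_to_id["<eos>"]]
```

Characters that never appeared in the training transcripts fall back to `<unk>`, so `encode` also works on unseen test sentences.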

Start training

bash train.sh

Evaluate on test split

Acknowledgements

Thanks to the original LAS authors, Alexander-H-Liu, and the awesome PyTorch team.