My first speech recognition project: a PyTorch implementation of Listen, Attend and Spell (LAS), based on Alexander-H-Liu's repository.
- Python 3
- PyTorch 1.0.0
- python_speech_features
- editdistance
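The Python dependencies can be installed with pip, for example (the exact PyTorch install command may vary with your platform and CUDA version):

$ pip install torch python_speech_features editdistance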
├── audio_data
│ ├── data_thchs30
│ │ ├── data
│ │ ├── train
│ │ │ ├── ...
│ ├── data_aishell
│ │ ├── transcript
│ │ ├── wav
│ │ │ ├── ...
│ ├── primewords_md_2018_set1
│ │ ├── audio_files
│ │ ├── set1_transcript.json
│ ├── ST-CMDS-20170001_1-OS
│ │ ├── ...
│ ├── ...
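As a rough illustration of what the preprocessing scripts have to handle, THCHS-30 stores each utterance as a wav file with a `.trn` transcript alongside it. Below is a minimal sketch of collecting those (wav, transcript) pairs; the layout assumption (transcript on the first line of `*.wav.trn`) and the function name are illustrative, not the exact logic in `util/preprocess_all_datasets.py`:

```python
# Hypothetical sketch: collect (wav, transcript) pairs from data_thchs30.
# Assumes each *.wav in data_thchs30/data has a matching *.wav.trn whose
# first line is the Chinese character transcript (an assumption, not the
# repository's exact parsing logic).
from pathlib import Path

def collect_thchs30_pairs(root="audio_data/data_thchs30/data"):
    pairs = []
    for trn in sorted(Path(root).glob("*.wav.trn")):
        wav = trn.with_suffix("")          # strip ".trn" -> ".../xxx.wav"
        with open(trn, encoding="utf-8") as f:
            text = f.readline().strip()    # first line: character transcript
        pairs.append((str(wav), text))
    return pairs

if __name__ == "__main__":
    pairs = collect_thchs30_pairs()
    print(len(pairs), "utterances found")
```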
First, run the util/dict_zh_words.py
script to generate the Chinese word dictionary. Then run the util/preprocess_all_datasets.py
script, which reads in all of the datasets above and creates four pickle files. Finally, run the util/load_datasets.py
script.
$ python util/dict_zh_words.py
$ python util/preprocess_all_datasets.py
$ python util/load_datasets.py
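Preprocessing relies on python_speech_features for acoustic features. The sketch below extracts log Mel-filterbank features from a single wav file; the feature type, filter count, and example path are assumptions and not necessarily what preprocess_all_datasets.py uses:

```python
# Sketch only: log Mel-filterbank feature extraction for one utterance.
# The actual feature configuration in the preprocessing script may differ.
import scipy.io.wavfile as wav
from python_speech_features import logfbank

def extract_fbank(wav_path, n_filters=40):
    sample_rate, signal = wav.read(wav_path)   # 16 kHz mono expected
    # Returns a (num_frames, n_filters) array of log filterbank energies
    feats = logfbank(signal, samplerate=sample_rate, nfilt=n_filters)
    return feats

if __name__ == "__main__":
    feats = extract_fbank("audio_data/data_thchs30/data/A11_0.wav")  # example path
    print(feats.shape)
```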
$ bash train.sh
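After training, decoding quality for Chinese ASR is typically reported as character error rate (CER); the editdistance dependency can be used for this, along the lines below (the repository's own evaluation code may differ):

```python
# Sketch: character error rate (CER) via the editdistance package.
import editdistance

def cer(reference, hypothesis):
    """Edit distance between character sequences, normalized by reference length."""
    ref, hyp = list(reference), list(hypothesis)
    return editdistance.eval(ref, hyp) / max(len(ref), 1)

print(cer("今天天气很好", "今天天气真好"))  # one substitution out of six characters ≈ 0.167
```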
Thanks to the authors of the original LAS paper, Alexander-H-Liu, and the awesome PyTorch team.