Various docs fixes: typos, changing urls to relative links (#8685)
* fix typos

Signed-off-by: Elena Rastorgueva <[email protected]>

* rename Further information to NeMo ASR Documentation

Signed-off-by: Elena Rastorgueva <[email protected]>

* fix malformed tables in asr lm docs

Signed-off-by: Elena Rastorgueva <[email protected]>

* fix some :doc: links that weren't working

Signed-off-by: Elena Rastorgueva <[email protected]>

* change doc urls in docs to relative links using :doc: or :ref:

Signed-off-by: Elena Rastorgueva <[email protected]>

* change AAYN asr bib key so it's not the same as the nlp bib key

Signed-off-by: Elena Rastorgueva <[email protected]>

---------

Signed-off-by: Elena Rastorgueva <[email protected]>
erastorgueva-nv authored Mar 17, 2024
1 parent 86e331c commit 6cbaf37
Showing 17 changed files with 68 additions and 68 deletions.
2 changes: 1 addition & 1 deletion docs/source/asr/api.rst
@@ -1,4 +1,4 @@
NeMo ASR collection API
NeMo ASR Collection API
=======================


2 changes: 1 addition & 1 deletion docs/source/asr/asr_all.bib
@@ -1034,7 +1034,7 @@ @misc{park2022multi
copyright = {Creative Commons Attribution 4.0 International}
}

@inproceedings{vaswani2017attention,
@inproceedings{vaswani2017aayn,
title={Attention is all you need},
author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
booktitle={Advances in Neural Information Processing Systems},
82 changes: 41 additions & 41 deletions docs/source/asr/asr_language_modeling_and_customization.rst
@@ -76,27 +76,27 @@ it is stored at the path specified by `kenlm_model_file`.

The following is the list of the arguments for the training script:

+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| **Argument** | **Type** | **Default** | **Description** |
+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| nemo_model_file | str | Required | The path to `.nemo` file of the ASR model, or name of a pretrained NeMo model to extract a tokenizer. |
+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| train_paths | List[str] | Required | List of training files or folders. Files can be a plain text file or ".json" manifest or ".json.gz". |
+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| kenlm_model_file | str | Required | The path to store the KenLM binary model file. |
+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| kenlm_bin_path | str | Required | The path to the bin folder of KenLM. It is a folder named `bin` under where KenLM is installed. |
+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| ngram_length** | int | Required | Specifies order of N-gram LM. |
+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| ngram_prune | List[int] | [0] | List of thresholds to prune N-grams. Example: [0,0,1]. See Pruning section on the https://kheafield.com/code/kenlm/estimation |
+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| cache_path | str | "" | Cache path to save tokenized files. |
+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| preserve_arpa | bool | ``False`` | Whether to preserve the intermediate ARPA file after construction of the BIN file. |
+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| verbose | int | 1 | Verbose level. |
+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
+------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| **Argument** | **Type** | **Default** | **Description** |
+------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| nemo_model_file | str | Required | The path to `.nemo` file of the ASR model, or name of a pretrained NeMo model to extract a tokenizer. |
+------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| train_paths | List[str] | Required | List of training files or folders. Files can be a plain text file or ".json" manifest or ".json.gz". |
+------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| kenlm_model_file | str | Required | The path to store the KenLM binary model file. |
+------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| kenlm_bin_path | str | Required | The path to the bin folder of KenLM. It is a folder named `bin` under where KenLM is installed. |
+------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| ngram_length** | int | Required | Specifies order of N-gram LM. |
+------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| ngram_prune | List[int] | [0] | List of thresholds to prune N-grams. Example: [0,0,1]. See Pruning section on the https://kheafield.com/code/kenlm/estimation |
+------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| cache_path | str | ``""`` | Cache path to save tokenized files. |
+------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| preserve_arpa | bool | ``False`` | Whether to preserve the intermediate ARPA file after construction of the BIN file. |
+------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
| verbose | int | 1 | Verbose level. |
+------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+

** Note: Recommend to use 6 as the order of the N-gram model for BPE-based models. Higher orders may need the re-compilation of KenLM to support it.

@@ -184,7 +184,7 @@ The following is the list of the important arguments for the evaluation script:
+--------------------------------------+----------+------------------+-------------------------------------------------------------------------+
| text_processing.do_lowercase | bool | ``False`` | Whether to make the training text all lower case. |
+--------------------------------------+----------+------------------+-------------------------------------------------------------------------+
| text_processing.punctuation_marks | str | "" | String with punctuation marks to process. Example: ".\,?" |
| text_processing.punctuation_marks | str | ``""`` | String with punctuation marks to process. Example: ".\,?" |
+--------------------------------------+----------+------------------+-------------------------------------------------------------------------+
| text_processing.rm_punctuation | bool | ``False`` | Whether to remove punctuation marks from text. |
+--------------------------------------+----------+------------------+-------------------------------------------------------------------------+
@@ -527,25 +527,25 @@ The following is the list of the arguments for the opengrm script:
| kenlm_bin_path | str | Required | The path to the bin folder of KenLM library. It is a folder named `bin` under where KenLM is installed. |
+----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
| ngram_bin_path | str | Required | The path to the bin folder of OpenGrm Ngram. It is a folder named `bin` under where OpenGrm Ngram is installed. |
+----------------------+--------+------------------+-------------------------------------------------------------------------+
| arpa_a | str | Required | Path to the ARPA N-gram model file A |
+----------------------+--------+------------------+-------------------------------------------------------------------------+
| alpha | float | Required | Weight of N-gram model A |
+----------------------+--------+------------------+-------------------------------------------------------------------------+
| arpa_b | int | Required | Path to the ARPA N-gram model file B |
+----------------------+--------+------------------+-------------------------------------------------------------------------+
| beta | float | Required | Weight of N-gram model B |
+----------------------+--------+------------------+-------------------------------------------------------------------------+
| out_path | str | Required | Path for writing temporary and resulting files. |
+----------------------+--------+------------------+-------------------------------------------------------------------------+
| test_file | str | None | Path to test file to count perplexity if provided. |
+----------------------+--------+------------------+-------------------------------------------------------------------------+
| symbols | str | None | Path to symbols (.syms) file. Could be calculated if it is not provided.|
+----------------------+--------+------------------+-------------------------------------------------------------------------+
| nemo_model_file | str | None | The path to '.nemo' file of the ASR model, or name of a pretrained NeMo model. |
+----------------------+--------+------------------+-------------------------------------------------------------------------+
| force | bool | ``False`` | Whether to recompile and rewrite all files |
+----------------------+--------+------------------+-------------------------------------------------------------------------+
+----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
| arpa_a | str | Required | Path to the ARPA N-gram model file A |
+----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
| alpha | float | Required | Weight of N-gram model A |
+----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
| arpa_b | int | Required | Path to the ARPA N-gram model file B |
+----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
| beta | float | Required | Weight of N-gram model B |
+----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
| out_path | str | Required | Path for writing temporary and resulting files. |
+----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
| test_file | str | None | Path to test file to count perplexity if provided. |
+----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
| symbols | str | None | Path to symbols (.syms) file. Could be calculated if it is not provided. |
+----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
| nemo_model_file | str | None | The path to '.nemo' file of the ASR model, or name of a pretrained NeMo model. |
+----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
| force | bool | ``False`` | Whether to recompile and rewrite all files |
+----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+


******************
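Note on the table fixes in this file: a reStructuredText grid table only parses when every border line and content row has the same length, with ``+`` and ``|`` at the same columns. The old rows above broke that rule (for example, ``List[str]`` was wider than the ``Type`` column border). Below is a minimal sketch of a checker for that rule, assuming the table is passed in as a list of lines and has no row or column spans. ``find_misaligned_rows`` and ``plus_columns`` are hypothetical helpers written for illustration; they are not part of NeMo, Sphinx, or docutils.

```python
def plus_columns(border: str) -> list[int]:
    """Indices of the '+' corner characters in a grid-table border line."""
    return [i for i, ch in enumerate(border) if ch == "+"]


def find_misaligned_rows(table_lines: list[str]) -> list[int]:
    """Return 1-based numbers of lines that disagree with the first border.

    Simplification (assumption): no row or column spans, so every content
    row must have '|' at each column boundary of the first border.
    """
    lines = [line.rstrip() for line in table_lines]
    if not lines:
        return []
    expected = plus_columns(lines[0])
    width = len(lines[0])
    bad = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("+"):
            # Border line: corners must sit at the same columns.
            aligned = len(line) == width and plus_columns(line) == expected
        else:
            # Content row: a '|' must sit under every corner.
            aligned = len(line) == width and all(line[i] == "|" for i in expected)
        if not aligned:
            bad.append(number)
    return bad


# Shape of the problem fixed in this commit: the cell is wider than its border.
old_table = [
    "+-------------+----------+",
    "| train_paths | List[str] |",
    "+-------------+----------+",
]
new_table = [
    "+-------------+-----------+",
    "| train_paths | List[str] |",
    "+-------------+-----------+",
]
print(find_misaligned_rows(old_table))
print(find_misaligned_rows(new_table))
```

A check like this could run over the docs before a Sphinx build, but it is only a heuristic; the Sphinx build itself remains the authoritative test.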
4 changes: 2 additions & 2 deletions docs/source/asr/examples/kinyarwanda_asr.rst
@@ -429,7 +429,7 @@ Training from scratch and finetuning
ASR models
##########

Our goal was to train two ASR models with different architectures: `Conformer-CTC <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/models.html#conformer-ctc>`_ and `Conformer-Transducer <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/models.html#conformer-transducer>`_, with around 120 million parameters.
Our goal was to train two ASR models with different architectures: :ref:`Conformer-CTC <Conformer-CTC_model>` and :ref:`Conformer-Transducer <Conformer-Transducer_model>`, with around 120 million parameters.
The CTC model predicts output tokens for each timestep. The outputs are assumed to be independent of each other. As a result the CTC models work faster but they can produce outputs that are inconsistent with each other. CTC models are often combined with external language models in production. In contrast, the Transducer models contain the decoding part which generates the output tokens one by one and the next token prediction depends on this history. Due to autoregressive nature of decoding the inference speed is several times slower than that of CTC models, but the quality is usually better because it can incorporate language model information within the same model.

Training scripts and configs
@@ -604,7 +604,7 @@ Error analysis

Still, even WER of 16% is not as good as we usually get for other languages trained with NeMo toolkit, so we may want to look at the errors that the model makes to better understand what's the problem.

We can use `Speech Data Explorer <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/tools/speech_data_explorer.html>`_ to analyze the errors.
We can use :doc:`Speech Data Explorer <../../tools/speech_data_explorer>` to analyze the errors.

If we run

8 changes: 4 additions & 4 deletions docs/source/asr/intro.rst
@@ -103,7 +103,7 @@ After :ref:`training <train-ngram-lm>` an N-gram LM, you can use it for transcri
decoding_mode=beamsearch_ngram \
decoding_strategy="<Beam library such as beam, pyctcdecode or flashlight>"
See more information about LM decoding :doc:`here <./asr_language_modeling>`.
See more information about LM decoding :doc:`here <./asr_language_modeling_and_customization>`.

Use real-time transcription
---------------------------
@@ -179,16 +179,16 @@ Preparing ASR datasets
NeMo includes preprocessing scripts for several common ASR datasets. The :doc:`Datasets <./datasets>` section contains instructions on
running those scripts. It also includes guidance for creating your own NeMo-compatible dataset, if you have your own data.

Further information
-------------------
NeMo ASR Documentation
----------------------
For more information, see additional sections in the ASR docs on the left-hand-side menu or in the list below:

.. toctree::
:maxdepth: 1

models
datasets
asr_language_modeling
asr_language_modeling_and_customization
results
scores
configs
2 changes: 1 addition & 1 deletion docs/source/asr/models.rst
@@ -24,7 +24,7 @@ Canary-1B is the latest ASR model from NVIDIA NeMo. It sits at the top of the `H

You can `download the checkpoint <https://huggingface.co/nvidia/canary-1b>`__ or try out Canary in action in this `HuggingFace Space <https://huggingface.co/spaces/nvidia/canary-1b>`__.

Canary-1B is an encoder-decoder model with a :ref:`FastConformer Encoder <Fast-Conformer>` and Transformer Decoder :cite:`asr-models-vaswani2017attention`.
Canary-1B is an encoder-decoder model with a :ref:`FastConformer Encoder <Fast-Conformer>` and Transformer Decoder :cite:`asr-models-vaswani2017aayn`.

It is a multi-lingual, multi-task model, supporting automatic speech-to-text recognition (ASR) in 4 languages (English, German, French, Spanish) as well as translation between English and the 3 other supported languages.

6 changes: 3 additions & 3 deletions docs/source/core/core.rst
@@ -174,9 +174,9 @@ via PyTorch Lightning `hooks <https://pytorch-lightning.readthedocs.io/en/stable

For more domain-specific information, see:

- :ref:`Automatic Speech Recognition (ASR) <../asr/intro>`
- :ref:`Natural Language Processing (NLP) <../nlp/models>`
- :ref:`Text-to-Speech Synthesis (TTS) <../tts/intro>`
- :doc:`Automatic Speech Recognition (ASR) <../asr/intro>`
- :doc:`Natural Language Processing (NLP) <../nlp/models>`
- :doc:`Text-to-Speech Synthesis (TTS) <../tts/intro>`

PyTorch Lightning Trainer
~~~~~~~~~~~~~~~~~~~~~~~~~
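Note on the ``core.rst`` change: ``:ref:`` resolves a label defined somewhere in the docs, while ``:doc:`` resolves a document path, so a ``:ref:`` whose target is a relative path such as ``../asr/intro`` does not link. Below is a rough sketch of how such cases could be spotted, assuming that labels in the project never contain a ``/``. That assumption is a heuristic, and ``find_path_like_refs`` is a hypothetical helper, not a Sphinx API.

```python
import re

# Matches :ref:`label` and :ref:`Title <label>`; the target lands in one of
# the two named groups depending on which form was used.
REF_ROLE = re.compile(
    r":ref:`(?:[^`<>]*<(?P<explicit>[^`<>]+)>|(?P<bare>[^`<>]+))`"
)


def find_path_like_refs(text: str) -> list[str]:
    """Return :ref: targets that look like document paths (contain '/')."""
    targets = []
    for match in REF_ROLE.finditer(text):
        target = (match.group("explicit") or match.group("bare")).strip()
        if "/" in target:
            targets.append(target)
    return targets


sample = (
    "- :ref:`Automatic Speech Recognition (ASR) <../asr/intro>`\n"
    "- :ref:`training <train-ngram-lm>`\n"
    "- :doc:`Text-to-Speech Synthesis (TTS) <../tts/intro>`\n"
)
print(find_path_like_refs(sample))
```

Only the first bullet in ``sample`` is flagged: the second uses a real label, and the third already uses ``:doc:``.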