Various docs fixes: typos, changing urls to relative links (#8685)

* fix typos
  Signed-off-by: Elena Rastorgueva <[email protected]>
* rename Further information to NeMo ASR Documentation
  Signed-off-by: Elena Rastorgueva <[email protected]>
* fix malformed tables in asr lm docs
  Signed-off-by: Elena Rastorgueva <[email protected]>
* fix some :doc: links that weren't working
  Signed-off-by: Elena Rastorgueva <[email protected]>
* change doc urls in docs to relative links using :doc: or :ref:
  Signed-off-by: Elena Rastorgueva <[email protected]>
* change AAYN asr bib key so its not same as nlp bib
  Signed-off-by: Elena Rastorgueva <[email protected]>

---------

Signed-off-by: Elena Rastorgueva <[email protected]>
NVIDIA · Mar 17, 2024 · 6cbaf37
1 parent 86e331c
Showing 17 changed files with 68 additions and 68 deletions.
diff --git a/docs/source/asr/api.rst b/docs/source/asr/api.rst
@@ -1,4 +1,4 @@
-NeMo ASR collection API
+NeMo ASR Collection API
 =======================
 
 

diff --git a/docs/source/asr/asr_all.bib b/docs/source/asr/asr_all.bib
@@ -1034,7 +1034,7 @@ @misc{park2022multi
     copyright = {Creative Commons Attribution 4.0 International}
 }
 
-@inproceedings{vaswani2017attention,
+@inproceedings{vaswani2017aayn,
   title={Attention is all you need},
   author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
   booktitle={Advances in Neural Information Processing Systems},

diff --git a/docs/source/asr/asr_language_modeling_and_customization.rst b/docs/source/asr/asr_language_modeling_and_customization.rst
@@ -76,27 +76,27 @@ it is stored at the path specified by `kenlm_model_file`.
 
 The following is the list of the arguments for the training script:
 
-+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
-| **Argument**     | **Type** | **Default** | **Description**                                                                                                                |
-+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
-| nemo_model_file  | str      | Required    | The path to `.nemo` file of the ASR model, or name of a pretrained NeMo model to extract a tokenizer.                          |
-+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
-| train_paths      | List[str] | Required    | List of training files or folders. Files can be a plain text file or ".json" manifest or ".json.gz".                          |
-+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
-| kenlm_model_file | str      | Required    | The path to store the KenLM binary model file.                                                                                 |
-+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
-| kenlm_bin_path   | str      | Required    | The path to the bin folder of KenLM. It is a folder named `bin` under where KenLM is installed.                                |
-+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
-| ngram_length**   | int      | Required    | Specifies order of N-gram LM.                                                                                                  |
-+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
-| ngram_prune      | List[int] | [0]        | List of thresholds to prune N-grams. Example: [0,0,1]. See Pruning section on the https://kheafield.com/code/kenlm/estimation  |
-+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
-| cache_path       | str      | ""          | Cache path to save tokenized files.                                                                                            |
-+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
-| preserve_arpa    | bool     | ``False``   | Whether to preserve the intermediate ARPA file after construction of the BIN file.                                             |
-+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
-| verbose          | int      | 1           | Verbose level.                                                                                                                 |
-+------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
++------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
+| **Argument**     | **Type**  | **Default** | **Description**                                                                                                                |
++------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
+| nemo_model_file  | str       | Required    | The path to `.nemo` file of the ASR model, or name of a pretrained NeMo model to extract a tokenizer.                          |
++------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
+| train_paths      | List[str] | Required    | List of training files or folders. Files can be a plain text file or ".json" manifest or ".json.gz".                           |
++------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
+| kenlm_model_file | str       | Required    | The path to store the KenLM binary model file.                                                                                 |
++------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
+| kenlm_bin_path   | str       | Required    | The path to the bin folder of KenLM. It is a folder named `bin` under where KenLM is installed.                                |
++------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
+| ngram_length**   | int       | Required    | Specifies order of N-gram LM.                                                                                                  |
++------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
+| ngram_prune      | List[int] | [0]         | List of thresholds to prune N-grams. Example: [0,0,1]. See Pruning section on the https://kheafield.com/code/kenlm/estimation  |
++------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
+| cache_path       | str       | ``""``      | Cache path to save tokenized files.                                                                                            |
++------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
+| preserve_arpa    | bool      | ``False``   | Whether to preserve the intermediate ARPA file after construction of the BIN file.                                             |
++------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
+| verbose          | int       | 1           | Verbose level.                                                                                                                 |
++------------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------+
 
 ** Note: Recommend to use 6 as the order of the N-gram model for BPE-based models. Higher orders may need the re-compilation of KenLM to support it.
 
@@ -184,7 +184,7 @@ The following is the list of the important arguments for the evaluation script:
 +--------------------------------------+----------+------------------+-------------------------------------------------------------------------+
 | text_processing.do_lowercase         | bool     | ``False``        | Whether to make the training text all lower case.                       |
 +--------------------------------------+----------+------------------+-------------------------------------------------------------------------+
-| text_processing.punctuation_marks    | str      | ""               | String with punctuation marks to process. Example: ".\,?"               |
+| text_processing.punctuation_marks    | str      | ``""``           | String with punctuation marks to process. Example: ".\,?"               |
 +--------------------------------------+----------+------------------+-------------------------------------------------------------------------+
 | text_processing.rm_punctuation       |  bool    | ``False``        | Whether to remove punctuation marks from text.                          |
 +--------------------------------------+----------+------------------+-------------------------------------------------------------------------+
@@ -527,25 +527,25 @@ The following is the list of the arguments for the opengrm script:
 | kenlm_bin_path       | str    | Required         | The path to the bin folder of KenLM library. It is a folder named `bin` under where KenLM is installed.         |
 +----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
 | ngram_bin_path       | str    | Required         | The path to the bin folder of OpenGrm Ngram. It is a folder named `bin` under where OpenGrm Ngram is installed. |
-+----------------------+--------+------------------+-------------------------------------------------------------------------+
-| arpa_a               | str    | Required         | Path to the ARPA N-gram model file A                                    |
-+----------------------+--------+------------------+-------------------------------------------------------------------------+
-| alpha                | float  | Required         | Weight of N-gram model A                                                |
-+----------------------+--------+------------------+-------------------------------------------------------------------------+
-| arpa_b               | int    | Required         | Path to the ARPA N-gram model file B                                    |
-+----------------------+--------+------------------+-------------------------------------------------------------------------+
-| beta                 | float  | Required         | Weight of N-gram model B                                                |
-+----------------------+--------+------------------+-------------------------------------------------------------------------+
-| out_path             | str    | Required         | Path for writing temporary and resulting files.                         |
-+----------------------+--------+------------------+-------------------------------------------------------------------------+
-| test_file            | str    | None             | Path to test file to count perplexity if provided.                      |
-+----------------------+--------+------------------+-------------------------------------------------------------------------+
-| symbols              | str    | None             | Path to symbols (.syms) file. Could be calculated if it is not provided.|
-+----------------------+--------+------------------+-------------------------------------------------------------------------+
-| nemo_model_file      | str    | None             | The path to '.nemo' file of the ASR model, or name of a pretrained NeMo model.  |
-+----------------------+--------+------------------+-------------------------------------------------------------------------+
-| force                | bool   | ``False``        | Whether to recompile and rewrite all files                              |
-+----------------------+--------+------------------+-------------------------------------------------------------------------+
++----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
+| arpa_a               | str    | Required         | Path to the ARPA N-gram model file A                                                                            |
++----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
+| alpha                | float  | Required         | Weight of N-gram model A                                                                                        |
++----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
+| arpa_b               | int    | Required         | Path to the ARPA N-gram model file B                                                                            |
++----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
+| beta                 | float  | Required         | Weight of N-gram model B                                                                                        |
++----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
+| out_path             | str    | Required         | Path for writing temporary and resulting files.                                                                 |
++----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
+| test_file            | str    | None             | Path to test file to count perplexity if provided.                                                              |
++----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
+| symbols              | str    | None             | Path to symbols (.syms) file. Could be calculated if it is not provided.                                        |
++----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
+| nemo_model_file      | str    | None             | The path to '.nemo' file of the ASR model, or name of a pretrained NeMo model.                                  |
++----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
+| force                | bool   | ``False``        | Whether to recompile and rewrite all files                                                                      |
++----------------------+--------+------------------+-----------------------------------------------------------------------------------------------------------------+
 
 
 ******************

diff --git a/docs/source/asr/examples/kinyarwanda_asr.rst b/docs/source/asr/examples/kinyarwanda_asr.rst
@@ -429,7 +429,7 @@ Training from scratch and finetuning
 ASR models
 ##########
 
-Our goal was to train two ASR models with different architectures: `Conformer-CTC <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/models.html#conformer-ctc>`_ and `Conformer-Transducer <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/models.html#conformer-transducer>`_, with around 120 million parameters.
+Our goal was to train two ASR models with different architectures: :ref:`Conformer-CTC <Conformer-CTC_model>` and :ref:`Conformer-Transducer <Conformer-Transducer_model>`, with around 120 million parameters.
 The CTC model predicts output tokens for each timestep. The outputs are assumed to be independent of each other. As a result the CTC models work faster but they can produce outputs that are inconsistent with each other. CTC models are often combined with external language models in production. In contrast, the Transducer models contain the decoding part which generates the output tokens one by one and the next token prediction depends on this history. Due to autoregressive nature of decoding the inference speed is several times slower than that of CTC models, but the quality is usually better because it can incorporate language model information within the same model.
 
 Training scripts and configs
@@ -604,7 +604,7 @@ Error analysis
 
 Still, even WER of 16% is not as good as we usually get for other languages trained with NeMo toolkit, so we may want to look at the errors that the model makes to better understand what's the problem.
 
-We can use `Speech Data Explorer <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/tools/speech_data_explorer.html>`_ to analyze the errors.
+We can use :doc:`Speech Data Explorer <../../tools/speech_data_explorer>` to analyze the errors.
 
 If we run
 

diff --git a/docs/source/asr/intro.rst b/docs/source/asr/intro.rst
@@ -103,7 +103,7 @@ After :ref:`training <train-ngram-lm>` an N-gram LM, you can use it for transcri
         decoding_mode=beamsearch_ngram \
         decoding_strategy="<Beam library such as beam, pyctcdecode or flashlight>"
 
-See more information about LM decoding :doc:`here <./asr_language_modeling>`.
+See more information about LM decoding :doc:`here <./asr_language_modeling_and_customization>`.
 
 Use real-time transcription
 ---------------------------
@@ -179,16 +179,16 @@ Preparing ASR datasets
 NeMo includes preprocessing scripts for several common ASR datasets. The :doc:`Datasets <./datasets>` section contains instructions on
 running those scripts. It also includes guidance for creating your own NeMo-compatible dataset, if you have your own data.
 
-Further information
--------------------
+NeMo ASR Documentation
+----------------------
 For more information, see additional sections in the ASR docs on the left-hand-side menu or in the list below:
 
 .. toctree::
    :maxdepth: 1
 
    models
    datasets
-   asr_language_modeling
+   asr_language_modeling_and_customization
    results
    scores
    configs

diff --git a/docs/source/asr/models.rst b/docs/source/asr/models.rst
@@ -24,7 +24,7 @@ Canary-1B is the latest ASR model from NVIDIA NeMo. It sits at the top of the `H
 
 You can `download the checkpoint <https://huggingface.co/nvidia/canary-1b>`__  or try out Canary in action in this `HuggingFace Space <https://huggingface.co/spaces/nvidia/canary-1b>`__.
 
-Canary-1B is an encoder-decoder model with a :ref:`FastConformer Encoder <Fast-Conformer>` and Transformer Decoder :cite:`asr-models-vaswani2017attention`.
+Canary-1B is an encoder-decoder model with a :ref:`FastConformer Encoder <Fast-Conformer>` and Transformer Decoder :cite:`asr-models-vaswani2017aayn`.
 
 It is a multi-lingual, multi-task model, supporting automatic speech-to-text recognition (ASR) in 4 languages (English, German, French, Spanish) as well as translation between English and the 3 other supported languages.
 

diff --git a/docs/source/core/core.rst b/docs/source/core/core.rst
@@ -174,9 +174,9 @@ via PyTorch Lightning `hooks <https://pytorch-lightning.readthedocs.io/en/stable
 
 For more domain-specific information, see:
 
-- :ref:`Automatic Speech Recognition (ASR) <../asr/intro>`
-- :ref:`Natural Language Processing (NLP) <../nlp/models>`
-- :ref:`Text-to-Speech Synthesis (TTS) <../tts/intro>`
+- :doc:`Automatic Speech Recognition (ASR) <../asr/intro>`
+- :doc:`Natural Language Processing (NLP) <../nlp/models>`
+- :doc:`Text-to-Speech Synthesis (TTS) <../tts/intro>`
 
 PyTorch Lightning Trainer
 ~~~~~~~~~~~~~~~~~~~~~~~~~