Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds explicit requirements to ensure correct dependencies are installed #9890

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion requirements/requirements.txt
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
fiddle
huggingface_hub>=0.24
numba
numpy>=1.22
numpy>=1.22,<2.0
onnx>=1.7.0
python-dateutil
ruamel.yaml
Expand Down
1 change: 1 addition & 0 deletions requirements/requirements_asr.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ braceexpand
editdistance
einops
g2p_en
IPython
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert this one. IPython is optional.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is. I recently got this error on main after a clean install:

Traceback (most recent call last):
  File "/home/dgalvez/scratch/code/asr/open_asr_leaderboard/nemo_asr/run_eval.py", line 12, in <module>
    from nemo.collections.asr.models import ASRModel
  File "/home/dgalvez/scratch/miniconda3/envs/open_asr_leaderboard/lib/python3.10/site-packages/nemo/collections/asr/__init__.py", line 15, in <module>
    from nemo.collections.asr import data, losses, models, modules
  File "/home/dgalvez/scratch/miniconda3/envs/open_asr_leaderboard/lib/python3.10/site-packages/nemo/collections/asr/models/__init__.py", line 22, in <module>
    from nemo.collections.asr.models.clustering_diarizer import ClusteringDiarizer
  File "/home/dgalvez/scratch/miniconda3/envs/open_asr_leaderboard/lib/python3.10/site-packages/nemo/collections/asr/models/clustering_diarizer.py", line 43, in <module>
    from nemo.collections.asr.parts.utils.vad_utils import (
  File "/home/dgalvez/scratch/miniconda3/envs/open_asr_leaderboard/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/vad_utils.py", line 26, in <module>
    import IPython.display as ipd
ModuleNotFoundError: No module named 'IPython'

We are unconditionally importing IPython in vad_utils.py. IPython is not intended as a required dependency, but in practice it currently is.

Copy link
Author

@richardkmichael richardkmichael Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's right. I did consider leaving out IPython, assuming it was intended to be optional. However, I wanted to point this out and discuss it, so I left it in.

$ python -c 'from nemo.collections.asr.models.clustering_diarizer import ClusteringDiarizer'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/code/nemo/t/lib/python3.10/site-packages/nemo/collections/asr/__init__.py", line 15, in <module>
    from nemo.collections.asr import data, losses, models, modules
  File "/code/nemo/t/lib/python3.10/site-packages/nemo/collections/asr/models/__init__.py", line 22, in <module>
    from nemo.collections.asr.models.clustering_diarizer import ClusteringDiarizer
  File "/code/nemo/t/lib/python3.10/site-packages/nemo/collections/asr/models/clustering_diarizer.py", line 43, in <module>
    from nemo.collections.asr.parts.utils.vad_utils import (
  File "/code/nemo/t/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/vad_utils.py", line 985, in <module>
    ) -> ipd.Audio:
NameError: name 'ipd' is not defined. Did you mean: 'pd'?

In vad_utils.py, there are two plot-related functions which make use of IPython, as return values and in one case, as a declared return type in the function signature. plot() and plot_sample_from_rttm().

I took a quick look..

Those functions not used in many locations. plot_sample_from_rttm() only once. plot() is a bit harder to track down since its name matches plotlib.plot().

$ grep -I -l -r -F plot_sample_from_rttm .
./nemo/collections/asr/parts/utils/vad_utils.py
./examples/asr/speech_classification/README.md
./scripts/speaker_tasks/multispeaker_data_analysis.py

I have never created a Python package, nor dealt with organizing conditional imports.

I wondered about moving those two functions to a new file, vad_utils_visualization.py or ..._plotting.py, and changing the usage sites. This could remove the import in vad_utils.py.

I also wondered about importing inside the functions. That would mean removing the ipd use in the return type signature, and it will still blow up, since IPython isn't a stated requirement for pip.

But, users still need to be able to install IPython if they want to run notebooks or scripts, so it must be declared somewhere for pip. Would a new requirements group solve this? e.g., pip install nemo_toolkit[extras]?

If you offer some guidance, I could make the changes on this PR. Or, I can drop the commit for now, to merge and deal with it in another PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ericharper I took out IPython, so this can go into rc1 if you like. Let me know about reorganizing the code, and I can take a stab in another PR.

jiwer
kaldi-python-io
kaldiio
Expand Down
2 changes: 1 addition & 1 deletion tutorials/00_NeMo_Primer.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -1249,7 +1249,7 @@
"\n",
"Tutorials are meant to be more in-depth explanation of the workflow in the discussed task - usually involving a small amount of data to train a small model on a task, along with some explanation of the task itself.\n",
"\n",
"While the tutorials are a great example of the simplicity of NeMo, please note for the best performance when training on real datasets, we advice the use of the example scripts instead of the tutorial notebooks. \n",
"While the tutorials are a great example of the simplicity of NeMo, please note for the best performance when training on real datasets, we advise the use of the example scripts instead of the tutorial notebooks. \n",
"\n",
"NeMo Tutorials directory can be found here - https://github.com/NVIDIA/NeMo/tree/main/tutorials"
]
Expand Down
2 changes: 1 addition & 1 deletion tutorials/01_NeMo_Models.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -974,7 +974,7 @@
"-------\n",
"**Remember**:\n",
"\n",
" - The NeMo equivalent of `torch.nn.Module` is the `NeuralModule.\n",
" - The NeMo equivalent of `torch.nn.Module` is the `NeuralModule`.\n",
" - The NeMo equivalent of the `LightningModule` is `ModelPT`.\n"
]
},
Expand Down
Loading