-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds explicit requirements to ensure correct dependencies are installed #9890
Adds explicit requirements to ensure correct dependencies are installed #9890
Conversation
3eea472
to
f7085d9
Compare
@@ -2,6 +2,7 @@ braceexpand | |||
editdistance | |||
einops | |||
g2p_en | |||
IPython |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please revert this one. IPython is optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is. I recently got this error on main after a clean install:
Traceback (most recent call last):
File "/home/dgalvez/scratch/code/asr/open_asr_leaderboard/nemo_asr/run_eval.py", line 12, in <module>
from nemo.collections.asr.models import ASRModel
File "/home/dgalvez/scratch/miniconda3/envs/open_asr_leaderboard/lib/python3.10/site-packages/nemo/collections/asr/__init__.py", line 15, in <module>
from nemo.collections.asr import data, losses, models, modules
File "/home/dgalvez/scratch/miniconda3/envs/open_asr_leaderboard/lib/python3.10/site-packages/nemo/collections/asr/models/__init__.py", line 22, in <module>
from nemo.collections.asr.models.clustering_diarizer import ClusteringDiarizer
File "/home/dgalvez/scratch/miniconda3/envs/open_asr_leaderboard/lib/python3.10/site-packages/nemo/collections/asr/models/clustering_diarizer.py", line 43, in <module>
from nemo.collections.asr.parts.utils.vad_utils import (
File "/home/dgalvez/scratch/miniconda3/envs/open_asr_leaderboard/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/vad_utils.py", line 26, in <module>
import IPython.display as ipd
ModuleNotFoundError: No module named 'IPython'
We are unconditionally importing IPython in vad_utils.py. IPython is not intended as a required dependency, but in practice it currently is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's right. I did consider leaving out IPython, assuming it was intended to be optional. However, I wanted to point this out and discuss it, so I left it in.
$ python -c 'from nemo.collections.asr.models.clustering_diarizer import ClusteringDiarizer'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/code/nemo/t/lib/python3.10/site-packages/nemo/collections/asr/__init__.py", line 15, in <module>
from nemo.collections.asr import data, losses, models, modules
File "/code/nemo/t/lib/python3.10/site-packages/nemo/collections/asr/models/__init__.py", line 22, in <module>
from nemo.collections.asr.models.clustering_diarizer import ClusteringDiarizer
File "/code/nemo/t/lib/python3.10/site-packages/nemo/collections/asr/models/clustering_diarizer.py", line 43, in <module>
from nemo.collections.asr.parts.utils.vad_utils import (
File "/code/nemo/t/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/vad_utils.py", line 985, in <module>
) -> ipd.Audio:
NameError: name 'ipd' is not defined. Did you mean: 'pd'?
In vad_utils.py
, there are two plot-related functions which make use of IPython, as return values and in one case, as a declared return type in the function signature. plot()
and plot_sample_from_rttm()
.
I took a quick look..
Those functions not used in many locations. plot_sample_from_rttm()
only once. plot()
is a bit harder to track down since its name matches plotlib.plot()
.
$ grep -I -l -r -F plot_sample_from_rttm .
./nemo/collections/asr/parts/utils/vad_utils.py
./examples/asr/speech_classification/README.md
./scripts/speaker_tasks/multispeaker_data_analysis.py
I have never created a Python package, nor dealt with organizing conditional imports.
I wondered about moving those two functions to a new file, vad_utils_visualization.py
or ..._plotting.py
, and changing the usage sites. This could remove the import in vad_utils.py
.
I also wondered about importing inside the functions. That would mean removing the ipd
use in the return type signature, and it will still blow up, since IPython isn't a stated requirement for pip.
But, users still need to be able to install IPython if they want to run notebooks or scripts, so it must be declared somewhere for pip. Would a new requirements group solve this? e.g., pip install nemo_toolkit[extras]
?
If you offer some guidance, I could make the changes on this PR. Or, I can drop the commit for now, to merge and deal with it in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ericharper I took out IPython, so this can go into rc1 if you like. Let me know about reorganizing the code, and I can take a stab in another PR.
There was a another comment asking about the numpy < 2 requirement. It's due to the use of This occurs inside the ClusteringDiarizer:
It does not appear to be used in many locations:
I experimented, and I believe this could be changed to use Aside from NeMo code, NumPy 2 will at least need Tensorboard >= 2.18, for its update to NumPy 2 (done but not released). Thanks for the feedback on this PR! |
Signed-off-by: Richard Michael <[email protected]>
Signed-off-by: Richard Michael <[email protected]>
for more information, see https://pre-commit.ci
102c74c
to
946891d
Compare
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days. |
This PR was closed because it has been inactive for 7 days since being marked as stale. |
What does this PR do ?
Numpy 1.x is required, but 2 (latest) will be installed if it is not present. IPython is also required, but not listed as a requirement. Explicit requirements also make it possible to use
pip install nemo_toolkit
without the entire Anaconda distribution installed.This change is similar to #9805 , which upper-bounded
huggingface_hub
.Also include a few trivial typos in the tutorial notebooks.
Collection: [Note which collection this PR will affect]
Changelog
Usage
# Add a code snippet demonstrating how to use this
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information