[Bug] Input and parameter tensors are not at the same device. How to point the input tensor to cuda:2? #3985

Open
yiouyou opened this issue Sep 4, 2024 · 4 comments
Labels: bug (Something isn't working), wontfix (This will not be worked on but feel free to help.)

Comments

yiouyou commented Sep 4, 2024

Describe the bug

Code:

from TTS.api import TTS
tts = TTS(model_name="voice_conversion_models/multilingual/vctk/freevc24", progress_bar=False).to("cuda:2")
tts.voice_conversion_to_file(source_wav="_t1_source.wav", target_wav="_t1_target.wav", file_path="_t1.wav")

Error:

(tts) songz:~/TTS$ python _t1.py 
 > voice_conversion_models/multilingual/vctk/freevc24 is already downloaded.
 > Using model: freevc
 > Loading pretrained speaker encoder model ...
/home/songz/TTS/TTS/utils/io.py:51: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  return torch.load(f, map_location=map_location, **kwargs)
Loaded the voice encoder model on cuda in 0.75 seconds.
/home/songz/TTS/TTS/vc/modules/freevc/wavlm/__init__.py:26: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(output_path, map_location=torch.device(device))
/home/songz/TTS/TTS/utils/io.py:54: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  return torch.load(f, map_location=map_location, **kwargs)
Traceback (most recent call last):
  File "/home/songz/TTS/__t1.py", line 6, in <module>
    tts.voice_conversion_to_file(source_wav="_t1_source.wav", target_wav="_t1_target.wav", file_path="_t1.wav")
  File "/home/songz/TTS/TTS/api.py", line 377, in voice_conversion_to_file
    wav = self.voice_conversion(source_wav=source_wav, target_wav=target_wav)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/songz/TTS/TTS/api.py", line 358, in voice_conversion
    wav = self.voice_converter.voice_conversion(source_wav=source_wav, target_wav=target_wav)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/songz/TTS/TTS/utils/synthesizer.py", line 254, in voice_conversion
    output_wav = self.vc_model.voice_conversion(source_wav, target_wav)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/songz/miniconda3/envs/tts/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/songz/TTS/TTS/vc/models/freevc.py", line 522, in voice_conversion
    g_tgt = self.enc_spk_ex.embed_utterance(wav_tgt)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/songz/TTS/TTS/vc/modules/freevc/speaker_encoder/speaker_encoder.py", line 155, in embed_utterance
    partial_embeds = self(mels).cpu().numpy()
                     ^^^^^^^^^^
  File "/home/songz/miniconda3/envs/tts/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/songz/miniconda3/envs/tts/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/songz/TTS/TTS/vc/modules/freevc/speaker_encoder/speaker_encoder.py", line 60, in forward
    _, (hidden, _) = self.lstm(mels)
                     ^^^^^^^^^^^^^^^
  File "/home/songz/miniconda3/envs/tts/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/songz/miniconda3/envs/tts/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/songz/miniconda3/envs/tts/lib/python3.11/site-packages/torch/nn/modules/rnn.py", line 917, in forward
    result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Input and parameter tensors are not at the same device, found input tensor at cuda:0 and parameter tensor at cuda:2

Question:
How do I move the input tensor to cuda:2?
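
One possible workaround, not confirmed by the maintainers in this thread: the log line "Loaded the voice encoder model on cuda" suggests the speaker encoder creates its inputs on a bare "cuda" device, which resolves to cuda:0 by default, while `.to("cuda:2")` moves the parameters. A minimal sketch, assuming that reading is right, is to expose only physical GPU 2 to the process through `CUDA_VISIBLE_DEVICES`, so that every "cuda" reference lands on the same card. The helper names `pin_visible_gpu` and `run_conversion` are hypothetical; the model name and file arguments come from the report above.

```python
import os


def pin_visible_gpu(physical_index: int) -> str:
    """Expose one physical GPU to this process and return the device string to use.

    Assumption: this runs before CUDA is initialized in the process
    (safest: before importing torch or TTS). Afterwards the chosen card
    is the only visible one and is addressed as plain "cuda" (index 0).
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(physical_index)
    return "cuda"


def run_conversion(source_wav: str, target_wav: str, file_path: str,
                   physical_gpu: int = 2) -> str:
    """Hypothetical wrapper around the calls from the bug report."""
    device = pin_visible_gpu(physical_gpu)

    # Imported late so that torch only ever sees the restricted GPU list.
    from TTS.api import TTS

    tts = TTS(
        model_name="voice_conversion_models/multilingual/vctk/freevc24",
        progress_bar=False,
    ).to(device)
    tts.voice_conversion_to_file(
        source_wav=source_wav, target_wav=target_wav, file_path=file_path
    )
    return file_path
```

Calling `run_conversion("_t1_source.wav", "_t1_target.wav", "_t1.wav")` would then run the original script's steps with only the third A100 visible. This sidesteps the mismatch rather than fixing the device handling inside the speaker encoder, so it does not help when several GPUs must be used from one process.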

To Reproduce

python _t1.py

Expected behavior

No response

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [
            "NVIDIA A100-SXM4-40GB",
            "NVIDIA A100-SXM4-40GB",
            "NVIDIA A100-SXM4-40GB"
        ],
        "available": true,
        "version": "12.1"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.4.0+cu121",
        "TTS": "0.22.0",
        "numpy": "1.26.4"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.11.9",
        "version": "#29~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Apr  4 14:39:20 UTC 2"
    }
}

Additional context

No response

@yiouyou added the bug label Sep 4, 2024
@isatyamks

@yiouyou can you assign this issue to me?

@isatyamks

from TTS.api import TTS
tts = TTS(model_name="voice_conversion_models/multilingual/vctk/freevc24", progress_bar=False).to("cuda:2")
tts.voice_conversion_to_file(source_wav="_t1_source.wav", target_wav="_t1_target.wav", file_path="_t1.wav")

@yiouyou can you please share the file directory for this code block?

@andrea-mucci

I have a similar error:

# The first audio is generated with a cloned voice.
path = self.model.tts_to_file(text=text, speaker_wav=speaker_wav, language=language,
                              file_path=f"/tmp/output_{output_random}.wav")
# Take the audio generated by text-to-speech and force it to be converted toward speaker_wav:
# path is passed first (source_wav) and speaker_wav second (target_wav).
self.conversion.voice_conversion_to_file(path, speaker_wav, file_path=new_output_path)

stale bot commented Nov 10, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

stale bot added the wontfix label Nov 10, 2024