
Performance on Streaming Example #68

Open
ehossai2 opened this issue Jan 4, 2024 · 7 comments
Labels
enhancement New feature or request

Comments


ehossai2 commented Jan 4, 2024

You did great work, thanks for this. I want to share one experience with the streaming version of the system. The transcription quality seems poor; I think it is due to not using VAD. I might be wrong, but just sharing.

I was running it on CPU, do you think that could be the reason?

I tried the following models:

  1. ggml-tiny: fast, but transcription quality is really poor.
  2. ggml-small.en: fast enough, though quality is still bad.
  3. ggml-base and ggml-medium: really slow (expected, since I am running on CPU), but also not very accurate. That is why I felt using VAD might improve the results.

thanks.


Macoron commented Jan 5, 2024

If I understand you right, by performance you mean transcription quality.

> I want to share one experience with the Streaming version of the system. The performance seems poor, I think it is due not using VAD.

Yes, Whisper tends to severely hallucinate on silent segments, especially when prompted by the previous transcription. VAD helps skip silent segments and improves overall quality. I highly recommend using streaming with VAD.

Try to enable VAD and see if it improves transcription quality.
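To illustrate why VAD helps: the idea is to gate out silent chunks before they ever reach Whisper, so the model is never asked to transcribe silence. A minimal sketch of an energy-based gate in Python (this is purely illustrative — function names and thresholds are hypothetical, not whisper.unity's actual VAD implementation):

```python
import numpy as np

def is_speech(frame, threshold=0.01):
    # Root-mean-square energy of a mono float frame with samples in [-1, 1].
    # Frames whose energy exceeds the threshold are assumed to contain speech.
    rms = np.sqrt(np.mean(np.square(frame, dtype=np.float64)))
    return rms > threshold

def drop_silence(audio, frame_len=1600, threshold=0.01):
    # Split into 100 ms frames (at 16 kHz) and keep only the voiced ones,
    # so silent stretches never reach the transcription model.
    frames = [audio[i:i + frame_len] for i in range(0, len(audio), frame_len)]
    voiced = [f for f in frames if is_speech(f, threshold)]
    return np.concatenate(voiced) if voiced else np.empty(0, dtype=audio.dtype)
```

Real VAD implementations (e.g. Silero, WebRTC VAD) are more robust than a plain energy gate, but the effect on Whisper is the same: fewer silent windows, fewer hallucinated segments.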

@Macoron Macoron added the enhancement New feature or request label Jan 5, 2024
i-s-t-e-m-i commented Jul 14, 2024

> I was running it on CPU, do you think that could be the reason?

> as I am running it on CPU, that is why it is slow, but it is also not so accurate.

A noob question: How do you choose between using the CPU or the GPU? I mean is there a flag for this somewhere? Is the "Enable CUDA" option in the Project Settings what enables/disables the use of the GPU?


Macoron commented Jul 14, 2024

> > I was running it on CPU, do you think that could be the reason?

> > as I am running it on CPU, that is why it is slow, but it is also not so accurate.

> A noob question: How do you choose between using the CPU or the GPU? I mean is there a flag for this somewhere? Is the "Enable CUDA" option in the Project Settings what enables/disables the use of the GPU?

Yes, there is a flag in Project Settings. Check the readme for more info.

@i-s-t-e-m-i

Thank you Macoron, I’ve already checked the readme and tried the “Enable CUDA” option. I suppose this is the flag you mentioned. 

So I’ll assume checking “Enable CUDA” makes it run on the GPU, while unchecking makes it run on the CPU. 

When I run with CUDA enabled, though, the app crashes at the inference stage. Do you think this might be due to my GTX 960M?


Macoron commented Jul 16, 2024

> Thank you Macoron, I’ve already checked the readme and tried the “Enable CUDA” option. I suppose this is the flag you mentioned.

> So I’ll assume checking “Enable CUDA” makes it run on the GPU, while unchecking makes it run on the CPU.

> When I run with CUDA enabled though, the app crashes at the inference stage. Do you think this might be due to my GTX960M?

Hard to say for sure. What version of the CUDA Toolkit do you have installed? Do the original whisper.cpp builds work on your PC?

@i-s-t-e-m-i

I have CUDA Toolkit version 12.2:

    Cuda compilation tools, release 12.2, V12.2.91
    Build cuda_12.2.r12.2

It is not exactly "12.2.0", but this is what gets installed through the link for 12.2.0.

I tried the original whisper.cpp build now. It works OK with the default settings, but when I set GGML_CUDA to 1, I get errors.

The error is "A single input file is required for a non-link phase when an output file is specified".

This error has been discussed in one of the issues, and removing a few lines from CMakeLists.txt was suggested. However, I don't see those lines in my CMakeLists.txt file. There are several of these CMakeLists.txt files in different folders, though.
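For reference, a CUDA-enabled build of a recent whisper.cpp checkout is typically configured like this (a sketch, assuming the current CMake-based build; older releases used the `WHISPER_CUBLAS` option instead of `GGML_CUDA`):

```shell
# Configure whisper.cpp with CUDA support, then build in Release mode.
cmake -B build -DGGML_CUDA=1
cmake --build build --config Release -j
```

If configuration succeeds but compilation of the CUDA sources fails, the installed CUDA Toolkit version and the GPU's compute capability are the usual suspects.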


Macoron commented Jul 17, 2024

> I tried the original whisper.cpp build now. It works OK with the default settings, but when I change the GGML_CUDA to 1, I get errors.
> The errors are "A single input file is required for a non-link phase when an outputfile is specified".

I see. Unfortunately, I have no idea what might be wrong. Maybe the GPU is indeed just too old and doesn't support instructions needed by whisper.cpp.
