Replies: 3 comments
-
$ time ../whisper -l en -fa -bs 2 -m ../models/ggml-small.en-q5_1.bin -osrt -of test.srt audiofile.wav
$ ffprobe audiofile.wav
$ time ../whisper -l en -fa -bs 2 -m ../models/ggml-large-v2.bin -osrt -of test.srt audiofile.wav 2>/dev/null
[00:00:00.000 --> 00:00:10.000] [BLANK_AUDIO]
$ time ../whisper -l en -m ../models/ggml-large-v2.bin -osrt -of test.srt audiofile.wav 2>/dev/null
[00:00:00.000 --> 00:00:10.000] [BLANK_AUDIO]
$ time ../whisper -m ../models/ggml-tiny.en-q5_1.bin -of test.txt audiofile.wav 2>/dev/null
[00:00:00.000 --> 00:00:09.080] [BLANK_AUDIO]
$ time ../whisper.may15 -m ../models/ggml-tiny.en-q5_1.bin -of test.txt audiofile.wav 2>/dev/null
[00:00:00.000 --> 00:00:02.580] (upbeat music)
-
With -ng it runs and I get a correct transcription. Maybe something is broken with my CUDA? Edit: Nope! 2024.05.15 compiles and runs fine.
-
git bisect run bash -c '
  git clean -fdx ;
  git submodule update --init --recursive ;
  WHISPER_CUDA=1 make -j $(nproc) ;
  time ./main -m /pr/Neural/Voice_Recognition_Whispr_GGML/temp/good-whisper.cpp/models/ggml-small.en-q5_1.bin -l en -bs 2 -of test.txt /pr/Neural/Voice_Recognition_Whispr_GGML/audiodump.wav'
1b51fdf is the first bad commit
CMakeLists.txt
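A side note on the bisect: `git bisect run` judges each commit purely by the exit status of the script (0 = good, 125 = skip, any other code from 1 to 127 = bad), so a run that exits cleanly but prints only [BLANK_AUDIO] would still count as good unless the script inspects the output. Below is a minimal sketch of such a check. The function name `check_transcript`, the idea of capturing stdout to a file first, and the choice to skip when nothing was produced are all assumptions for illustration, not part of the command used above.

```shell
# Hypothetical helper for a `git bisect run` script.
# Argument: a file holding the captured stdout of the transcription run.
# Exit status follows the git bisect run convention:
#   0 = good commit, 1 = bad commit, 125 = cannot test (skip this commit).
check_transcript() {
    # Nothing captured (e.g. the build failed): assumed untestable, so skip.
    [ -s "$1" ] || return 125

    # A transcript containing the blank-audio placeholder is treated as bad.
    if grep -q -F '[BLANK_AUDIO]' "$1"; then
        return 1
    fi
    return 0
}
```

In the bisect script, the last step would then redirect the transcription's stdout to a file and end with `check_transcript` on that file, so its status becomes the script's status.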
-
The default beam size has increased from 2 to 5, which has a major performance impact.
With -bs 2 (identical to the old default beam size), I get a good boost from flash attention:
2024.05.15 whisper.cpp, without -fa
real 1m6.259s
user 1m33.432s
sys 0m1.561s
2024.06.08, with -fa -bs 2
real 0m50.597s
user 0m57.888s
sys 0m1.684s
Pretty great! THANKS!
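For anyone comparing their own runs, the two wall-clock ("real") times above can be turned into a ratio. This is only a small sketch: the helper names `to_seconds` and `speedup` are made up for illustration, and it assumes the `XmY.YYYs` format that bash's `time` prints.

```shell
# Convert a bash `time` value such as 1m6.259s into seconds.
to_seconds() {
    printf '%s\n' "$1" |
        awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of two wall-clock times: how many times faster the second run is.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

# 2024.05.15 without -fa  vs  2024.06.08 with -fa -bs 2
speedup 1m6.259s 0m50.597s   # prints 1.31
```

So the -fa -bs 2 run is roughly 1.3 times faster in wall-clock terms (66.259 s vs 50.597 s).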
Further testing...
2024.06.08, With -fa -bs 5
real 2m21.298s
user 2m50.949s
sys 0m4.076s
2024.06.08, -bs 2 without -fa
real 2m5.511s
user 2m41.488s
sys 0m4.112s
no way? Rerun...
real 2m4.830s
user 2m40.821s
sys 0m3.799s
Well, that's -bs 2. Why is it so much slower?
(Linux, CUDA 11, RTX 3090 temp-limited to 75°C, /models/ggml-small.en-tdrz.bin)
Looking at the output of 2024.06.08, I get only one word, repeated in every segment:
1
[00:00:00.000 --> 00:00:02.060] you
2
[00:00:02.060 --> 00:00:04.120] you
3
[00:00:04.120 --> 00:00:06.180] you
4
[00:00:06.180 --> 00:00:08.240] you
5
[00:00:08.240 --> 00:00:10.320] you
6
[00:00:10.320 --> 00:00:12.380] you
7
Back to may15, I guess.
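A degenerate transcript like the one above, where every segment carries the same text, is easy to flag automatically. The sketch below counts the distinct segment texts in a file of `[start --> end] text` lines. The function name `distinct_cue_texts` is hypothetical, and it assumes the bracketed timestamp format shown in this thread.

```shell
# Count how many different texts appear across timestamped segments.
# Argument: a file with lines like
#   [00:00:00.000 --> 00:00:02.060]   you
# Lines without a leading timestamp (e.g. SRT index numbers) are ignored.
distinct_cue_texts() {
    awk '
        /^\[[0-9:.]+ --> [0-9:.]+\]/ {
            sub(/^\[[^]]*\][ \t]*/, "")   # strip the timestamp and padding
            seen[$0] = 1                  # remember each distinct text once
        }
        END {
            n = 0
            for (t in seen) n++
            print n
        }
    ' "$1"
}
```

A result of 1 on a long recording, as with the repeated "you" output here, is a strong hint that decoding went wrong rather than that the audio really contains a single word.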