Feature Request - Voices addition. #703

Open
michieal opened this issue Dec 20, 2024 · 1 comment

Comments

@michieal

Why
The story behind this: I added my ElevenLabs API key because I wanted my AI chats to have a voice. While checking out the site and its features, I accidentally blew through all of my credits within an hour and got almost nothing for it: just 3 tests in which my AI chat read out the first paragraph. Then I remembered that KoboldAI has a plethora of free-to-use voices. I haven't looked at how those voices are added, but I remember that when I was trying it out, it was able to generate hours of TTS without issue.
I think that if it's possible to bring in their approach, it would be a great addition to BigAGI. Even if one has to download the voice packs to the host system, having a TTS option that doesn't run out of credits in 5 minutes would be awesome.

Description
I would like to see (if possible) additional voices that are not pay-as-you-go added to the system (see "Why").
UI changes: I picture an additional setup step for voice selection, and maybe a "download voice" item or pop-up.

Requirements
If you can, please break down the changes: use cases, UX, technology, architecture, etc.
I am not sure about this. I know that KoboldCpp is written in Visual C++, but it works on Linux too, so the calls might be ANSI C++. As for integration into TypeScript, I have no clue.

If I can help out in any way on this, let me know, and I will do my best.

@enricoros
Owner

Take a look at Pull Request #661. It uses the Web Speech API for free TTS; in some cases it works really well, and it is so fast that you hear the audio before the message shows up on screen :)
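For readers unfamiliar with the Web Speech API, here is a minimal sketch of browser-side TTS with it. This is not the code from the pull request; the helper names `splitIntoChunks` and `speakFree`, the 200-character chunk size, and the `en-US` default are assumptions made for illustration. Only `speechSynthesis.speak` and `SpeechSynthesisUtterance` come from the browser API itself.

```typescript
// Hypothetical helper: split text into sentence-aligned chunks of roughly
// maxLen characters, so speech can be queued piece by piece.
function splitIntoChunks(text: string, maxLen = 200): string[] {
  // Each match is one sentence plus its trailing punctuation and whitespace.
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) ?? [];
  const chunks: string[] = [];
  let current = '';
  for (const sentence of sentences) {
    // Start a new chunk when adding this sentence would exceed maxLen.
    if (current && (current + sentence).length > maxLen) {
      chunks.push(current.trim());
      current = '';
    }
    current += sentence;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

// Hypothetical helper: speak text with the browser's built-in voices.
// Returns false when the Web Speech API is unavailable (for example on a server).
function speakFree(text: string, lang = 'en-US'): boolean {
  const g = globalThis as any; // avoids requiring DOM typings in this sketch
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) return false;
  for (const chunk of splitIntoChunks(text)) {
    const utterance = new g.SpeechSynthesisUtterance(chunk);
    utterance.lang = lang;
    g.speechSynthesis.speak(utterance); // utterances are queued in order
  }
  return true;
}
```

Calling `speakFree(messageText)` from a click handler would read a message aloud without any API key or credits. Which voices are available, and how good they sound, depends on the browser and operating system.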

That's on hold only because very deep changes in the codebase would conflict with the pull request (and vice versa), but I see this being shipped once the final version 2 is out.
