What's voice prompting? #244

Open
AnonymousCoderArtist opened this issue Nov 11, 2024 · 13 comments
Labels
enhancement New feature or request

Comments

@AnonymousCoderArtist

## What's voice prompting?

I am willing to contribute and help with voice prompting and the UI, and I can improve your prompts if you want.

- Co-founder of HelpingAI and Webscout

@dustinwloring1988 dustinwloring1988 added the stale The pull / issue is stale and will be closed soon label Dec 2, 2024
@AnonymousCoderArtist
Author

Why did you mark this as completed without letting me know? This is not good. I just wanted to know about voice prompting.

@dustinwloring1988
Collaborator

@AnonymousCoderArtist Sorry, the issue was stale. Here are two pull requests to follow:
#281
#510

@thecodacus thecodacus reopened this Dec 3, 2024
@thecodacus
Collaborator

thecodacus commented Dec 3, 2024

I will close this issue once the feature is implemented.
@AnonymousCoderArtist You are welcome to contribute if you like, but please also check the related PRs in the queue in case they already cover what you are thinking of adding.

@thecodacus thecodacus added enhancement New feature or request and removed stale The pull / issue is stale and will be closed soon labels Dec 3, 2024
@AnonymousCoderArtist
Author

Thanks a lot for your time and support, @thecodacus.

@AnonymousCoderArtist
Author

I think we can add voice prompting, since the standard JavaScript speech recognition is quite fast and usable. I would gladly work on it. @thecodacus @dustinwloring1988 What is your opinion?

@thecodacus
Collaborator

Thanks. I am not sure how fast an external speech recognition library will be, as most of what this project does runs locally in the browser and very little happens server-side.
Most modern browsers already have a speech recognition API; if we can leverage that, it would be great.
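
A minimal sketch of how that browser API could be feature-detected before use. The helper names `getSpeechRecognitionCtor` and `createRecognizer` are hypothetical and not part of this project; only the `SpeechRecognition` / `webkitSpeechRecognition` constructors and the `lang` and `interimResults` properties come from the Web Speech API.

```javascript
// Hypothetical helper: resolve the recognition constructor from a
// window-like object, preferring the unprefixed name.
function getSpeechRecognitionCtor(globalObj) {
  if (!globalObj) return null;
  return globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
}

// Hypothetical helper: build a configured recognizer, or return null
// when the browser does not expose the API at all.
function createRecognizer(globalObj, lang = 'en-US') {
  const Ctor = getSpeechRecognitionCtor(globalObj);
  if (!Ctor) return null;
  const recognition = new Ctor();
  recognition.lang = lang;           // language of the expected speech
  recognition.interimResults = true; // emit partial results while speaking
  return recognition;
}
```

In a page this would be called as `createRecognizer(window)`; a `null` result means the UI should hide or disable the microphone button.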

@dustinwloring1988
Collaborator

@AnonymousCoderArtist I have used Whisper with transformers.js, and yes, it is near real time.
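
For reference, a rough sketch of what Whisper via transformers.js can look like. The package name `@xenova/transformers`, the model id `Xenova/whisper-tiny.en`, and the function name `transcribeWithWhisper` are assumptions chosen for illustration; the thread does not say which package version or model was used.

```javascript
// Assumption: package name and default model id are illustrative choices,
// not something specified in this thread.
async function transcribeWithWhisper(audioUrl, modelId = 'Xenova/whisper-tiny.en') {
  // Load the library lazily so nothing is fetched until it is needed.
  const { pipeline } = await import('@xenova/transformers');
  // The first call downloads and caches the model weights.
  const transcriber = await pipeline('automatic-speech-recognition', modelId);
  const output = await transcriber(audioUrl);
  return output.text;
}
```

Unlike the browser API, this runs the model on the device, so the first call pays for a model download and a warm-up.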

@AnonymousCoderArtist
Author

@dustinwloring1988 @thecodacus I am talking about the built-in speech recognition in JavaScript. It's quite fast and accurate.

@thecodacus
Collaborator

@dustinwloring1988 Oh nice, how heavy is it for the browser? Can we run it on mobile devices?

@AnonymousCoderArtist As far as I remember, there is no speech recognition built into JavaScript itself. I guess you are talking about the browser API.

@dustinwloring1988
Collaborator

@AnonymousCoderArtist Are you referring to the Android one?
@thecodacus It is lightweight and runs on mobile.
Their GitHub repo has some example projects; you can even get a small LLM running in JS on an edge device.
It should be 'easy' to bring in. (I use the transformers library all the time in Python, so I was excited to see the transformers.js version.)

@AnonymousCoderArtist
Author

Yeah, see @thecodacus @dustinwloring1988, here is what I am talking about.

const click_to_record = document.getElementById('click_to_record');
const convert_text = document.getElementById('convert_text');
const is_recording = document.getElementById('is_recording');
const confidence_id = document.getElementById('confidence_id');
const language_select = document.getElementById('language_select');

// Prefer the standard constructor and fall back to the webkit-prefixed one,
// instead of overwriting window.SpeechRecognition.
const SpeechRecognitionCtor = window.SpeechRecognition || window.webkitSpeechRecognition;

click_to_record.addEventListener('click', () => {
    if (!SpeechRecognitionCtor) {
        is_recording.textContent = 'Speech recognition is not supported in this browser';
        return;
    }

    const recognition = new SpeechRecognitionCtor();
    recognition.interimResults = true; // emit partial results while speaking
    recognition.lang = language_select.value;

    recognition.addEventListener('start', () => {
        is_recording.textContent = 'Recording: True';
    });

    recognition.addEventListener('end', () => {
        is_recording.textContent = 'Recording: False';
    });

    recognition.addEventListener('result', e => {
        // Take the top alternative of every result segment.
        const alternatives = Array.from(e.results).map(result => result[0]);

        const transcript = alternatives.map(alt => alt.transcript).join('');
        convert_text.textContent = transcript;
        console.log(transcript);

        // Confidence values are numbers, so separate them to keep them readable.
        const confidence = alternatives.map(alt => alt.confidence).join(', ');
        confidence_id.textContent = `Confidence: ${confidence}`;
        console.log(confidence);
    });

    recognition.start();
});

I am talking about this, the built-in webkitSpeechRecognition.
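
As a small illustration of the result handling in the snippet above, the transcript and confidence extraction can be pulled into a pure function that is easy to test outside the browser. The name `summarizeResults` is hypothetical; it only relies on each result being array-like, carrying an `isFinal` flag, and having its top alternative at index 0, as in the Web Speech API.

```javascript
// Hypothetical helper: reduce a list of recognition results to plain data.
function summarizeResults(results) {
  const list = Array.from(results);
  // Index 0 is the top alternative of each result segment.
  const alternatives = list.map(result => result[0]);
  return {
    transcript: alternatives.map(alt => alt.transcript).join(''),
    confidences: alternatives.map(alt => alt.confidence),
    // Final only when there is at least one result and none are interim.
    isFinal: list.length > 0 && list.every(result => result.isFinal === true),
  };
}
```

Inside a `result` listener this would be called as `summarizeResults(e.results)`, and the prompt box would only be submitted once `isFinal` is true.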

@thecodacus
Collaborator

Yes, that's the browser API. It's only available in browsers that expose the webkit-prefixed implementation, like Chrome.

@AnonymousCoderArtist
Author

Yes, it's fast and free, I think.
