Adds Parameter use_enhanced and model to GoogleCloudSpeech #735

HideyoshiNakazone · 2024-02-15T19:56:48Z

Adds the parameters use_enhanced and model to the recognize_google_cloud method, giving users more configuration options and better results in specific cases.

HideyoshiNakazone · 2024-02-22T00:36:13Z

Hello @Uberi and @ftnext, I was wondering if it's possible for someone to review my merge request.

Thank you very much,
Vitor Hideyoshi.

HideyoshiNakazone · 2024-04-22T20:18:16Z

Hello @ftnext, is there any interest in this feature? It doesn't break any part of the GoogleCloudSpeech Python API, it only extends it. I'm already using this implementation at the company I work for, but would love to have this feature merged.
If there is anything blocking the merge, please tell me :)

Uberi · 2024-04-26T18:39:31Z

Hi @HideyoshiNakazone!

Looks good overall, but would it be possible to document these parameters in the docs for that function? If so, happy to merge this!


HideyoshiNakazone · 2024-04-26T19:40:41Z

@Uberi, thanks a lot! I added the parameters to the docstring of the method Recognizer.recognize_google_cloud and added them to the library reference file.
If there are any other places where you'd like me to add documentation, I'll be happy to :)

ftnext · 2024-04-29T15:16:53Z

reference/library-reference.rst

@@ -238,6 +238,10 @@ The recognition language is determined by ``language``, which is a BCP-47 langua

 If ``preferred_phrases`` is an iterable of phrase strings, those given phrases will be more likely to be recognized over similar-sounding alternatives. This is useful for things like keyword/command recognition or adding new phrases that aren't in Google's vocabulary. Note that the API imposes certain `restrictions on the list of phrase strings <https://cloud.google.com/speech/limits#content>`__.

+The ``use_enhanced`` is a boolean option that sets a flag with the same name on the Google Cloud Speech API, it will make the API uses the enhanced version of the model. More information can be found in the `Google Cloud Speech API documentation <https://cloud.google.com/speech-to-text/docs/enhanced-models>` __.


@HideyoshiNakazone Thanks! Would you mind removing the space before the trailing underscores?

-<https://cloud.google.com/speech-to-text/docs/enhanced-models>` __
+<https://cloud.google.com/speech-to-text/docs/enhanced-models>`__

ftnext · 2024-04-29T15:35:11Z

@HideyoshiNakazone Thank you very much for this pull request! I'm very sorry for the late response.
@Uberi Thanks for your comment!

In my opinion, it would be better to introduce arbitrary keyword arguments (a.k.a. **kwargs):
https://docs.python.org/3/tutorial/controlflow.html#keyword-arguments

Certainly, adding use_enhanced and model as arguments would implement this feature.
However, if more arguments need to be added in the future, the signature would have to change again each time (not easy to extend).

I think it would be preferable for Cloud Speech API-specific arguments to be specified as variadic keyword arguments.

def recognize_google_cloud(self, audio_data, credentials_json=None, language="en-US", preferred_phrases=None, show_all=False, **api_params):
    """
    If ``preferred_phrases`` is an iterable of phrase strings, ...

    api_params: Cloud Speech API-specific parameters as dict (optional)

        The ``use_enhanced`` is a boolean option ...

        Furthermore, you can use the option ``model`` to set your desired model,

    Returns the most likely transcription if ``show_all`` is False (the default).
    """

    config = {
        'encoding': speech.RecognitionConfig.AudioEncoding.FLAC,
        'sample_rate_hertz': audio_data.sample_rate,
        'language_code': language,
        **api_params,
    }

(It seems that preferred_phrases could be included in api_params too, but that is a separate issue.)
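To make the merging behavior of the proposal above concrete, here is a minimal, self-contained sketch. It is not the library's implementation: `build_recognition_config` is a hypothetical helper, and the string "FLAC" stands in for `speech.RecognitionConfig.AudioEncoding.FLAC` so the snippet runs without the Google client installed.

```python
def build_recognition_config(sample_rate, language="en-US", **api_params):
    """Hypothetical helper: merge fixed settings with API-specific keyword arguments."""
    return {
        # Placeholder for speech.RecognitionConfig.AudioEncoding.FLAC (assumption).
        "encoding": "FLAC",
        "sample_rate_hertz": sample_rate,
        "language_code": language,
        # Unpacked last, so any extra Cloud Speech options are added to the config
        # (and a caller-supplied key with the same name would override a default).
        **api_params,
    }


# New options need no signature change; they simply flow through.
config = build_recognition_config(16000, use_enhanced=True, model="phone_call")
```

With this shape, supporting another Cloud Speech option later would not require touching the method signature, which is the extensibility concern raised above.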


Adds Parameter use_enhanced and model to GoogleCloudSpeech

c845904

Adds the parameters use_enhanced and model to the recognize_google_cloud method for more customizable options for the user and better results in specific cases

HideyoshiNakazone mentioned this pull request Feb 15, 2024

Feature Request: GoogleCloudSpeech - Add method parameters to set use_enhanced and model options #734

Open

Adds Parameters use_enhanced and model to GoogleSpeechAPI docstring

8e0fa40

HideyoshiNakazone force-pushed the add-parameters-google-cloud branch from 052dec3 to 8e0fa40 Compare April 26, 2024 19:13

HideyoshiNakazone added 2 commits April 26, 2024 19:27

Adds Missing Models to Docstring and Adds Missing Parameters to Library Reference File

daca000

Fixes Broken Formatting

abb35fe

ftnext reviewed Apr 29, 2024


HideyoshiNakazone force-pushed the add-parameters-google-cloud branch from fc26183 to e82fd4d Compare November 28, 2024 00:29

Better Implementation of API Params Configuration

4be8026

This implementation is needed for the configuration of Cloud Speech API-specific parameters. This implementation only validates and creates assertions for the two most used params: use_enhanced and model.
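The commit's code is not shown in this conversation, so the following is only a sketch of what "validating the two most used params" might look like. The function name `validate_api_params` and the specific type checks are assumptions for illustration, not the code from 4be8026.

```python
def validate_api_params(api_params):
    """Hypothetical check: assert on use_enhanced and model only, pass the rest through."""
    if "use_enhanced" in api_params:
        assert isinstance(api_params["use_enhanced"], bool), \
            "``use_enhanced`` must be a boolean"
    if "model" in api_params:
        assert isinstance(api_params["model"], str), \
            "``model`` must be a string"
    # Any other Cloud Speech option is returned untouched and left to the API to check.
    return api_params


checked = validate_api_params({"use_enhanced": True, "model": "phone_call"})
```

Note that plain `assert` statements are skipped when Python runs with `-O`, so a stricter variant could raise `ValueError` or `TypeError` instead.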

HideyoshiNakazone force-pushed the add-parameters-google-cloud branch from e82fd4d to 4be8026 Compare November 28, 2024 00:31

Merge remote-tracking branch 'origin/master' into add-parameters-google-cloud

2366761
