Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: WhisperTranscriber for v2 #4723

Closed
wants to merge 17 commits into from
Closed

feat: WhisperTranscriber for v2 #4723

wants to merge 17 commits into from

Conversation

ZanSara
Copy link
Contributor

@ZanSara ZanSara commented Apr 21, 2023

Related Issues

Proposed Changes:

  • Implements a WhisperTranscriber for Pipelines v2
  • implements some basic unit tests for the transcriber
  • implements an e2e test

How did you test it?

  • Manual runs
  • CI

Notes for the reviewer

n/a

@github-actions github-actions bot added topic:tests type:documentation Improvements on the docs labels Apr 21, 2023
@ZanSara ZanSara added the 2.x Related to Haystack v2.0 label Apr 21, 2023
@ZanSara ZanSara requested a review from silvanocerza April 26, 2023 15:14
@ZanSara ZanSara marked this pull request as ready for review May 8, 2023 15:56
@ZanSara ZanSara requested a review from a team as a code owner May 8, 2023 15:56
@ZanSara ZanSara requested review from masci and removed request for a team May 8, 2023 15:56
@ZanSara ZanSara mentioned this pull request May 8, 2023
16 tasks
@coveralls
Copy link
Collaborator

coveralls commented May 8, 2023

Coverage Status

Coverage: 37.233% (-0.1%) from 37.367% when pulling a23b56c on v2-whisper into 3a6db68 on main.

@ZanSara ZanSara removed the request for review from masci May 10, 2023 14:44
logger = logging.getLogger(__name__)


OPENAI_TIMEOUT = float(os.environ.get("HAYSTACK_OPENAI_TIMEOUT_SEC", 30))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is never used.

transcriptions.append(transcription)
return transcriptions

@retry(retry=retry_if_not_result(bool), wait=wait_exponential(min=1, max=10))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use the request_with_retry util method in here.



@component
class WhisperTranscriber:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of splitting this into two different components? One local only and one remote, something like WhisperLocalTranscriber and WhisperOpenAITranscriber. 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would simplify the logic quite a bit, also the testing.

If we go this way I think it would be better to open two separate PRs and ditch this one.

@ZanSara ZanSara closed this May 15, 2023
@masci masci deleted the v2-whisper branch September 13, 2023 08:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.x Related to Haystack v2.0 topic:tests type:documentation Improvements on the docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Port WhisperTranscriber to v2 pipelines
3 participants