added encoder to whisper function in LLMWhisperClient #11

Merged
hari-kuriakose merged 4 commits into Zipstack:main on Oct 30, 2024

ShoaibMajidDar
Contributor

When processing Arabic documents, the Arabic text in the response was not returned correctly.
I added an encoder argument to the whisper function in LLMWhisperClient. Decoding the response as UTF-8 resolved the issue, and the Arabic text is now handled correctly. The encoder still defaults to ISO-8859-1 but can now be adjusted as needed.
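
For illustration, here is a minimal sketch of the failure mode and the fix described above, using only the Python standard library. The helper `decode_body` and its `encoding` parameter are hypothetical stand-ins for the new argument, not the client's actual code, and the sketch assumes the server sends UTF-8 bytes.

```python
def decode_body(raw: bytes, encoding: str = "ISO-8859-1") -> str:
    """Decode raw response bytes with a caller-chosen encoding (hypothetical helper)."""
    return raw.decode(encoding)


arabic = "مرحبا"
raw = arabic.encode("utf-8")  # assumption: the bytes the server would send

# ISO-8859-1 maps every byte to some character, so this never raises;
# it silently produces mojibake instead of the Arabic text.
garbled = decode_body(raw)

# Decoding the same bytes as UTF-8 recovers the original text.
fixed = decode_body(raw, "utf-8")

print(garbled != arabic, fixed == arabic)
```

Because the wrong decoding fails silently rather than raising an error, an explicit encoding argument is a reasonable way to let callers override the default.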

@hari-kuriakose
Contributor

@ShoaibMajidDar Thanks for the contribution!
This would really help. Just a couple of minor suggestions, and the rest looks good.

Contributor

@jaseemjaskp left a comment

LGTM

@chandrasekharan-zipstack
Collaborator

Ideally, we should forward the encoding in the request headers so that LLMWhisperer itself understands it and the requests library then handles it. The current approach handles UTF-8 correctly, which should cover most use cases. Any requirement to support other encoding schemes will be tackled properly in both the client and the server in the future.
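
As background for the header-based approach: the requests library picks its text encoding from the `charset` parameter of the response `Content-Type` header, and falls back to ISO-8859-1 for `text/*` responses that declare none. The sketch below mimics that lookup with the standard library only. The function name `charset_from_content_type` is hypothetical, and the headers LLMWhisperer actually sends are not shown in this thread.

```python
from email.message import Message


def charset_from_content_type(content_type: str, fallback: str = "ISO-8859-1") -> str:
    """Return the charset declared in a Content-Type value, or the fallback (hypothetical helper)."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    # get_param returns None when the parameter is absent.
    return charset if isinstance(charset, str) else fallback


declared = charset_from_content_type("text/plain; charset=utf-8")
undeclared = charset_from_content_type("text/plain")
print(declared, undeclared)
```

If the server declared `charset=utf-8` in its response, the client would decode correctly without needing an explicit argument.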

@hari-kuriakose
Contributor

@chandrasekharan-zipstack Agreed, let's take up the improvements later as required.

@hari-kuriakose hari-kuriakose merged commit 2929529 into Zipstack:main Oct 30, 2024
1 check failed