Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Community: LlamaCppEmbeddings embed_documents and embed_query #28827

Merged
merged 12 commits into from
Dec 23, 2024

Conversation

keenborder786
Copy link
Contributor

Copy link

vercel bot commented Dec 19, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
langchain ⬜️ Ignored (Inspect) Visit Preview Dec 23, 2024 2:38pm

@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Dec 19, 2024
@dosubot dosubot bot added community Related to langchain-community Ɑ: embeddings Related to text embedding models module 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Dec 19, 2024
@ccurme ccurme self-assigned this Dec 19, 2024
Copy link
Collaborator

@ccurme ccurme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test?

@keenborder786
Copy link
Contributor Author

@ccurme OKAY

@ccurme ccurme marked this pull request as draft December 20, 2024 15:50
@keenborder786 keenborder786 marked this pull request as ready for review December 20, 2024 16:00
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Dec 20, 2024
@keenborder786
Copy link
Contributor Author

@ccurme please review

Copy link
Collaborator

@ccurme ccurme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@@ -116,7 +117,14 @@ def embed_documents(self, texts: List[str]) -> List[List[float]]:
List of embeddings, one for each text.
"""
embeddings = self.client.create_embedding(texts)
return [list(map(float, e["embedding"])) for e in embeddings["data"]]
if not isinstance(embeddings["data"][0]["embedding"][0], list):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible for this line to raise IndexError where it previously would not? e.g., if texts is empty or contains empty strings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion. I didn't mean to request that we raise an error where it previously would not.

From an ignorant reading of the code (i.e., no knowledge of the behavior of the client) the update introduced the potential to raise IndexError where it previously wouldn't. I've pushed a change that passes your tests. Let me know if you see any issues with it.

@keenborder786
Copy link
Contributor Author

@ccurme please review

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Dec 23, 2024
@ccurme ccurme merged commit 41b6a86 into langchain-ai:master Dec 23, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature community Related to langchain-community Ɑ: embeddings Related to text embedding models module lgtm PR looks good. Use to confirm that a PR is ready for merging. size:M This PR changes 30-99 lines, ignoring generated files.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants