FastEmbed: Sparse Embedding crash if empty chunk #918
Comments
Maybe this should be fixed at the root, in the FastEmbed package itself (@Anush008? 👀)
Note: I guess this also happens when the document content is only whitespace, for example (due to some splitter behaviour).
Hi @lambda-science.
Can you share a reproducible snippet with FastEmbed? I tried doing this:
I got empty vectors as expected.
So it probably comes from the Haystack integration. Can you try this:
from haystack_integrations.components.embedders.fastembed import FastembedSparseTextEmbedder
query_sparse_embedder = FastembedSparseTextEmbedder(model="Qdrant/bm25")
query_sparse_embedder.warm_up()
query_sparse_embedder.run("")
Crash:
Works too.
Huh? fastembed==0.3.1 right now
@Anush008 fastembed==0.3.4 solved it! Thank you very much, and sorry for bothering you! (Tbh this bug was not mentioned in the 0.3.4 patch notes ahahah)
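For anyone landing here later, a minimal sketch of how one might check whether an installed version is at or above the release reported to fix this in the thread (0.3.4). The helper names `parse_version` and `has_reported_fix` are made up for illustration, and the parsing assumes plain dotted numeric versions such as `0.3.1`.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain dotted version such as '0.3.4' into (0, 3, 4).

    Assumption: no pre-release or local suffixes (e.g. '0.3.4rc1' is not handled).
    """
    return tuple(int(part) for part in version.split("."))


def has_reported_fix(installed: str, fixed_in: str = "0.3.4") -> bool:
    """Return True if `installed` is at least the version reported as fixed in this thread."""
    # Tuples compare element by element, so (0, 10, 0) >= (0, 3, 4) is True.
    return parse_version(installed) >= parse_version(fixed_in)
```

The installed version string can be read with the standard library call `importlib.metadata.version("fastembed")` and passed to `has_reported_fix`.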
Describe the bug
Today while embedding a huge document I got this error:
Basically, my chunks can sometimes be empty (who knows why), which crashes the component.
It's a behaviour I had already noticed in the past when migrating my data to FastEmbed sparse embeddings.
I solved it by doing this:
That adds an empty sparse embedding in this case. For this component, though, I'm not sure how to do the same.
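The original workaround snippet is not preserved here, so the following is only a rough sketch of the idea described above: skip blank chunks when calling the embedder and put an empty sparse embedding in their place, so the output stays aligned with the input. `SparseVector` and `embed_with_empty_fallback` are hypothetical names, not FastEmbed or Haystack types, and `embed_fn` stands for whatever function actually produces the sparse embeddings.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SparseVector:
    """Hypothetical stand-in for a sparse embedding (not a library type)."""
    indices: List[int] = field(default_factory=list)
    values: List[float] = field(default_factory=list)


def embed_with_empty_fallback(
    texts: List[str],
    embed_fn: Callable[[List[str]], List[SparseVector]],
) -> List[SparseVector]:
    """Embed texts, substituting an empty sparse vector for blank inputs."""
    # Positions of chunks that contain something other than whitespace.
    keep = [i for i, text in enumerate(texts) if text.strip()]
    # Only the non-blank chunks are sent to the real embedder.
    embedded = embed_fn([texts[i] for i in keep]) if keep else []
    # Start with empty vectors everywhere, then fill in the real ones.
    result = [SparseVector() for _ in texts]
    for position, vector in zip(keep, embedded):
        result[position] = vector
    return result
```

Because blank chunks never reach `embed_fn`, this sidesteps the crash regardless of how the underlying embedder handles empty strings, and it also covers the whitespace-only case.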
To Reproduce
Describe your environment (please complete the following information):