added top_k argument in the run function of ElasticSearcBM25Retriever #130

sahusiddharth · 2023-12-20T10:18:45Z

Currently, top_k can only be set at initialization. This PR lets users override top_k at pipeline runtime as well.

CLAassistant · 2023-12-20T10:18:51Z

All committers have signed the CLA.

masci

Thanks for the PR! I left a couple of comments but it looks good. Would you mind signing the CLA before we merge this? Thanks in advance!

masci · 2023-12-21T07:43:06Z

integrations/elasticsearch/src/elasticsearch_haystack/bm25_retriever.py

@@ -48,12 +48,12 @@ def from_dict(cls, data: Dict[str, Any]) -> "ElasticsearchBM25Retriever":
        return default_from_dict(cls, data)

    @component.output_types(documents=List[Document])
-    def run(self, query: str):
+    def run(self, query: str, top_k: int=None):


top_k should be typed as Optional
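A minimal sketch of the typing this comment asks for, using a stand-in function rather than the real retriever (the name run_sketch and its return value are hypothetical): a parameter that defaults to None is annotated as Optional[int] instead of int.

```python
from typing import Optional, get_type_hints


def run_sketch(query: str, top_k: Optional[int] = None) -> dict:
    """Stand-in for a retriever's run method (hypothetical, not the real component)."""
    # None means "the caller did not override top_k at runtime".
    return {"query": query, "top_k": top_k}


# The annotation now matches the None default.
hints = get_type_hints(run_sketch)
```

With the plain int annotation from the diff, a static type checker would typically flag the None default as inconsistent with the declared type.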

masci · 2023-12-21T07:43:31Z

integrations/elasticsearch/src/elasticsearch_haystack/bm25_retriever.py

        docs = self._document_store._bm25_retrieval(
            query=query,
            filters=self._filters,
            fuzziness=self._fuzziness,
-            top_k=self._top_k,
+            top_k=self._top_k if top_k == None else top_k,


You can simplify this statement with top_k = top_k or self._top_k
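The suggested simplification and an explicit None check can be compared in a small sketch. RetrieverSketch and its method names are hypothetical stand-ins; only the fallback logic from the diff is modelled, not the real component or document store.

```python
from typing import Optional


class RetrieverSketch:
    """Hypothetical stand-in; only the top_k fallback logic is modelled."""

    def __init__(self, top_k: int = 10):
        self._top_k = top_k

    def resolve_with_or(self, top_k: Optional[int] = None) -> int:
        # Falls back to the init-time value for None, but also for 0.
        return top_k or self._top_k

    def resolve_with_none_check(self, top_k: Optional[int] = None) -> int:
        # Falls back only when the caller passed nothing (None).
        return self._top_k if top_k is None else top_k


retriever = RetrieverSketch(top_k=10)
```

The two forms differ only for falsy values such as 0. Whether that matters depends on whether top_k=0 is a meaningful request, which this thread does not discuss.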

masci · 2023-12-21T07:44:16Z

integrations/elasticsearch/src/elasticsearch_haystack/embedding_retriever.py

@@ -64,17 +64,18 @@ def from_dict(cls, data: Dict[str, Any]) -> "ElasticsearchEmbeddingRetriever":
        return default_from_dict(cls, data)

    @component.output_types(documents=List[Document])
-    def run(self, query_embedding: List[float]):
+    def run(self, query_embedding: List[float], top_k:int = None):


Same as above, top_k is optional

masci · 2023-12-21T07:45:20Z

integrations/elasticsearch/src/elasticsearch_haystack/embedding_retriever.py

        """
        Retrieve documents using a vector similarity metric.

        :param query_embedding: Embedding of the query.
+        :param top_k: Maximum number of Documents to return


Thanks for fixing the docs! It's out of the scope of this PR, but would you mind adding a similar docstring to the run method of the other retriever component?

@masci

The changes you suggested will be done soon.

Question: When you mentioned the other retriever component, did you mean the one in the main Haystack repository?

@sahusiddharth No, I mean the BM25 retriever in this integration; see https://github.com/deepset-ai/haystack-core-integrations/blob/main/integrations/elasticsearch/src/elasticsearch_haystack/bm25_retriever.py#L51

sahusiddharth · 2023-12-23T10:09:52Z

@masci All done

anakin87

I made a few small changes.
Thanks for this PR!

changes addressed

added top_k arguement in the run function of ElasticSearcBM25Retriever

82d2af3

sahusiddharth requested a review from a team as a code owner December 20, 2023 10:18

sahusiddharth requested review from masci and removed request for a team December 20, 2023 10:18

github-actions bot added the integration:elasticsearch label Dec 20, 2023

sahusiddharth mentioned this pull request Dec 20, 2023

ElasticSearchBM25Retriever doesn't accept top_k in the run function deepset-ai/haystack#6600

Closed

masci previously requested changes Dec 21, 2023

masci self-assigned this Dec 21, 2023

masci changed the title ~~added top_k arguement in the run function of ElasticSearcBM25Retriever~~ added top_k argument in the run function of ElasticSearcBM25Retriever Dec 22, 2023

sahusiddharth and others added 2 commits December 22, 2023 22:54

added top_k arguement in the the run function of ElasticSearchBM25Ret…

35eb043

…riever

Merge branch 'deepset-ai:main' into feat/accept-top_k-in-run-function…

c074b0a

…-ElasticSearshBM25Retriever

sahusiddharth requested a review from masci December 22, 2023 17:30

Update bm25_retriever.py

02e83ac

anakin87 added 3 commits December 27, 2023 16:41

Update bm25_retriever.py

ca36def

Update bm25_retriever.py

836cb71

Update embedding_retriever.py

3adb30c

anakin87 self-requested a review December 27, 2023 15:43

anakin87 approved these changes Dec 27, 2023

anakin87 merged commit 34ada7a into deepset-ai:main Dec 27, 2023
7 checks passed
