
feat: Add Optimum Embedders #379

Merged 10 commits into deepset-ai:main on Feb 21, 2024
Conversation

awinml (Contributor) commented Feb 8, 2024

Related Issues

Proposed Changes:

Adds support for running inference with embedding models using the Hugging Face Optimum library. These components are designed to seamlessly run models on the high-speed ONNX Runtime.

Introduces two components:

  • OptimumTextEmbedder, a component for embedding strings.
  • OptimumDocumentEmbedder, a component for computing Document embeddings.

Additional optimizations implemented to bring down inference time:

  • Sorting by sequence length: The text sequences are sorted in descending order of length before the embeddings are computed. See the Sentence Transformers implementation for reference.
  • Dynamic padding: The text sequences are padded only to the longest sequence in the batch, by setting padding=True on the AutoTokenizer. See the Transformers documentation for reference.
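The two optimizations above can be sketched in plain Python. This is a simplified stand-in for the component internals; the function names and the toy token lists are illustrative, not the actual implementation:

```python
def sort_by_length(texts):
    """Sort texts by length (descending), remembering each text's original position."""
    order = sorted(range(len(texts)), key=lambda i: len(texts[i]), reverse=True)
    return [texts[i] for i in order], order

def pad_batch(token_batches, pad_id=0):
    """Dynamic padding: pad only to the longest sequence in this batch
    (what padding=True achieves via the AutoTokenizer)."""
    longest = max(len(tokens) for tokens in token_batches)
    return [tokens + [pad_id] * (longest - len(tokens)) for tokens in token_batches]

def restore_order(embeddings, order):
    """Put embeddings back into the caller's original text order."""
    restored = [None] * len(embeddings)
    for position, original_index in enumerate(order):
        restored[original_index] = embeddings[position]
    return restored
```

Sorting keeps similarly sized sequences in the same batch, so dynamic padding wastes fewer padding tokens; restoring the original order afterwards keeps the optimization invisible to the caller.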

For the TensorRT ONNX runtime, it is recommended to cache the TensorRT engine since it takes time to build. Instructions to pass the necessary parameters for caching have been added to the docstrings.

The conversion to ONNX is cached by default, similar to how Transformers caches models. If the user wishes to modify the caching parameters or tweak them for the other runtimes, the necessary parameters can be passed via the model_kwargs parameter.
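As an illustration, TensorRT engine caching might be configured roughly as follows. The option names come from the ONNX Runtime TensorRT execution provider; that they are forwarded through model_kwargs under a "provider_options" key is an assumption based on the description above, not a confirmed API:

```python
# Hypothetical configuration sketch; the "provider_options" plumbing is an assumption.
model_kwargs = {
    "provider_options": {
        "trt_engine_cache_enable": True,           # cache the built TensorRT engine
        "trt_engine_cache_path": "tmp/trt_cache",  # directory to store the engine in
    }
}
```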

The implementation for the different Pooling Methods is based on the Sentence Transformers implementation.
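As a rough illustration of one of those methods, mean pooling averages the token embeddings while masking out padding tokens, as in Sentence Transformers. The sketch below uses plain Python lists for clarity; the real implementation operates on tensors:

```python
def mean_pooling(token_embeddings, attention_mask):
    """Average the token embeddings of one sequence, counting only real tokens.

    token_embeddings: list of per-token vectors.
    attention_mask:   1 for real tokens, 0 for padding.
    """
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    n_real = 0
    for vector, mask in zip(token_embeddings, attention_mask):
        if mask:
            n_real += 1
            for j in range(dim):
                summed[j] += vector[j]
    # Guard against an all-padding sequence to avoid division by zero.
    return [s / max(n_real, 1) for s in summed]
```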

How did you test it?

Tests were added in optimum_document_embedder.py and optimum_text_embedder.py.

Notes for the reviewer

Using optimum with GPU-based runtimes requires the optimum[onnxruntime-gpu] package, and installing that package requires fully uninstalling optimum first. To work around this limitation, two optional dependencies have been set:

  • For the CPU version: pip install optimum-haystack[cpu], which installs optimum[onnxruntime].
  • For the GPU version: pip install optimum-haystack[gpu], which installs optimum[onnxruntime-gpu].

This approach is not very user friendly, since the component is not usable with a plain pip install optimum-haystack. I am not happy with this approach; please suggest a better one.

Currently, only optimum[onnxruntime] has been added to the dependencies, without support for the GPU package.

Usage Examples:

These examples demonstrate how the embedders can be used to run the sentence-transformers/all-mpnet-base-v2 embedding model using different ONNX runtimes.

On CPU:

from haystack_integrations.components.embedders import OptimumTextEmbedder

text_to_embed = "I love pizza!"

text_embedder = OptimumTextEmbedder(model="sentence-transformers/all-mpnet-base-v2")
text_embedder.warm_up()

print(text_embedder.run(text_to_embed))

# {'embedding': [0.017020374536514282, -0.023255806416273117, ...]}

On GPU using the CUDAExecutionProvider:

from haystack_integrations.components.embedders import OptimumTextEmbedder

text_to_embed = "I love pizza!"

text_embedder = OptimumTextEmbedder(
    model="sentence-transformers/all-mpnet-base-v2", 
    onnx_execution_provider="CUDAExecutionProvider"
)
text_embedder.warm_up()

print(text_embedder.run(text_to_embed))

# {'embedding': [0.017020374536514282, -0.023255806416273117, ...]}

@awinml awinml requested a review from a team as a code owner February 8, 2024 14:47
@awinml awinml requested review from davidsbatista and removed request for a team February 8, 2024 14:47
@github-actions github-actions bot added the type:documentation label Feb 8, 2024
@shadeMe shadeMe self-requested a review February 8, 2024 15:17
shadeMe (Contributor) left a comment

Thanks for the comprehensive PR! I've added a few comments.

sjrl (Contributor) commented Feb 15, 2024

I took a look into this

The versions of optimum and transformers have been pinned to optimum==1.15.0 and transformers==4.36.2, due to bugs in the latest optimum release. Please refer to the following issues for more information: huggingface/optimum#1673 and huggingface/optimum#1675.

to see if there was a way we could avoid pinning the dependencies.

Thankfully, it looks like huggingface/optimum#1675 has been resolved with version 1.16.2, but the other issue still remains.

Another error that I found while running the tests with versions optimum>=1.16.2 and transformers>=4.37.2 was:

onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid input name: token_type_ids

which can be solved with the following code block added right before calling the model

# Only pass required inputs otherwise onnxruntime can raise an error
inputs_to_remove = set(encoded_input.keys()).difference(self.embedding_model.inputs_names)
for key in inputs_to_remove:
    encoded_input.pop(key)

Even though this isn't needed with the current pinned dependencies (at least for the tested models) I think this would be good to add to avoid needing to solve this error in the future.
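The fix can be illustrated stand-alone with plain dicts. The function name is illustrative and the inputs here are toy lists; in the component, encoded_input holds tokenizer tensors and the model's accepted input names come from the ORT model object:

```python
def drop_unexpected_inputs(encoded_input, model_input_names):
    """Remove tokenizer outputs (e.g. token_type_ids) that the ONNX model
    does not declare as inputs, so onnxruntime does not raise INVALID_ARGUMENT."""
    unexpected = set(encoded_input) - set(model_input_names)
    for key in unexpected:
        encoded_input.pop(key)
    return encoded_input
```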

shadeMe (Contributor) commented Feb 19, 2024

@awinml Just wanted to give you an update that we'll move this PR on to our plate to expedite its merging.

awinml (Contributor, Author) commented Feb 19, 2024

Thanks! @sjrl

I added this code block and unpinned the optimum and transformers versions.

# Only pass required inputs otherwise onnxruntime can raise an error
inputs_to_remove = set(encoded_input.keys()).difference(self.embedding_model.inputs_names)
for key in inputs_to_remove:
    encoded_input.pop(key)

All the tests pass successfully!

awinml (Contributor, Author) commented Feb 19, 2024

@shadeMe Thanks for the detailed review! I think this PR is nearly there.

I have pushed most of the changes requested in the review. I will be adding the other pooling methods (as mentioned in #379 (comment)) very shortly.

@awinml awinml requested review from shadeMe and sjrl February 20, 2024 07:36
sjrl (Contributor) commented Feb 20, 2024

@awinml Thanks for adding the Pooling sub module! Would it also be possible to add tests for each of the pooling methods to check their implementations?

@shadeMe shadeMe removed the request for review from davidsbatista February 20, 2024 12:18
@awinml awinml requested a review from shadeMe February 21, 2024 10:06
shadeMe (Contributor) left a comment

Thanks again for the contribution!

@shadeMe shadeMe merged commit ad5a290 into deepset-ai:main Feb 21, 2024
10 checks passed
Labels: integration:optimum, topic:CI, type:documentation

Successfully merging this pull request may close these issues: Add OptimumEmbedder

4 participants