Multimodal retrieval with Amazon reviews dataset and LLVM reranking #1477

Open
wants to merge 3 commits into
base: master
from

Conversation

stefanwebb
Contributor

This is a notebook I used for an upcoming AWS Open Source Developers YouTube video. It modifies an existing notebook to use only open-source models.


@sre-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: stefanwebb

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

review-notebook-app bot commented Dec 16, 2024

codingjaguar commented on 2024-12-16T13:29:11Z
----------------------------------------------------------------

Needs a better title. E.g. "Multimodal retrieval with Amazon reviews dataset and LLVM reranking" sounds better than this.


stefanwebb commented on 2024-12-20T21:06:29Z
----------------------------------------------------------------

Done!


codingjaguar commented on 2024-12-16T13:29:12Z
----------------------------------------------------------------

Please elaborate on what this notebook is trying to do, to attract the reader and lay out the background. A good opening is very important for retaining the reader's attention.

Good examples are:

https://milvus.io/docs/graph_rag_with_milvus.md

https://milvus.io/docs/contextual_retrieval_with_milvus.md


stefanwebb commented on 2024-12-20T21:06:42Z
----------------------------------------------------------------

Done!


codingjaguar commented on 2024-12-16T13:29:13Z
----------------------------------------------------------------

Does this notebook have a companion blog post? Why are there only clipped figures here and no overview diagram?


stefanwebb commented on 2024-12-20T21:07:38Z
----------------------------------------------------------------

This is from a slide in a presentation I gave. I'll include the entire figure each time, with the irrelevant bits greyed out.

stefanwebb commented on 2024-12-20T21:26:13Z
----------------------------------------------------------------

Done!


codingjaguar commented on 2024-12-16T13:29:14Z
----------------------------------------------------------------

the product review text?


stefanwebb commented on 2024-12-20T21:08:05Z
----------------------------------------------------------------

I.e. the text from the customer reviews

stefanwebb commented on 2024-12-20T21:08:20Z
----------------------------------------------------------------

I'll make it clearer by rewording

stefanwebb commented on 2024-12-20T21:27:32Z
----------------------------------------------------------------

Done!


codingjaguar commented on 2024-12-16T13:29:15Z
----------------------------------------------------------------

What about adding a short description like "It can embed text and image information into the same latent space, thus enabling multimodal search."
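
To make the "same latent space" idea concrete, here is a minimal sketch of how a text query can rank images once both are embedded as vectors of the same dimension. The three-dimensional vectors are made-up stand-ins for real model outputs (the notebook's embedding model is not shown in this thread), and the names `cosine_similarity` and `rank_images` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(query_vec, image_vecs):
    """Return image names sorted from most to least similar to the query."""
    scores = {name: cosine_similarity(query_vec, vec) for name, vec in image_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings (assumed values, not produced by any real model).
image_vecs = {
    "leopard.jpg": [0.9, 0.1, 0.0],
    "phone_case.jpg": [0.0, 0.2, 0.9],
}
text_query_vec = [0.8, 0.2, 0.1]  # pretend embedding of a text query

ranking = rank_images(text_query_vec, image_vecs)
```

Because the text and image vectors live in one space, the same similarity function serves text-to-image, image-to-image, and mixed queries.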



codingjaguar commented on 2024-12-16T13:29:15Z
----------------------------------------------------------------

nit, what about

easy methods -> convenient util functions

imgs -> img


stefanwebb commented on 2024-12-20T21:39:56Z
----------------------------------------------------------------

Done! I think img's seems more natural than img. (You can use 's to separate a word and an -s for plural)


codingjaguar commented on 2024-12-16T13:29:16Z
----------------------------------------------------------------

We pass each image in the downloaded dataset through the embedding model to obtain its output vector. Embedding may take some time. For example, a MacBook Pro M3 embeds around nine images per second. Throughput is likely to be much higher on more powerful hardware such as an NVIDIA GPU.
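
A reader could check the throughput on their own machine with something like the sketch below. `measure_throughput` and `estimated_seconds` are hypothetical helpers, and `fake_embed` is only a placeholder for the notebook's real embedding call, which is not shown in this thread.

```python
import time


def measure_throughput(embed_fn, items):
    """Embed every item and return (vectors, items_per_second)."""
    start = time.perf_counter()
    vectors = [embed_fn(item) for item in items]
    elapsed = time.perf_counter() - start
    rate = len(items) / elapsed if elapsed > 0 else float("inf")
    return vectors, rate


def estimated_seconds(num_images, images_per_second):
    """Rough wall-clock estimate for embedding a whole dataset."""
    return num_images / images_per_second


def fake_embed(path):
    # Placeholder: a real embedding model call would go here.
    return [float(len(path)), 0.0]


vectors, rate = measure_throughput(fake_embed, ["a.jpg", "bb.jpg"])
```

At the quoted rate of roughly nine images per second, `estimated_seconds(900, 9)` gives 100 seconds for 900 images.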



codingjaguar commented on 2024-12-16T13:29:17Z
----------------------------------------------------------------

To perform efficient similarity search over embeddings, we need to store them in a vector database. In this demo, we use Milvus Lite, a lightweight version of Milvus, a popular open-source vector database.

Setting uri to a local file path makes Milvus Lite persist all data to that file.
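
As a minimal sketch of the point about `uri`: with `pymilvus` installed, passing a local file path to `MilvusClient` opens Milvus Lite backed by that file. The helper name `open_local_client` and the file name `./milvus_demo.db` are assumptions for illustration, not taken from the notebook.

```python
def open_local_client(db_path="./milvus_demo.db"):
    """Open Milvus Lite backed by a local file (assumes `pip install pymilvus`)."""
    # Imported lazily so this sketch can be loaded even where pymilvus is absent.
    from pymilvus import MilvusClient

    # A local file path as the uri selects Milvus Lite; data persists in that file.
    return MilvusClient(uri=db_path)
```

Calling `open_local_client()` again later with the same path reopens the same data, which is why rerunning a notebook can insert duplicates.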



codingjaguar commented on 2024-12-16T13:29:18Z
----------------------------------------------------------------

Nit: I feel this and the following 4 code blocks can be merged. The logic is straightforward.



codingjaguar commented on 2024-12-16T13:29:19Z
----------------------------------------------------------------

Nit: delete the collection before creating it? (In case people run the notebook repeatedly and insert duplicate data.)

https://zilliverse.feishu.cn/wiki/RrxgwEVooidRpEkH3pqcJMpTnA6#Gs2udxUMboOc0sxeGEJcHnF1nlc
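
The drop-then-create pattern suggested above might look like the sketch below. It assumes `client` is a `pymilvus.MilvusClient`; the function name `recreate_collection` and its arguments are hypothetical.

```python
def recreate_collection(client, name, dim):
    """Drop `name` if it already exists, then create it fresh.

    `client` is assumed to be a pymilvus MilvusClient (or anything exposing
    the same three methods).
    """
    if client.has_collection(collection_name=name):
        client.drop_collection(collection_name=name)
    # auto_id=True lets Milvus assign primary keys on insert.
    client.create_collection(collection_name=name, dimension=dim, auto_id=True)
```

With this in place, rerunning the notebook from the top starts from an empty collection instead of appending a second copy of the data.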



codingjaguar commented on 2024-12-16T13:29:20Z
----------------------------------------------------------------

Since we use auto ID, we can omit the id field.
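
For illustration, rows built for an auto-ID collection simply leave out the primary key. This is a sketch under assumptions: the helper `build_rows` and the field names `vector` and `image_path` are hypothetical and may differ from the notebook's schema.

```python
def build_rows(vectors, image_paths):
    """Pair each vector with its image path; no "id" key, since Milvus assigns it."""
    if len(vectors) != len(image_paths):
        raise ValueError("vectors and image_paths must have the same length")
    return [
        {"vector": list(vec), "image_path": path}
        for vec, path in zip(vectors, image_paths)
    ]


rows = build_rows([[0.1, 0.2], [0.3, 0.4]], ["a.jpg", "b.jpg"])
# With a real MilvusClient and an auto_id collection, the rows would then be
# inserted with: client.insert(collection_name="demo", data=rows)
```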



codingjaguar commented on 2024-12-16T13:29:21Z
----------------------------------------------------------------

Can we hide the lengthy output? (I heard there is a trick to do that, or you can clear the output.)
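
One possible way to keep a cell quiet, sketched with the standard library only: capture whatever a call prints instead of letting it reach the notebook. `run_quietly` and `noisy_insert` are hypothetical names; `noisy_insert` stands in for whichever call produces the long output.

```python
import contextlib
import io


def run_quietly(fn, *args, **kwargs):
    """Call fn while capturing anything it prints; return (result, captured_text)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = fn(*args, **kwargs)
    return result, buffer.getvalue()


def noisy_insert(rows):
    # Stand-in for a verbose call; prints one line per row.
    for row in rows:
        print("inserted", row)
    return len(rows)


count, log = run_quietly(noisy_insert, [1, 2, 3])
```

In a Jupyter notebook, the `%%capture` cell magic or `IPython.display.clear_output()` are alternatives, and a trailing semicolon suppresses the display of a cell's last expression.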



codingjaguar commented on 2024-12-16T13:29:21Z
----------------------------------------------------------------

Now all the data is ingested into the Milvus vector database. We are ready for multimodal search!



codingjaguar commented on 2024-12-16T13:29:22Z
----------------------------------------------------------------

Nit: why is the leopard squeezed horizontally? Maybe due to "display(combined_image.resize((512, 512)))"?


stefanwebb commented on 2024-12-20T21:50:51Z
----------------------------------------------------------------

I think because we have 3 rows but 4 columns... I'm going to keep it unchanged since changing it would involve rerunning everything
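
If the notebook is rerun at some point, the squeeze could be avoided by scaling the grid to fit within 512 pixels while keeping its aspect ratio. This is a sketch only; `fit_within` is a hypothetical helper, and the 400x300 size assumes a grid of 4 columns by 3 rows of square 100-pixel tiles.

```python
def fit_within(width, height, max_side=512):
    """Scale (width, height) so the longer side equals max_side, keeping the ratio."""
    scale = max_side / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 4-column by 3-row grid of square 100 px tiles is 400 x 300 pixels.
new_size = fit_within(400, 300)

# With a PIL image, whose .size is (width, height), this would be used as:
# display(combined_image.resize(fit_within(*combined_image.size)))
```

Forcing a 4:3 grid into a 512x512 square is what compresses it horizontally; the size computed here keeps the 4:3 shape.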


codingjaguar commented on 2024-12-16T13:29:23Z
----------------------------------------------------------------

Nit: is it LLVM or VLM (vision language model)?

Also, it seems the following code isn't using "HuggingFace's Transformers library."


stefanwebb commented on 2024-12-20T21:52:59Z
----------------------------------------------------------------

Reworded to make it clearer that I'm using phi_3_vision_mlx library

stefanwebb commented on 2024-12-20T21:53:26Z
----------------------------------------------------------------

I think LLVM is more common


codingjaguar commented on 2024-12-16T13:29:24Z
----------------------------------------------------------------

This part confuses me a little. Previously it was using Milvus Lite, so why does it become a Zilliz Cloud commercial here?

I feel it's fine to focus on Milvus and list it as a blog post on milvus.io.

We need to anchor on one or the other and polish the marketing language here.


stefanwebb commented on 2024-12-20T21:54:18Z
----------------------------------------------------------------

That's a typo... Fixed now


codingjaguar commented on 2024-12-16T13:29:25Z
----------------------------------------------------------------

Ditto here: if we are targeting milvus.io, linking to a Milvus content aggregation page on zilliz.com is a little weird. (Linking to the Zilliz signup is fine, and that's what we want for driving traffic there.)


stefanwebb commented on 2024-12-20T21:57:16Z
----------------------------------------------------------------

Fixed

