community: update VDMS vectorstore with latest VDMS release #28443

Closed
wants to merge 26 commits into from

@cwlacewe cwlacewe commented Dec 2, 2024

Description:

  • Update to use changes made in VDMS v2.10.0 and include some optimizations
  • Use the test suite for testing, as suggested in PR #23729
  • Reran VDMS-related notebooks with these changes

Issue: N/A

Dependencies: N/A

vercel bot commented Dec 2, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| langchain | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Dec 10, 2024 11:20pm |

@dosubot dosubot bot added labels on Dec 2, 2024:

  • size:XXL (This PR changes 1000+ lines, ignoring generated files.)
  • community (Related to langchain-community)
  • Ɑ: vector store (Related to vector store module)
  • 🤖:docs (Changes to documentation and examples, like .md, .rst, .ipynb files. Changes to the docs/ folder)

cwlacewe commented Dec 4, 2024

@efriis, @ccurme,
Could someone please review this PR?

@baskaryan baskaryan left a comment

thanks for the update!

Before fully reviewing, could you provide a bit more context? It seems there are some breaking changes. Are these necessary for compatibility with the new VDMS release, or are those two separate things?

embedding: Optional[Embeddings] = None,
collection_name: str = DEFAULT_COLLECTION_NAME, # DescriptorSet name
distance_strategy: DISTANCE_METRICS = "L2",
embedding: Embeddings,
breaking change

@baskaryan, this change wasn't necessary. I reverted it
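
For context on the line flagged above, here is a minimal sketch of why turning an optional keyword such as `embedding` into a required parameter counts as a breaking change. It assumes the diff replaced `embedding: Optional[Embeddings] = None` with `embedding: Embeddings`; the `Embeddings` stand-in class and the `init_old`, `init_new`, and `call_breaks` names are hypothetical and are not taken from the PR or from LangChain.

```python
from typing import Callable, Optional


class Embeddings:
    """Hypothetical stand-in for an embeddings interface (illustration only)."""


def init_old(client: object, embedding: Optional[Embeddings] = None) -> str:
    # Assumed original shape: callers may omit `embedding`.
    return "ok"


def init_new(client: object, embedding: Embeddings) -> str:
    # Assumed changed shape: `embedding` has no default and must be passed.
    return "ok"


def call_breaks(func: Callable[..., str]) -> bool:
    """Return True if calling func with only a client raises TypeError."""
    try:
        func(object())
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print("old signature breaks existing callers:", call_breaks(init_old))
    print("new signature breaks existing callers:", call_breaks(init_new))
```

Existing code that constructs the store without passing `embedding` keeps working against the first signature and fails with a `TypeError` against the second, which is presumably the concern behind the review comment.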

vercel bot commented Dec 10, 2024

Deployment failed with the following error:

The provided GitHub repository does not contain the requested branch or commit reference. Please ensure the repository is not empty.

@cwlacewe
Contributor Author

@baskaryan could you please review this?

@ccurme ccurme left a comment

@cwlacewe I'm going to close this for now with the recommendation that you / Intel implement the VDMS vector store in a new package (e.g., langchain-vdms or langchain-intel).

We've written a walkthrough on this process here and are happy to answer your questions as you go through it:

https://python.langchain.com/docs/contributing/how_to/integrations/

We are encouraging contributors of LangChain integrations to go this route. This way we don't have to be in the loop for reviews, you're able to properly integration test the code, and you have control over versioning.

I think this route is preferable to reviewing a large code change (thousands of lines), only to have the implementation deprecated if/when the standalone package is released.

Docs would continue to be maintained in the langchain repo.

Let me know what you think!

"FindDescriptor", {"entities": list(), "returned": 0, "status": 0}
)
new_response = [
# {"FindDescriptor": {"returned": 0, "status": 0, "entities": []}}
Suggested change
# {"FindDescriptor": {"returned": 0, "status": 0, "entities": []}}

Comment on lines +531 to +532
# else:
# kwargs["normalize_distance"] = True
Should we handle the else condition here?

ids=ids,
metadatas=metadatas,
batch_size=batch_size,
# Remove IDs if exist-TEST_REMOVAL
What is meant by -TEST_REMOVAL?

@ccurme ccurme closed this Dec 19, 2024
@cwlacewe
Contributor Author

@ccurme, is there a way to get this one integrated? We have customers waiting, and we can work on the package in the new year.
