Option to return only specific properties from VectorRetriever.search results #13

Merged 2 commits on Apr 22, 2024

oskarhane (Member)

Adds an option to return only selected node properties, instead of full nodes, from a plain vector search. This makes retrieval faster and leaner and minimizes the post-processing needed.

Usage:

index_name = "movie_plots"
pluck = ["title", "plot"]

retriever = VectorRetriever(driver, index_name, custom_embeddings, pluck=pluck)

query_text = "What's a movie about a wedding?"
results = retriever.search(query_text=query_text, top_k=2)
# [
#   Neo4jRecord(node={'title': 'The Wedding Ringer', 'plot': 'Two weeks shy of his wedding, a socially awkward guy enters into a charade by hiring the owner of a company that provides best men for grooms in need.'}, score=0.9319988489151001),
#   Neo4jRecord(node={'title': "My Best Friend's Wedding", 'plot': "When a woman's long-time friend says he's engaged, she realizes she loves him herself... and sets out to get him, with only days before the wedding."}, score=0.9314284324645996)
# ]
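
One way such a restriction could be pushed down into the query is a Cypher map projection, so that only the requested properties leave the database. The sketch below is an illustration under that assumption and is not taken from this PR: `build_vector_search_query` and its `properties` parameter are hypothetical names, and only the `db.index.vector.queryNodes` procedure and the map projection syntax come from Neo4j itself.

```python
import re

# Hypothetical helper for illustration; not the library's implementation.
# Property names cannot be passed as query parameters in a map projection,
# so they are validated before being interpolated into the query text.
_PROPERTY_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_vector_search_query(properties=None):
    """Return Cypher for a vector index search, optionally projecting properties."""
    query = (
        "CALL db.index.vector.queryNodes($index_name, $top_k, $query_vector) "
        "YIELD node, score "
    )
    if not properties:
        # No restriction requested: return the full node.
        return query + "RETURN node, score"
    for name in properties:
        if not _PROPERTY_NAME.fullmatch(name):
            raise ValueError(f"Unsupported property name: {name!r}")
    # Map projection, e.g. node {.title, .plot}, keeps only the listed properties.
    projection = ", ".join(f".{name}" for name in properties)
    return query + f"RETURN node {{{projection}}} AS node, score"


print(build_vector_search_query(["title", "plot"]))
```

Validating the names keeps the interpolation safe; a real implementation might instead quote the names or filter the properties after the query returns.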

Avoid returning full nodes on plain vector search, for faster and leaner retrieval.
@oskarhane changed the title from "Add pluck argument to VectorRetriever constructor" to "Add pluck argument to VectorRetriever" on Apr 19, 2024
@oskarhane requested a review from willtai on April 22, 2024 06:54
willtai (Contributor) commented Apr 22, 2024

I'm curious: why do we call it pluck? I wonder whether something like return_node_properties or return_properties would be more descriptive.

oskarhane (Member, Author) commented Apr 22, 2024

> I'm curious: why do we call it pluck? I wonder whether something like return_node_properties or return_properties would be more descriptive.

It comes from pluck in Lodash (https://lodash.com/docs/3.10.1#pluck), a very popular utility library. Maybe there's a better word for it, but pluck is pretty descriptive. Maybe pluck_properties is better.

The return_* names don't sit well with me because nothing in the name implies exclusivity. return_only_* would be better, but that's too long, and it's basically what pluck means.

pick_properties might be an option.
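
For readers unfamiliar with the Lodash helpers behind these names, the sketch below gives rough Python analogues. It assumes the Lodash 3.x behaviour, where `_.pluck` collects one property's value from every element of a collection and `_.pick` copies a single object keeping only the named properties. The functions and the `id` field are written here for illustration and are not part of this library.

```python
def pluck(records, key):
    """Rough analogue of Lodash 3.x _.pluck: one property's value from every record."""
    return [record.get(key) for record in records]


def pick(record, keys):
    """Rough analogue of Lodash _.pick: a copy of one record with only the given keys."""
    return {key: record[key] for key in keys if key in record}


# Illustrative records; the "id" values are made up for this example.
movies = [
    {"id": 1, "title": "The Wedding Ringer"},
    {"id": 2, "title": "My Best Friend's Wedding"},
]

titles = pluck(movies, "title")     # a list of values for a single property
trimmed = pick(movies[0], ["title"])  # one record reduced to the chosen properties
print(titles)
print(trimmed)
```

Under these semantics, keeping several named properties on each returned node resembles applying `pick` to every result more than it resembles `pluck`.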

@oskarhane changed the title from "Add pluck argument to VectorRetriever" to "Option to only specific properties from VectorRetriever.search results" on Apr 22, 2024
@oskarhane changed the title from "Option to only specific properties from VectorRetriever.search results" to "Option to return only specific properties from VectorRetriever.search results" on Apr 22, 2024
oskarhane (Member, Author)

After some consideration, I agree with you. Let's go with return_properties @willtai @stellasia

willtai (Contributor) left a comment

LGTM 👍

@oskarhane merged commit b1f63a8 into neo4j:main on Apr 22, 2024
4 checks passed
@oskarhane deleted the pluck branch on April 22, 2024 09:11
2 participants