
docs: add samples to migrate pinecone to alloy db #292

Draft

vishwarajanand wants to merge 10 commits into main from code-snippets

Contributor

vishwarajanand commented Dec 17, 2024 (edited)

Adding code snippets to migrate from Pinecone to Alloy DB.

Upcoming PRs:

More DBs
Cloud build file

Reviewer points:

Region tags
Code structure & readability
Use cases coverage

How to add an index into Pinecone:

https://paste.googleplex.com/5444295470088192

Test Log:

go/pinecone-alloydb-migration


          chore: add samples to migrate pinecone to alloy db

ec9a0b5

product-auto-label bot added api: alloydb samples labels

vishwarajanand changed the title ~~chore: add samples to migrate pinecone to alloy db~~ docs: add samples to migrate pinecone to alloy db


          fix: add Google file header

7d6f68d

averikitsch requested changes

Collaborator

averikitsch left a comment

We may need to outline the changes to the tutorial. Currently, how would users run both the get and the add?

samples/migrations/snippets/alloydb_snippets.py Outdated (resolved)

samples/migrations/snippets/pinecone_snippets.py Outdated (resolved)

averikitsch reviewed

samples/migrations/snippets/snippets_test.py Outdated (resolved)

vishwarajanand added 2 commits

December 18, 2024 18:13


          fix: address PR comments

0ee4df7


          fix: address pr comments

ec7abce

Changes:
1. Made the snippets standalone files.
2. Consolidated the snippet functions into a single file.

averikitsch requested changes

.github/workflows/lint.yml

@@ -44,6 +44,9 @@ jobs:
                     - name: Install Sample requirements
                       run: pip install -r samples/requirements.txt
+                    - name: Install Migration snippets requirements

Collaborator

averikitsch Dec 19, 2024

nit: I have been adding all sample reqs to https://github.com/googleapis/langchain-google-alloydb-pg-python/blob/main/samples/requirements.txt so that this workflow file doesn't need to be updated. I am also OK with this pattern of adding the new req file to the workflow.

Contributor Author

vishwarajanand Dec 23, 2024

Tried to follow this pattern in the current version of the snippets.

samples/migrations/requirements.txt

Comment on lines +8 to +9

		protobuf==5.29.1
		grpcio-tools==1.67.1

Collaborator

averikitsch Dec 19, 2024

Why are these needed?

Contributor Author

vishwarajanand Dec 22, 2024

These seem to be required for Milvus (ref).

samples/migrations/pinecone_migration.py Outdated (resolved)

samples/migrations/pinecone_migration.py Outdated (resolved)

samples/migrations/pinecone_migration.py Outdated

+              """
+              # TODO(dev): Replace the values below
+              pinecone_api_key = os.environ["PINECONE_API_KEY"]

Collaborator

averikitsch Dec 19, 2024

We discussed that these would be variables to be set, as in https://github.com/GoogleCloudPlatform/python-docs-samples/blob/140b9dae356a8ffb4aa587571c4ee1eb1ae99e39/automl/snippets/get_model.py#L21, not environment variables.
We also discussed outlining the instructions that we would give to the user/TW. Did you document in our notes that we would require users to set all the environment variables?

I would prefer that this be updated to use variables so there is no additional time and friction to understand and validate the environment variable values.

Contributor Author

vishwarajanand Dec 23, 2024

Limited the use of env vars to only the tests
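
For illustration, a minimal sketch of the split discussed in this thread: the sample takes plain variables with placeholder values, and only a test-side helper reads the environment. The function names `build_migration_settings` and `settings_from_env` and the placeholder string `YOUR_PINECONE_API_KEY` are hypothetical and not taken from the PR; the variable names and defaults mirror the ones quoted in this review.

```python
import os
from typing import Mapping


def build_migration_settings(
    pinecone_api_key: str = "YOUR_PINECONE_API_KEY",  # hypothetical placeholder
    table_name: str = "alloy_db_migration_table",
    vector_size: int = 768,
) -> dict:
    """Collect the sample's settings from plain variables, not env vars."""
    # TODO(dev): Replace the placeholder values above before running the sample.
    return {
        "pinecone_api_key": pinecone_api_key,
        "table_name": table_name,
        "vector_size": vector_size,
    }


def settings_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Test-only helper (hypothetical): override the defaults from env vars."""
    defaults = build_migration_settings()
    return build_migration_settings(
        pinecone_api_key=env.get("PINECONE_API_KEY", defaults["pinecone_api_key"]),
        table_name=env.get("TABLE_NAME", defaults["table_name"]),
        # Environment values are strings, so convert the size explicitly.
        vector_size=int(env.get("VECTOR_SIZE", str(defaults["vector_size"]))),
    )
```

With this shape, a reader of the published snippet only sees variables to edit, while the test suite can still inject real values by calling `settings_from_env()`.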

samples/migrations/pinecone_migration.py Outdated

+                  alloydb_engine = await aget_alloydb_client()
+                  # [START pinecone_alloydb_migration_get_alloydb_vectorstore]
+                  from alloydb_snippets import aget_vector_store, get_embeddings_service

Collaborator

averikitsch Dec 19, 2024

I don't want the region tag to include the new methods. Please update this so it's clean only using the langchain methods.
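
To make the concern concrete, here is a rough sketch of how a region tag determines what ends up in the published snippet. The `extract_region` helper is hypothetical (it is not part of this repository or of any docs tooling named in the PR), and the `# [END ...]` line in the sample text is an assumption, since only the `[START ...]` line is visible in the diff.

```python
def extract_region(source: str, tag: str) -> list[str]:
    """Return the stripped lines between '# [START tag]' and '# [END tag]'."""
    start, end = f"# [START {tag}]", f"# [END {tag}]"
    lines: list[str] = []
    inside = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == start:
            inside = True
        elif stripped == end and inside:
            return lines
        elif inside:
            lines.append(stripped)
    raise ValueError(f"region tag not found or not closed: {tag}")


# Sample text based on the lines under review; the END marker is assumed.
SAMPLE = """
alloydb_engine = await aget_alloydb_client()
# [START pinecone_alloydb_migration_get_alloydb_vectorstore]
from alloydb_snippets import aget_vector_store, get_embeddings_service
# [END pinecone_alloydb_migration_get_alloydb_vectorstore]
"""

region = extract_region(SAMPLE, "pinecone_alloydb_migration_get_alloydb_vectorstore")
# True here means the rendered snippet would expose the local helper module.
leaks_helper_import = any("alloydb_snippets" in line for line in region)
```

Everything between the two markers is what a reader would see, so an import from a local helper module inside the region shows up in the rendered sample.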

samples/migrations/pinecone_migration.py Outdated

Comment on lines 143 to 144

		embeddings_service = get_embeddings_service(pinecone_vector_size)
		vs = await aget_vector_store(

Collaborator

averikitsch Dec 19, 2024

region tags should include the new wrapper methods

samples/migrations/pinecone_migration.py Outdated (resolved)

samples/migrations/alloydb_snippets.py Outdated

Comment on lines 21 to 31

+              project_id = os.environ["PROJECT_ID"]
+              region = os.environ["REGION"]
+              cluster = os.environ["CLUSTER_ID"]
+              instance = os.environ["INSTANCE_ID"]
+              db_name = os.environ["DATABASE_ID"]
+              # TODO(dev): (optional values) Replace the values below
+              db_user = os.environ.get("DB_USER", "")
+              db_pwd = os.environ.get("DB_PASSWORD", "")
+              table_name = os.environ.get("TABLE_NAME", "alloy_db_migration_table")
+              vector_size = int(os.environ.get("VECTOR_SIZE", "768"))

Collaborator

averikitsch Dec 19, 2024

See note on variables not env vars

samples/migrations/alloydb_snippets.py Outdated (resolved)

vishwarajanand added 6 commits

December 22, 2024 04:00


          Merge branch 'main' into code-snippets

3c38878


          chore: address some pr comments

7644c9b


          fix: lint

b17a017


          fix: lint


          fix: lint add type hints to params of main method

4daacc5


          chore: remove custom id column requirement

ab89d03


Labels

api: alloydb samples