kaggle document loader #13788

Closed
wants to merge 6 commits into from

leo-gan
Collaborator

@leo-gan leo-gan commented Nov 23, 2023

Added the Kaggle datasets as a DocumentLoader

@leo-gan leo-gan marked this pull request as ready for review November 23, 2023 23:35
@dosubot dosubot bot added the size:L, Ɑ: doc loader, and 🤖:enhancement labels Nov 23, 2023
@leo-gan leo-gan requested a review from efriis November 23, 2023 23:36
@leo-gan leo-gan changed the title kaggle data loader kaggle document loader Nov 24, 2023
Member

@efriis efriis left a comment

Little suggestion to make this even more convenient for folks using datasets from kaggle!

"""Load from `Kaggle` datasets.

Follow these steps to use this loader:
- Register a Kaggle account and create an API token to use this loader.
Member

I haven't used the kaggle package before. Would it be possible to automate these steps in the loader, instead of having it be a CSV loader with some logic around the Kaggle dataset format?

Collaborator Author

It is possible, but I don't have time for it right now. I'm switching this PR to draft. Maybe I'll find time to automate it.
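
For illustration only, here is a minimal sketch of what automating the download step inside a loader might look like. It is not the code from this PR. `Doc`, `kaggle_download`, and `load_kaggle_dataset` are hypothetical names, and `Doc` is a stand-in for a LangChain document so the sketch runs with the standard library alone. The download helper assumes the `kaggle` package's `KaggleApi.authenticate()` and `KaggleApi.dataset_download_files()` calls and an API token that is already configured.

```python
import csv
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict, Iterator, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain document."""

    page_content: str
    metadata: Dict[str, object] = field(default_factory=dict)


def kaggle_download(dataset: str, target_dir: str) -> None:
    """Download and unzip a dataset such as "owner/name" into target_dir.

    The import is deferred on purpose: as far as I know, the kaggle package
    looks for an API token as soon as it is imported.
    """
    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()  # assumes a configured Kaggle API token
    api.dataset_download_files(dataset, path=target_dir, unzip=True)


def load_kaggle_dataset(
    dataset: str,
    download: Callable[[str, str], None] = kaggle_download,
    target_dir: Optional[str] = None,
) -> Iterator[Doc]:
    """Fetch a dataset, then yield one Doc per row of every CSV file in it."""
    target = Path(target_dir or tempfile.mkdtemp(prefix="kaggle_"))
    download(dataset, str(target))
    for csv_path in sorted(target.rglob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            for row_number, row in enumerate(csv.DictReader(handle)):
                # Render each row as "column: value" lines.
                content = "\n".join(f"{key}: {value}" for key, value in row.items())
                yield Doc(
                    page_content=content,
                    metadata={"source": f"{dataset}/{csv_path.name}", "row": row_number},
                )
```

Passing the downloader in as a parameter is a design choice of this sketch: it lets the row-to-document logic be exercised without network access or credentials, and the user only supplies a dataset identifier instead of downloading files by hand first.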


## Installation and Setup

You need to install the `kaggle` Python package.
Member

In the current implementation, I don't think this is true.

Collaborator Author

@leo-gan leo-gan Nov 27, 2023

The package is used to manually download the Kaggle dataset locally. I'd prefer to keep it here for clarity.

@leo-gan leo-gan marked this pull request as draft November 27, 2023 17:05
@hwchase17 hwchase17 closed this Jan 30, 2024
@baskaryan baskaryan reopened this Jan 30, 2024
@leo-gan leo-gan closed this Mar 1, 2024
@leo-gan leo-gan deleted the kaggle-document-loader branch March 1, 2024 19:29
Labels: Ɑ: doc loader, 🤖:enhancement, size:L
4 participants