New Source Connector: 🤗 "Hugging Face Datasets" (optionally via DuckDB 🦆 ) #30
Comments
@aaronsteers I am interested in working on this and am also willing to work on #31, which is closely related to this!
Awesome! You are the first to chime in, so I think this one is yours! Can you also drop a comment in the other issue? (GitHub won't let me assign otherwise.)
@aaronsteers I've started working on this issue and have begun building a connector for Hugging Face datasets in the Python CDK.
Hi, @ombhardwajj. I apologize for any confusion. I've put this and #31 into the Do you need any assistance on this item or on #31?
@aaronsteers Thanks for the concern. Regarding #31, I am going to solve this issue first, and then I'll start on #31.
Over the past week, I tried to build this but, unfortunately, I have been facing some errors. Despite my efforts to resolve them, I have not been successful. Therefore, I am un-assigning myself from this issue.
Hi @aaronsteers,
@ombhardwajj - I understand. Thanks for looping back. @bala-ceg - If you still want to pick this up, it is yours. 👍
@marcosmarxm @aaronsteers can you please let me know which connector development method I should follow: the Python CDK or the low-code CDK?
Low-code if possible, but if it isn't, you need to use the Python CDK.
Overview
This blog post came out 2 weeks ago, announcing a new feature where DuckDB can now extract from Hugging Face datasets using the hf:// URI prefix. We think this would make an awesome connector for users in our community.
https://duckdb.org/2024/05/29/access-150k-plus-datasets-from-hugging-face-with-duckdb.html
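As a rough illustration of the feature described in the blog post, the sketch below builds an hf:// URI and the DuckDB query that would read from it. The helper names (`hf_dataset_uri`, `build_select`, `preview`) are hypothetical and not part of DuckDB or Airbyte, and the `hf://datasets/<user>/<dataset>/<file>` layout is assumed from the blog post. Running `preview` also assumes the third-party `duckdb` package, a DuckDB version recent enough to support hf://, and network access.

```python
from typing import Any, List, Optional, Tuple


def hf_dataset_uri(repo_id: str, file_path: str) -> str:
    """Build a URI of the assumed form hf://datasets/<user>/<dataset>/<file>."""
    return f"hf://datasets/{repo_id.strip('/')}/{file_path.lstrip('/')}"


def build_select(uri: str, limit: Optional[int] = None) -> str:
    """Return a SELECT statement that reads every column from the given file URI."""
    escaped = uri.replace("'", "''")  # escape single quotes inside the SQL string literal
    query = f"SELECT * FROM '{escaped}'"
    if limit is not None:
        query += f" LIMIT {int(limit)}"
    return query


def preview(uri: str, limit: int = 5) -> List[Tuple[Any, ...]]:
    """Run the query with DuckDB (hypothetical helper; needs duckdb and network access)."""
    import duckdb  # imported lazily so the URI helpers work without the package

    return duckdb.sql(build_select(uri, limit)).fetchall()
```

For example, `preview(hf_dataset_uri("<user>/<dataset>", "data.csv"))` would fetch the first few rows of a CSV file in a public dataset, with the placeholders replaced by a real repository.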
Technical spec
You would write a new source connector which can connect to Hugging Face source datasets and emit records from them, allowing Airbyte users to send these to any Airbyte destination.
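To make "emit records" concrete, here is a minimal sketch of wrapping dataset rows in record messages, assuming the JSON shape used by the Airbyte protocol (`type`, `record.stream`, `record.data`, `record.emitted_at` in milliseconds). The function names are hypothetical and this is not the Python CDK's API; an actual connector would rely on the CDK to produce these messages.

```python
import json
import time
from typing import Any, Dict, Iterable, Iterator, Optional


def to_record_message(
    stream: str, row: Dict[str, Any], emitted_at_ms: Optional[int] = None
) -> Dict[str, Any]:
    """Wrap one row in a RECORD-style message (shape assumed from the Airbyte protocol)."""
    if emitted_at_ms is None:
        emitted_at_ms = int(time.time() * 1000)  # current time in epoch milliseconds
    return {
        "type": "RECORD",
        "record": {"stream": stream, "data": dict(row), "emitted_at": emitted_at_ms},
    }


def emit_records(stream: str, rows: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Yield one JSON line per row, as a source would write to stdout."""
    for row in rows:
        yield json.dumps(to_record_message(stream, row))
```

A source built this way could iterate over rows fetched from a Hugging Face dataset and print each line from `emit_records`, leaving the destination to consume them.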
Notes:
Cache and SQLProcessor.
Definition of Done