Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Update dependencies and configuration files
1 parent 8d3cf27, commit 3550895
Showing 71 changed files with 28,435 additions and 565 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,48 +1,91 @@ | ||
# Data Economy Hackathon | ||
IPFS Huggingface Bridge | ||
|
||
Author - Benjamin Barber @endomorphosis | ||
for transformers.js visit: | ||
https://github.com/endomorphosis/ipfs_transformers_js | ||
|
||
QA / website - Kevin De Haan @coregod360 | ||
for huggingface datasets python library visit | ||
https://github.com/endomorphosis/ipfs_datasets | ||
|
||
CLEANUP / Windows compatibility / Breakfix 03/31/2024 - 04/07/2024 | ||
for orbitdbkit nodejs library visit | ||
https://github.com/endomorphosis/orbitdb-benchmark/ | ||
|
||
Author - Benjamin Barber | ||
QA - Kevin De Haan | ||
|
||
# About | ||
|
||
This is a model manager and wrapper for Hugging Face. It looks up a model in an index built from a collection of models and downloads it from HTTPS, S3, or IPFS, whichever source is fastest. | ||
|
||
# How to use | ||
~~~shell | ||
pip install . | ||
~~~ | ||
|
||
to install | ||
|
||
python3 setup.py | ||
|
||
In your python script | ||
Run ``python3 example.py`` for examples of usage. | ||
|
||
from transformers import AutoModelForSeq2SeqLM | ||
This is designed to be a drop-in replacement that requires changing only 2 lines | ||
|
||
from ipfs_transformers import AutoModelForSeq2SeqLM | ||
|
||
model = AutoModelForSeq2SeqLM.from_auto_download("google/t5_11b_trueteacher_and_anli") | ||
In your python script | ||
~~~python | ||
from transformers import AutoModel | ||
from ipfs_transformers import AutoModel | ||
model = AutoModel.from_auto_download("bge-small-en-v1.5") | ||
~~~ | ||
|
||
or | ||
|
||
from transformers import AutoModelForSeq2SeqLM | ||
~~~python | ||
from transformers import AutoModel | ||
from ipfs_transformers import AutoModel | ||
model = AutoModel.from_ipfs("QmccfbkWLYs9K3yucc6b3eSt8s8fKcyRRt24e3CDaeRhM1") | ||
~~~ | ||
|
||
or to use with s3 caching | ||
~~~python | ||
from transformers import AutoModel | ||
from ipfs_transformers import AutoModel | ||
model = AutoModel.from_auto_download( | ||
model_name="google-bert/t5_11b_trueteacher_and_anli", | ||
s3cfg={ | ||
"bucket": "cloud", | ||
"endpoint": "https://storage.googleapis.com", | ||
"secret_key": "", | ||
"access_key": "" | ||
} | ||
) | ||
~~~ | ||
|
||
# To scrape huggingface | ||
|
||
with interactive prompt: | ||
|
||
~~~shell | ||
node scraper.js [source] [model name] | ||
~~~ | ||
|
||
~~~shell | ||
node scraper.js | ||
~~~ | ||
|
||
from ipfs_transformers import AutoModelForSeq2SeqLM | ||
import a model already defined: | ||
|
||
model = AutoModelForSeq2SeqLM.from_ipfs("QmWJr4M1VN5KpJjqCsJsJg7PDmFoqQYs1BKpYxcdMY1qkh") | ||
~~~shell | ||
node scraper.js hf "modelname" (as defined in your .json files) | ||
~~~ | ||
|
||
To scrape huggingface | ||
import all models previously defined: | ||
|
||
interactive prompt: | ||
~~~shell | ||
node scraper.js hf | ||
~~~ | ||
|
||
node scraper.js | ||
## TODO integrate orbitDB | ||
|
||
import a model: | ||
## TODO finish translating model manager to node.js and replace existing ipfs-cluster wrapper | ||
|
||
node scraper.js hf "modelname" (as defined in your .json files) | ||
## TODO finish translating model manager to browser js and replace existing ipfs-cluster wrapper | ||
|
||
import all models | ||
## TODO integrate transformers.js (browser implementation) | ||
|
||
node scraper.js hf | ||
## TODO integrate bacalhau dockerfile |
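The README's About section says a model is downloaded from https, s3, or ipfs, depending on which source is fastest. The diff does not show how that choice is made, so the following is only a minimal sketch of one way it could work, not the library's implementation. The function name `pick_fastest_source`, the `probes` mapping, and the injectable `clock` are all hypothetical and are not part of `ipfs_transformers`.

```python
import time
from typing import Callable, Dict, Optional


def pick_fastest_source(
    probes: Dict[str, Callable[[], object]],
    clock: Callable[[], float] = time.perf_counter,
) -> Optional[str]:
    """Return the name of the source whose probe completes fastest.

    `probes` maps a source name (e.g. "https", "s3", "ipfs") to a
    zero-argument callable that performs a small test request.
    This is a hypothetical helper, not an ipfs_transformers API.
    """
    best_name: Optional[str] = None
    best_elapsed = float("inf")
    for name, probe in probes.items():
        start = clock()
        try:
            probe()
        except Exception:
            # A source that fails its probe is treated as unavailable.
            continue
        elapsed = clock() - start
        if elapsed < best_elapsed:
            best_name, best_elapsed = name, elapsed
    # None means no source answered its probe.
    return best_name
```

In practice each probe might fetch a small file (such as a model's config) from the corresponding backend, and the caller would then download the full model from the returned source. Probing sequentially keeps the sketch simple; a real implementation might probe in parallel or cache earlier measurements.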
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
from transformers import AutoModel | ||
from ipfs_transformers import AutoModel | ||
|
||
model = AutoModel.from_auto_download("bge-small-en-v1.5") | ||
print(dir(model)) | ||
model = AutoModel.from_ipfs("QmccfbkWLYs9K3yucc6b3eSt8s8fKcyRRt24e3CDaeRhM1") | ||
print(dir(model)) | ||
|
||
|
||
## OPTIONAL S3 Caching ## | ||
|
||
#model = T5Model.from_auto_download( | ||
# model_name="google-bert/t5_11b_trueteacher_and_anli", | ||
# s3cfg={ | ||
# "bucket": "cloud", | ||
# "endpoint": "https://storage.googleapis.com", | ||
# "secret_key": "", | ||
# "access_key": "", | ||
# } | ||
#) | ||
#print(dir(model)) |
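The optional S3 caching example passes an `s3cfg` dictionary with `bucket`, `endpoint`, `secret_key`, and `access_key`, leaving the two credentials as empty placeholders. One way to fill them without committing secrets is to read them from the environment. The sketch below assumes that approach; the helper `build_s3cfg` and the environment variable names `S3_SECRET_KEY` and `S3_ACCESS_KEY` are hypothetical, and whether the library rejects empty credentials is not confirmed by this commit.

```python
import os
from typing import Dict, Mapping

# Keys taken from the s3cfg dictionary shown in example.py.
REQUIRED_KEYS = ("bucket", "endpoint", "secret_key", "access_key")


def build_s3cfg(
    bucket: str,
    endpoint: str,
    env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Build an s3cfg dict, reading credentials from the environment.

    Hypothetical helper: the variable names S3_SECRET_KEY and
    S3_ACCESS_KEY are assumptions, not defined by ipfs_transformers.
    """
    cfg = {
        "bucket": bucket,
        "endpoint": endpoint,
        "secret_key": env.get("S3_SECRET_KEY", ""),
        "access_key": env.get("S3_ACCESS_KEY", ""),
    }
    # Fail early with a clear message instead of passing empty values on.
    missing = [key for key in REQUIRED_KEYS if not cfg[key]]
    if missing:
        raise ValueError("missing s3cfg values: " + ", ".join(missing))
    return cfg
```

The resulting dictionary has the same shape as the `s3cfg` argument in the commented-out example, so it could be passed as `s3cfg=build_s3cfg("cloud", "https://storage.googleapis.com")`.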