Add visual-question-answering / multimodal support to gradio notebook tasks #1392
Comments
Thanks @Bedrovelsen! We would love your help adding that. I messaged you on Discord so our team can work with you to make sure you can get this set up!
Sounds good
Just copying over the quick implementation overview from Discord here:
I believe the helpers for validating and retrieving the image from attachments can be kept the same. With the parser implemented, we can expose it in the extension here: https://github.com/lastmile-ai/aiconfig/blob/main/extensions/HuggingFace/python/src/aiconfig_extension_hugging_face/__init__.py. For testing the extension, please see the README instructions: https://github.com/lastmile-ai/aiconfig/blob/main/extensions/HuggingFace/python/README.md. Then, I would recommend importing and registering the new parser in https://github.com/lastmile-ai/aiconfig/blob/main/cookbooks/Gradio/aiconfig_model_registry.py with id "Visual Question Answering", and then following the Getting Started instructions in https://github.com/lastmile-ai/aiconfig/edit/main/cookbooks/Gradio/README.md to open the huggingface.aiconfig.json file with the new parser registered. On the UI side, we will need to add a new PromptSchema to the client so that the parser's input and settings render nicely. I can implement that shortly.
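To make the "register the new parser under an id" step above concrete, here is a minimal, self-contained sketch of the pattern. It deliberately does not import aiconfig: `ParserRegistry`, `VisualQuestionAnsweringParser`, `build_request`, and the example image URL are hypothetical stand-ins invented for illustration, and the real registration call in `aiconfig_model_registry.py` may look different. Only the id string "Visual Question Answering" comes from the overview above.

```python
from typing import Dict


class ParserRegistry:
    """Hypothetical stand-in for a model-parser registry (not the aiconfig API)."""

    def __init__(self) -> None:
        self._parsers: Dict[str, object] = {}

    def register(self, parser: object, parser_id: str) -> None:
        # Map the id shown in the notebook UI to the parser instance.
        self._parsers[parser_id] = parser

    def get(self, parser_id: str) -> object:
        return self._parsers[parser_id]


class VisualQuestionAnsweringParser:
    """Hypothetical parser: pairs an image attachment with a text question."""

    def build_request(self, image_uri: str, question: str) -> Dict[str, str]:
        # Validate inputs before building the payload (assumed behavior).
        if not image_uri:
            raise ValueError("an image attachment is required")
        if not question:
            raise ValueError("a question is required")
        return {"image": image_uri, "question": question}


def register_model_parsers(registry: ParserRegistry) -> None:
    # Register the new parser under the id suggested in the overview.
    registry.register(VisualQuestionAnsweringParser(), "Visual Question Answering")


registry = ParserRegistry()
register_model_parsers(registry)
parser = registry.get("Visual Question Answering")
request = parser.build_request("https://example.com/cat.png", "What animal is this?")
```

In the actual codebase, the parser class would come from the Hugging Face extension package and the registration would go through whatever mechanism `aiconfig_model_registry.py` already uses for the other tasks.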
# Implement HuggingFaceVisualQuestionAnsweringRemoteInferencePromptSchema
For #1392. This adds the prompt schema so that visual question answering prompts get a proper UI for input and settings.
…ma (#1396) Implement HuggingFaceVisualQuestionAnsweringRemoteInferencePromptSchema
Whoops, I linked #1396, which has the schema changes, and that auto-closed this issue. This issue is still open.
Enjoying the recent gradio notebook stuff!
I was curious about when, or if, support for an additional Hugging Face task option of "visual question answering" is planned.
If there are no current plans to add this, could you give a quick overview of how to add a new task category to the gradio notebook codebase? I can of course read through the current gradio notebook code and figure it out on my own, but guidance from the team would be preferred so that I follow best practices for contributing.