
[Draft]Add Multimodal RAG notebook #2497

Open
wants to merge 18 commits into base: latest
Conversation

openvino-dev-samples
Collaborator

@openvino-dev-samples openvino-dev-samples commented Nov 1, 2024



@openvino-dev-samples openvino-dev-samples changed the title [Draft]Add Multimodal RAG [Draft]Add Multimodal RAG notebook Nov 4, 2024
transfer to optimum-intel

transfer to optimum-intel
@eaidova
Collaborator

eaidova commented Nov 18, 2024

@openvino-dev-samples for me everything looks good, thanks.

A couple of comments:
Possibly it is better to move to accuracy-aware quantization for the VLM using optimum-cli; you need to provide the --weight-format int4 --dataset contextual options for that (fyi @nikita-savelyevv)

Are there any plans to integrate OV visual language models directly into llama-index?

@openvino-dev-samples
Collaborator Author

openvino-dev-samples commented Nov 18, 2024

@openvino-dev-samples for me everything looks good, thanks.

A couple of comments: Possibly it is better to move to accuracy-aware quantization for the VLM using optimum-cli; you need to provide the --weight-format int4 --dataset contextual options for that (fyi @nikita-savelyevv)

Are there any plans to integrate OV visual language models directly into llama-index?

Thanks for your review; the integration is already done in llama-index:

https://docs.llamaindex.ai/en/stable/examples/multi_modal/openvino_multimodal/

BTW, is there an example of accuracy-aware quantization for phi3-vision?

@nikita-savelyevv
Collaborator

A couple of comments: Possibly it is better to move to accuracy-aware quantization for the VLM using optimum-cli; you need to provide the --weight-format int4 --dataset contextual options for that (fyi @nikita-savelyevv)

I'll add that the algorithm itself needs to be specified, e.g. --weight-format int4 --dataset contextual --awq.

Also, the default number of samples (128) might be too large; it can be reduced with --num-samples 32.

@openvino-dev-samples
Collaborator Author

A couple of comments: Possibly it is better to move to accuracy-aware quantization for the VLM using optimum-cli; you need to provide the --weight-format int4 --dataset contextual options for that (fyi @nikita-savelyevv)

I'll add that the algorithm itself needs to be specified, e.g. --weight-format int4 --dataset contextual --awq.

Also, the default number of samples (128) might be too large; it can be reduced with --num-samples 32.

Hi, in my testing, the accuracy with this configuration is not satisfactory:
optimum-cli export openvino --model {vlm_model_id} {vlm_model_path} --trust-remote-code --weight-format int4 --dataset contextual --awq --num-samples 32
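For reference, the flags under discussion can be collected in one place. The sketch below is a hypothetical helper (the `build_awq_export_cmd` name and the example model/output paths are mine, not part of optimum-intel); only the flags themselves come from this thread.

```python
import shlex


def build_awq_export_cmd(model_id: str, output_dir: str, num_samples: int = 32) -> str:
    """Assemble the optimum-cli export command line discussed in this thread.

    Hypothetical helper for illustration; the flag set mirrors the command
    tried above (--weight-format int4 --dataset contextual --awq --num-samples N).
    """
    args = [
        "optimum-cli", "export", "openvino",
        "--model", model_id,
        output_dir,
        "--trust-remote-code",
        "--weight-format", "int4",          # 4-bit weight compression
        "--dataset", "contextual",          # calibration set for data-aware compression
        "--awq",                            # the algorithm must be named explicitly
        "--num-samples", str(num_samples),  # the default of 128 may be too large
    ]
    return shlex.join(args)


# Example model id and output directory are placeholders, not from the thread.
print(build_awq_export_cmd("microsoft/Phi-3-vision-128k-instruct", "phi3v_ov_int4"))
```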

add load image function
@nikita-savelyevv
Collaborator

nikita-savelyevv commented Nov 19, 2024

A couple of comments: Possibly it is better to move to accuracy-aware quantization for the VLM using optimum-cli; you need to provide the --weight-format int4 --dataset contextual options for that (fyi @nikita-savelyevv)

I'll add that the algorithm itself needs to be specified, e.g. --weight-format int4 --dataset contextual --awq.
Also, the default number of samples (128) might be too large; it can be reduced with --num-samples 32.

Hi, in my testing, the accuracy with this configuration is not satisfactory: optimum-cli export openvino --model {vlm_model_id} {vlm_model_path} --trust-remote-code --weight-format int4 --dataset contextual --awq --num-samples 32

Thanks for the information! Have you compared it against the configuration below?

compression_config = {
    "mode": nncf.CompressWeightsMode.INT4_SYM,
    "group_size": 64,
    "ratio": 0.6,
}

Yes, this configuration gives more reasonable responses than the optimum-cli one.
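For context, the NNCF configuration discussed above can be annotated key by key. This is a dependency-free sketch: the values mirror the snippet in the comment, the annotations are my reading of the usual NNCF weight-compression semantics (not verified in this thread), and the `ov_model` name in the commented call is a placeholder.

```python
# The weight-compression settings from the comment above, annotated.
compression_config = {
    "mode": "INT4_SYM",  # nncf.CompressWeightsMode.INT4_SYM: symmetric 4-bit weights
    "group_size": 64,    # weights quantized in groups of 64 channels
    "ratio": 0.6,        # roughly 60% of weights go to INT4; the rest stay at 8-bit
}

# With nncf installed and the mode given as the real enum, this would
# typically be applied to an OpenVINO model roughly as:
#   import nncf
#   compressed = nncf.compress_weights(ov_model,
#                                      mode=nncf.CompressWeightsMode.INT4_SYM,
#                                      group_size=64, ratio=0.6)
print(compression_config["ratio"])
```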

update the method of audio extraction
@openvino-dev-samples
Collaborator Author

Hi @eaidova, it looks like the CI is out of resources to validate this notebook. Any suggestions? Thanks.

@eaidova
Collaborator

eaidova commented Nov 21, 2024

@openvino-dev-samples I can suggest trying internvl2-1b-instruct or nano-llava for CI testing; they are small enough to fit into precommit.

fix ci issues
@openvino-dev-samples
Collaborator Author

--trust-remote-code --weight-format int4 --dataset contextual --awq --num-samples 32

Sorry, I made a mistake earlier. The result looks good with this accuracy-aware config now.

update with accuracy aware quantization
4 participants