Models to port to MLX-VLM #39
Next release of Llava-Next

TODO:

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union

@dataclass
class TextConfig:
    model_type: str
    hidden_size: int = 4096
    num_hidden_layers: int = 32
    intermediate_size: int = 11008
    num_attention_heads: int = 32
    rms_norm_eps: float = 1e-05
    vocab_size: int = 32064
    num_key_value_heads: int = 32
    rope_theta: float = 1000000
    rope_traditional: bool = False
    rope_scaling: Optional[Dict[str, Union[float, str]]] = None
```
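Configs like this are typically populated from a model's `config.json`. A common pattern (an assumption here for illustration, not necessarily mlx-vlm's exact code) is a `from_dict` classmethod that silently drops keys the dataclass does not declare:

```python
import inspect
from dataclasses import dataclass

# Hypothetical trimmed-down config used only to illustrate the pattern;
# the real TextConfig has many more fields.
@dataclass
class TextConfig:
    model_type: str
    hidden_size: int = 4096
    vocab_size: int = 32064

    @classmethod
    def from_dict(cls, params: dict) -> "TextConfig":
        # Keep only keys that match declared dataclass fields, so extra
        # entries in config.json do not raise a TypeError.
        known = inspect.signature(cls).parameters
        return cls(**{k: v for k, v in params.items() if k in known})

cfg = TextConfig.from_dict(
    {"model_type": "llama", "vocab_size": 32000, "torch_dtype": "bfloat16"}
)
print(cfg)  # TextConfig(model_type='llama', hidden_size=4096, vocab_size=32000)
```

This keeps the config forward-compatible: new keys added upstream are ignored rather than crashing model loading.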
Thanks for the great repo. This should also be on the list: https://github.com/THUDM/CogVLM2

Hey @BoltzmannEntropy and @jrp2014, Thanks for the suggestions! I have added them to the backlog.

MiniCPM-V v2.6
Do you have a link to Florence-2?

Is the above list the ultimate and up-to-date list of supported models @Blaizzy? Thanks for your hard work!
Hey @ChristianWeyer |
Thanks! |
Yes, they have the same arch so there are no changes needed :) |
Hey @Blaizzy, thanks for this great framework. Is there any priority for InternVL? I can see it is present in your list. Just wanted to know if it is planned in your near term. I want to run the model on my MacBook, and mlx-vlm looks to be the best way to do that.
Qwen2-VL-72B would be amazing! |
This recipe seems to work for Qwen2-VL-2B-Instruct:

```shell
python -m mlx_vlm.generate \
  --model Qwen/Qwen2-VL-2B-Instruct \
  --max-tokens 100 \
  --temp 0.0 \
  --image django-roadmap.png \
  --prompt "Describe image in detail, include all text"
```

My results here: https://gist.github.com/simonw/9e02d425cacb902260ec1307e0671e17
Yep they just merged Qwen2-vl support this weekend. |
Molmo please |
Nvidia just dropped multimodal NVLM-D-72B. The benchmark looks pretty good. |
Yap, that's a pretty awesome model! |
Pixtral-12B now has Base model. |
Hey @Blaizzy, could you add ColQwen support? As there already is qwen2-vl, and ColQwen is just an additional linear layer on top, this seems to be low-hanging fruit, also considering Col* is a really hot topic right now. I could really use this for my projects (e.g. local private document search + qa) 😊
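To sketch why this is a small addition: ColQwen-style retrieval adds a single linear projection on top of the base VLM's last hidden states, producing one low-dimensional embedding per token, and scores query/document pairs with ColBERT-style MaxSim late interaction. A minimal NumPy sketch (dimensions and the random projection are illustrative assumptions, not the trained weights):

```python
import numpy as np

# Assumed dimensions for illustration: hidden size of the base model and
# the small per-token embedding dimension used by Col* retrievers.
dim, proj_dim = 1536, 128
rng = np.random.default_rng(0)
W = rng.standard_normal((dim, proj_dim)) / np.sqrt(dim)  # stand-in for the trained head

def embed(hidden_states: np.ndarray) -> np.ndarray:
    # hidden_states: (seq_len, dim) from the base VLM; project and
    # L2-normalize each token embedding.
    emb = hidden_states @ W
    return emb / np.linalg.norm(emb, axis=-1, keepdims=True)

def late_interaction_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    # ColBERT-style MaxSim: for each query token, take the similarity of its
    # best-matching document token, then sum over query tokens.
    return float((query_emb @ doc_emb.T).max(axis=1).sum())
```

In a real port, `embed` would run on the Qwen2-VL backbone's outputs for both text queries and page images, and documents would be ranked by `late_interaction_score`.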
Working on Idefics 3 here: #124 |
@Benjoyo, ColQwen and ColPali are awesome models. At the moment, I'm working on refactoring and some optimisations, so new model ports by me are on hold. However, I appreciate any PRs. I'm here to review and help when needed.
Thank you very much, @pcuenca! It means a lot 🚀 I left a few comments.
Is it possible to bring this under mlx-vlm?
Instructions:
If the model you want is not listed, please suggest it and I will add it.