
Fix VRAM required by Qwen2.5-Coder-1.5B-Instruct model #632

Merged: 1 commit merged into mlc-ai:main on Nov 22, 2024

Conversation

@felladrin (Contributor) commented on Nov 20, 2024

Currently, it has the same VRAM values as the `Qwen2.5-Coder-7B-Instruct` model.

This change fixes it by using the same values as the `Qwen2.5-1.5B-Instruct` model, as shown in the screenshot below:

[Screenshot: VRAM values of the `Qwen2.5-1.5B-Instruct` model]

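For context, a minimal sketch of the kind of model-list entry this PR touches, assuming the `ModelRecord` shape used in web-llm's `prebuiltAppConfig` (`model` / `model_id` / `model_lib` / `vram_required_MB`). The URL, model-lib string, and numeric value below are illustrative placeholders, not the actual figures changed in this PR.

```ts
import type { ModelRecord } from "@mlc-ai/web-llm";

// Illustrative entry only; field names assumed from web-llm's prebuiltAppConfig.
const qwen25Coder15B: ModelRecord = {
  model: "https://huggingface.co/mlc-ai/Qwen2.5-Coder-1.5B-Instruct-q4f16_1-MLC",
  model_id: "Qwen2.5-Coder-1.5B-Instruct-q4f16_1-MLC",
  model_lib: "<model-lib-wasm-url>",
  // Before this PR the entry carried the 7B model's VRAM figure; per the
  // description it should match the 1.5B model instead (placeholder number here).
  vram_required_MB: 1600,
  low_resource_required: true,
};
```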
@CharlieFRuan (Contributor) left a comment:

LGTM, thanks for the catch and the fix!

@CharlieFRuan merged commit 6504047 into mlc-ai:main on Nov 22, 2024
1 check passed
@felladrin deleted the patch-1 branch on November 22, 2024, 12:09
CharlieFRuan added a commit that referenced this pull request on Nov 22, 2024:
### Change

- #635
  - Integrate with `web-xgrammar`
  - Support `ResponseFormat.type == "grammar"`, where you specify an EBNF grammar string
  - Add `grammar_init_ms` and `grammar_per_token_ms` to `CompletionUsage.extra` when using grammar
  - Add `time_to_first_token_s` (TTFT), `time_per_output_token_s` (TPOT), and `e2e_latency_s` to `CompletionUsage.extra`
  - Add `ignore_eos` to `Completion` and `ChatCompletion` requests
- #632
  - Fix the VRAM requirement for the Qwen2.5-Coder-1.5B-Instruct model

### TVMjs
- No change; version `0.18.0-dev2`, same as in 0.2.71
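A hedged sketch of the grammar-constrained generation described in the #635 items above, using web-llm's OpenAI-style chat API. The exact name of the field carrying the EBNF string (`grammar` here) and the model id are assumptions for illustration; the `CompletionUsage.extra` fields are the ones named in the changelog.

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Model id assumed for illustration.
const engine = await CreateMLCEngine("Qwen2.5-Coder-1.5B-Instruct-q4f16_1-MLC");

const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Answer strictly yes or no: is 7 prime?" }],
  // ResponseFormat.type == "grammar": constrain decoding with an EBNF grammar.
  response_format: {
    type: "grammar",
    grammar: 'root ::= "yes" | "no"',
  },
  // ignore_eos: true, // also added to requests in this release, per the changelog
});

// Per the changelog, grammar_init_ms / grammar_per_token_ms and the TTFT/TPOT/
// e2e latency figures are reported under CompletionUsage.extra.
console.log(reply.usage?.extra);
```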