A question about inference speed #267
Closed
lingyezhixing
started this conversation in
General
Replies: 2 comments
Please share an email address so we can discuss this.
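Suppose I want to use int4 quantization. In theory, should there be any difference in inference speed between loading an int4 model directly and loading the full-precision model and then quantizing it? In practice, the former feels extremely fast for me, while the latter seems no faster than the full-precision model. Why?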
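To turn the "feels faster" impression into numbers, a small timing harness may help. This is only a minimal sketch: `time_calls`, `generate_with_preloaded_int4`, and `generate_with_runtime_quantized` are hypothetical names that do not come from any library, and the two stand-in functions just sleep. Replace them with real generation calls on the same prompt under each loading path.

```python
import time
from statistics import mean


def time_calls(fn, n_runs=5, warmup=1):
    """Return the mean wall-clock seconds per call of fn()."""
    # Warm-up calls are executed but not timed (first calls often include setup cost).
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return mean(samples)


# Hypothetical stand-ins: swap in real generation calls for each loading path.
def generate_with_preloaded_int4():
    time.sleep(0.01)  # placeholder for "load int4 checkpoint, then generate"


def generate_with_runtime_quantized():
    time.sleep(0.01)  # placeholder for "load full model, quantize, then generate"


if __name__ == "__main__":
    a = time_calls(generate_with_preloaded_int4)
    b = time_calls(generate_with_runtime_quantized)
    print(f"preloaded int4:    {a:.4f} s/call")
    print(f"runtime-quantized: {b:.4f} s/call")
```

If the model runs on a GPU, the timed callable should wait for the device to finish before returning, since kernels are typically launched asynchronously and the measurement would otherwise only cover launch time.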