Signed-off-by: intellinjun <[email protected]>
Here is the gemma7b and gemma2b accuracy. The inference result of gemma7b is bad when the quantization config is weight_dtype int4, compute dtype int8, group_size 128, but gemma2b's result looks good.
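For readers unfamiliar with the config named above, here is a minimal sketch of what group-wise int4 weight quantization with group_size 128 means. It is an illustration only, not the code in this PR or in the library: the function names are hypothetical, it uses plain symmetric rounding with one scale per group, and it does not model the int8 compute dtype. It makes no claim about why gemma7b degrades.

```python
import random


def quantize_group(values, bits=4):
    """Symmetric quantization of one group to signed `bits`-bit integers.

    Hypothetical helper for illustration; returns (int codes, scale).
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for int4
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Clamp to the signed range [-qmax - 1, qmax], i.e. [-8, 7] for int4.
    codes = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def quantize_groupwise(weights, group_size=128, bits=4):
    """Split a flat weight list into groups, each with its own scale."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]


def dequantize_groupwise(groups):
    """Rebuild approximate float weights from (codes, scale) pairs."""
    restored = []
    for codes, scale in groups:
        restored.extend(code * scale for code in codes)
    return restored


def max_abs_error(original, restored):
    """Largest absolute round-trip error over all weights."""
    return max(abs(a - b) for a, b in zip(original, restored))


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(256)]
    groups = quantize_groupwise(weights, group_size=128, bits=4)
    restored = dequantize_groupwise(groups)
    print("groups:", len(groups))
    print("max abs error:", max_abs_error(weights, restored))
```

In this scheme the round-trip error of each weight is at most half of its group's scale, so one large value in a group coarsens the grid for the other 127 weights in that group.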
https://huggingface.co/google/gemma-7b-it/discussions/38
LGTM
https://inteltf-jenk.sh.intel.com/job/neural_speed_extension/103/artifact/report.html The gemma-2b performance in this report looks very poor.
New gemma2 performance, @luoyu-intel, thanks.
Type of Change
feature or bug fix or documentation or others
API changed or not
Description
detail description
Issues: xxx
Expected Behavior & Potential Risk
the expected behavior that triggered by this PR
How has this PR been tested?
how to reproduce the test (including hardware information)
Dependency Change?
any library dependency introduced or removed