This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

[Model Enabling] Support ChatGLM3 #182

Merged
merged 15 commits into main on Mar 21, 2024

Conversation

@Zhenzhong1 (Contributor) commented on Mar 20, 2024

Type of Change

New Feature.

Description

Support ChatGLM3 in Neural Speed.

Expected Behavior & Potential Risk

N/A

How has this PR been tested?

numactl -m 0 -C 0-55 python scripts/run.py /home/zhenzhong/model/chatglm3-6b/ -p "你好" --model_type=chatglm3
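For reference, the following is a rough Python-API equivalent of the command above, following the usage pattern shown in the Neural Speed README. The dtype arguments and generation length are illustrative, and exact argument names may vary across versions, so treat it as a sketch rather than the exact test that was run.

```python
# Sketch only: mirrors the run.py smoke test through the Neural Speed Python API
# (README-style usage); argument names and defaults may differ across versions.
from transformers import AutoTokenizer, TextStreamer
from neural_speed import Model

model_path = "/home/zhenzhong/model/chatglm3-6b/"  # local ChatGLM3 checkpoint
prompt = "你好"

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer(prompt, return_tensors="pt").input_ids
streamer = TextStreamer(tokenizer)

model = Model()
# Converts and quantizes the checkpoint, then loads it for inference; the model
# type is normally inferred from the checkpoint config.
model.init(model_path, weight_dtype="int4", compute_dtype="int8")
outputs = model.generate(inputs, streamer=streamer, max_new_tokens=128)
```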

Dependency Change?

N/A

@Zhenzhong1 marked this pull request as ready for review on March 21, 2024 at 03:22
@a32543254 (Contributor) left a comment

LGTM

@a32543254 (Contributor)

Could you also add an extension test and post the benchmark data here?

@a32543254 (Contributor)

> Could you also add an extension test and post the benchmark data here?

No need; it seems ChatGLM3 shares the same structure as ChatGLM2, so we can treat them as one.

@Zhenzhong1 (Contributor, Author) commented on Mar 21, 2024

> Could you also add an extension test and post the benchmark data here?
>
> No need; it seems ChatGLM3 shares the same structure as ChatGLM2, so we can treat them as one.

Yes, that's true. I have added the extension test anyway; it also lets us check the convert / quantize / inference pipeline for ChatGLM3.

https://github.com/intel-innersource/frameworks.ai.lpot.lpot-validation/pull/623/files
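The linked extension test lives in an internal repository, so it is not reproduced here. As an illustrative sketch only, a minimal smoke test along these lines could drive the same convert / quantize / inference pipeline by wrapping the run.py command from the test section above; the test name, model path, and timeout are placeholders.

```python
# Hypothetical smoke test (name, path, and timeout are placeholders); it wraps
# the documented run.py command, which performs convert, quantize, and inference.
import subprocess
import sys

def test_chatglm3_pipeline():
    cmd = [
        sys.executable, "scripts/run.py",
        "/home/zhenzhong/model/chatglm3-6b/",
        "-p", "你好",
        "--model_type=chatglm3",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=1800)
    # The pipeline is considered healthy if the end-to-end run exits cleanly.
    assert result.returncode == 0, result.stderr

if __name__ == "__main__":
    test_chatglm3_pipeline()
```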

@VincyZhang merged commit 94e74d7 into main on Mar 21, 2024
10 of 11 checks passed
4 participants