README.md

Command

Export onnx

pip install sentencepiece transformers==4.30.2

export onnx

../compile/chatglm3-6b is the location of your torch model.

cp files/chatglm3-6b/modeling_chatglm.py ../compile/chatglm3-6b

python export_onnx.py --model_path ../compile/chatglm3-6b --device cpu --seq_length 512 --num_threads 8
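If the export is driven from a script rather than typed by hand, the same command can be assembled in Python. This is a minimal sketch that uses only the flags shown above; `build_export_command` is a hypothetical helper name and is not part of export_onnx.py, and the defaults simply mirror the example command.

```python
import shlex


def build_export_command(model_path, device="cpu", seq_length=512, num_threads=8):
    """Return the export_onnx.py invocation as an argument list.

    Only the flags shown in this README are used. The defaults mirror the
    example command; they are not read from export_onnx.py itself.
    """
    return [
        "python", "export_onnx.py",
        "--model_path", str(model_path),
        "--device", device,
        "--seq_length", str(seq_length),
        "--num_threads", str(num_threads),
    ]


if __name__ == "__main__":
    cmd = build_export_command("../compile/chatglm3-6b")
    # Print a copy-pasteable command line. To actually run the export,
    # pass the list to subprocess.run(cmd, check=True) instead.
    print(shlex.join(cmd))
```

Keeping the command as a list avoids shell quoting problems if the model path contains spaces.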

Compile bmodel

pushd /path_to/tpu-mlir
source envsetup.sh
popd

compile bmodel

./compile.sh --mode int4 --name chatglm3-6b --num_device 2

You can also download a precompiled model directly instead of compiling it yourself.

pip3 install dfss
python3 -m dfss [email protected]:/ext_model_information/LLM/LLM-TPU/chatglm3-6b_int4_2dev_512.bmodel
python3 -m dfss [email protected]:/ext_model_information/LLM/LLM-TPU/chatglm3-6b_int8_2dev_512.bmodel
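The two file names above appear to follow the pattern `<name>_<mode>_<num_device>dev_<seq_length>.bmodel`. The sketch below rebuilds a file name from the compile settings so it can be passed on as a model path. The pattern is inferred from the file names in this README only; whether compile.sh writes its output under exactly this name is an assumption, and `bmodel_name` is a hypothetical helper.

```python
def bmodel_name(name, mode, num_device, seq_length):
    """Build a bmodel file name in the pattern seen in this README.

    Assumption: the pattern is inferred from the downloadable file names,
    not taken from compile.sh.
    """
    return f"{name}_{mode}_{num_device}dev_{seq_length}.bmodel"


if __name__ == "__main__":
    # Settings matching the int4, two-device, 512-token example.
    print(bmodel_name("chatglm3-6b", "int4", 2, 512))
```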

python demo

First, prepare the environment:

sudo pip3 install pybind11[global] sentencepiece

Then build the library and run the demo:

mkdir build
cd build && cmake .. && make && cp *cpython* .. && cd ..

python3 pipeline.py --model_path chatglm3-6b_int4_2dev_512.bmodel --tokenizer_path ../support/token_config/ --devid 0,1 --generation_mode greedy
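Before launching the demo it can help to confirm that the build step actually copied the compiled extension next to pipeline.py, and that the device list is well formed. The sketch below is a hedged pre-flight check: `find_extension` and `parse_devid` are hypothetical helpers, the `*cpython*` pattern is taken from the `cp` command above, and how pipeline.py itself parses `--devid` is not shown in this README, so the parsing here is an assumption.

```python
import glob
import os


def find_extension(directory="."):
    """Return sorted paths of files matching *cpython* in `directory`.

    This mirrors the `cp *cpython* ..` step: an empty list suggests the
    build did not produce or copy the extension module.
    """
    return sorted(glob.glob(os.path.join(directory, "*cpython*")))


def parse_devid(value):
    """Turn a comma-separated device list such as "0,1" into [0, 1].

    Assumption: this is an illustration of the "0,1" format, not the
    parsing code used by pipeline.py.
    """
    return [int(part) for part in value.split(",") if part.strip()]


if __name__ == "__main__":
    modules = find_extension(".")
    if not modules:
        print("No *cpython* module found; build the library first.")
    else:
        print("Found extension module(s):", modules)
    print("Devices:", parse_devid("0,1"))
```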
