parallel_demo

Command

Export onnx

pip install sentencepiece transformers==4.30.2

export onnx

../compile/chatglm3-6b is the location of your torch model

cp files/chatglm3-6b/modeling_chatglm.py ../compile/chatglm3-6b

python export_onnx.py --model_path ../compile/chatglm3-6b --device cpu --seq_length 512 --num_threads 8
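
If you want to sanity-check the export before compiling, a small script along the lines of the sketch below lists each exported graph's inputs and outputs. This is only a sketch: the ONNX_DIR value is an assumption, so point it at whatever directory export_onnx.py actually writes its *.onnx files to (requires pip install onnx).

import glob
import onnx

ONNX_DIR = "tmp"  # assumption: change to the real output directory of export_onnx.py

for path in sorted(glob.glob(f"{ONNX_DIR}/**/*.onnx", recursive=True)):
    # load only the graph definition; external weight files are not needed for this check
    model = onnx.load(path, load_external_data=False)
    inputs = [i.name for i in model.graph.input]
    outputs = [o.name for o in model.graph.output]
    print(f"{path}: inputs={inputs} outputs={outputs}")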

Compile bmodel

pushd /path_to/tpu-mlir
source envsetup.sh
popd

compile bmodel

./compile.sh --mode int4 --name chatglm3-6b --num_device 2

You can also download the pre-compiled models directly instead of compiling them yourself:

pip3 install dfss
python3 -m dfss [email protected]:/ext_model_information/LLM/LLM-TPU/chatglm3-6b_int4_2dev_512.bmodel
python3 -m dfss [email protected]:/ext_model_information/LLM/LLM-TPU/chatglm3-6b_int8_2dev_512.bmodel

python demo

First, prepare the environment:

sudo pip3 install pybind11[global] sentencepiece

Then build the library files and run:

mkdir build
cd build && cmake .. && make && cp *cpython* .. && cd ..
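
The make step builds a pybind11 extension (the *cpython* shared object) and copies it up one level so that pipeline.py can import it from its own directory. As a quick check that the build artifact is in place and importable, a sketch like the following can be run from the parallel_demo directory; it discovers the module name from the filename rather than assuming it.

import glob
import importlib

so_files = glob.glob("*cpython*.so")
print("found extension(s):", so_files)

if so_files:
    module_name = so_files[0].split(".")[0]  # e.g. "xxx.cpython-310-....so" -> "xxx"
    module = importlib.import_module(module_name)
    print("imported", module_name, "->", module)

With the extension in place, run the demo: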

python3 pipeline.py --model_path chatglm3-6b_int4_2dev_512.bmodel --tokenizer_path ../support/token_config/ --devid 0,1 --generation_mode greedy
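
If pipeline.py fails while loading the tokenizer, the --tokenizer_path directory can be checked on its own with a short round-trip like the sketch below (the test prompt is arbitrary; trust_remote_code is needed because ChatGLM3 ships a custom tokenizer implementation).

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("../support/token_config/", trust_remote_code=True)

prompt = "Hello, please introduce yourself."  # arbitrary test prompt
ids = tokenizer.encode(prompt)
print("token ids:", ids)
print("decoded  :", tokenizer.decode(ids))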