chore: add ci convert model to onnx #21

Closed
wants to merge 63 commits into from
63 commits
f3e8540
chore: add ci convert model to onnx
dungpham91 Jul 31, 2024
0e47754
chore: bump genai to a7ca019 (#22)
vansangpfiev Aug 6, 2024
e900b21
Merge branch 'chore/convert-onnx' of github.com:janhq/cortex.onnx int…
nguyenhoangthuan99 Aug 9, 2024
15b9029
convert model using latest change on upstream instead of release pip …
nguyenhoangthuan99 Aug 9, 2024
26438dc
Merge branch 'chore/convert-onnx' of github.com:janhq/cortex.onnx int…
nguyenhoangthuan99 Aug 9, 2024
c79fcf4
test onpush
nguyenhoangthuan99 Aug 9, 2024
834b99b
test onpush
nguyenhoangthuan99 Aug 9, 2024
ea1f1d7
test onpush
nguyenhoangthuan99 Aug 9, 2024
96e1beb
test onpush
nguyenhoangthuan99 Aug 9, 2024
ce7c242
test onpush
nguyenhoangthuan99 Aug 9, 2024
fc2a73c
test onpush
nguyenhoangthuan99 Aug 9, 2024
0de3779
test onpush
nguyenhoangthuan99 Aug 9, 2024
63a8c70
fix test on push
nguyenhoangthuan99 Aug 9, 2024
0987699
fix test on push
nguyenhoangthuan99 Aug 9, 2024
03d5e54
fix test on push
nguyenhoangthuan99 Aug 9, 2024
b64ae16
fix test on push
nguyenhoangthuan99 Aug 9, 2024
1015a5c
fix test on push
nguyenhoangthuan99 Aug 9, 2024
2991acb
fix test on push
nguyenhoangthuan99 Aug 9, 2024
b58ef48
fix test on push
nguyenhoangthuan99 Aug 9, 2024
1edda70
fix test on push
nguyenhoangthuan99 Aug 9, 2024
920e4d9
fix test on push
nguyenhoangthuan99 Aug 9, 2024
4c885e2
fix test on push
nguyenhoangthuan99 Aug 9, 2024
737d044
fix test on push
nguyenhoangthuan99 Aug 9, 2024
4de0d33
fix test on push
nguyenhoangthuan99 Aug 9, 2024
4272d84
fix test on push
nguyenhoangthuan99 Aug 9, 2024
f161d0f
fix test on push
nguyenhoangthuan99 Aug 9, 2024
85ecee8
fix test on push
nguyenhoangthuan99 Aug 9, 2024
f390224
fix test on push
nguyenhoangthuan99 Aug 9, 2024
84dbed2
fix test on push
nguyenhoangthuan99 Aug 9, 2024
f7f9062
fix test on push
nguyenhoangthuan99 Aug 9, 2024
71eff8b
fix test on push
nguyenhoangthuan99 Aug 9, 2024
bcce433
fix test on push
nguyenhoangthuan99 Aug 9, 2024
51701f2
fix test on push
nguyenhoangthuan99 Aug 9, 2024
4de74c2
fix test on push
nguyenhoangthuan99 Aug 9, 2024
4f9ac62
fix test on push
nguyenhoangthuan99 Aug 9, 2024
e844247
fix test on push
nguyenhoangthuan99 Aug 9, 2024
97a2bcf
fix test on push
nguyenhoangthuan99 Aug 9, 2024
5c28e4f
fix test on push
nguyenhoangthuan99 Aug 9, 2024
b842721
fix test on push
nguyenhoangthuan99 Aug 9, 2024
591d652
fix test on push
nguyenhoangthuan99 Aug 9, 2024
b607bc8
fix test on push
nguyenhoangthuan99 Aug 9, 2024
c19108f
fix test on push
nguyenhoangthuan99 Aug 9, 2024
a28f303
fix test on push
nguyenhoangthuan99 Aug 9, 2024
3137605
fix test on push
nguyenhoangthuan99 Aug 9, 2024
3d3e033
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
8c3392b
test: CI convert gemma2
nguyenhoangthuan99 Aug 11, 2024
42ffbeb
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
0343da5
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
d37afb9
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
20ab726
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
6ed80bb
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
edc23d9
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
cd793e7
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
a56183a
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
6daa67e
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
ee59c1a
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
337ea9d
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
ecbbee5
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
4d7551b
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
2afd462
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
0c1c5cc
test: CI convert llama3.1
nguyenhoangthuan99 Aug 11, 2024
d4c76e4
finalize request
nguyenhoangthuan99 Aug 11, 2024
32db7a0
remove unused env variable
nguyenhoangthuan99 Aug 12, 2024
127 changes: 127 additions & 0 deletions .github/workflows/convert-model.yml
@@ -0,0 +1,127 @@
name: Convert model to ONNX

on:
  # push:
  #   branches:
  #     - 'chore/convert-onnx'
  workflow_dispatch:
    inputs:
      source_model_id:
        description: "Source HuggingFace model ID to pull. For ex: meta-llama/Meta-Llama-3.1-8B-Instruct"
        required: true
      source_model_size:
        description: "The model size. For ex: 8b"
        required: true
        type: string
      target_model_id:
        description: "Target HuggingFace model ID to push. For ex: llama3.1"
        required: true
        type: string

# concurrency:
#   group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
#   cancel-in-progress: true

env:
  USER_NAME: cortexso
  SOURCE_MODEL_ID: ${{ inputs.source_model_id }}
  SOURCE_MODEL_SIZE: ${{ inputs.source_model_size }}
  TARGET_MODEL_ID: ${{ inputs.target_model_id }}
  PRECISION: int4 # Valid values: int4,fp16,fp32
  EXECUTOR: dml # Valid values: cpu,cuda,dml,web
  ONNXRUNTIME_GENAI_VERSION: 0.3.0 # Check version from: https://github.com/microsoft/onnxruntime-genai/releases

jobs:
  converter:
    runs-on: windows-onnx
    steps:
      - name: Checkout
        uses: actions/checkout@v4 # v4.1.7
        with:
          submodules: recursive

      - name: Set up Python
        uses: actions/setup-python@v5 # v5.1.1
        with:
          python-version: '3.10'
          # architecture: 'x64'

      - name: Cache Python packages
        uses: actions/cache@0c45773b623bea8c8e75f6c82b208c3cf94ea4f9 # v4.0.2
        with:
          path: |
            ~/.cache/pip
            ~/.local/share/pip
            .venv
          key: ${{ runner.os }}-pip-${{ github.sha }}
          restore-keys: |
            ${{ runner.os }}-pip-

      - name: Install dependencies
        shell: powershell
        run: |
          pip3 install -I --user huggingface_hub hf-transfer numpy==1.26.4 torch==2.3.1 transformers==4.43.4 onnx==1.16.1 onnxruntime==1.18.0 sentencepiece==0.2.0

      - name: Extract MODEL_NAME
        shell: powershell
        run: |
          $SOURCE_MODEL_ID = "${{ env.SOURCE_MODEL_ID }}"
          $ADDR = $SOURCE_MODEL_ID -split '/'
          $MODEL_NAME = $ADDR[-1]
          $MODEL_NAME_LOWER = $MODEL_NAME.ToLower()
          echo "MODEL_NAME=$MODEL_NAME_LOWER" >> $env:GITHUB_ENV
          echo "MODEL_NAME_LOWER=$MODEL_NAME_LOWER" # For debugging

      - name: Print environment variables
        run: |
          echo "SOURCE_MODEL_ID: ${{ env.SOURCE_MODEL_ID }}"
          echo "PRECISION: ${{ env.PRECISION }}"
          echo "EXECUTOR: ${{ env.EXECUTOR }}"
          echo "MODEL_NAME: ${{ env.MODEL_NAME }}"

      - name: Check file existence
        id: check_files
        uses: andstor/file-existence-action@v1
        with:
          files: "C:\\models\\${{ env.MODEL_NAME }}/hf"

      - name: Prepare folders
        if: steps.check_files.outputs.files_exists == 'false'
        run: |
          mkdir -p C:\\models\\${{ env.MODEL_NAME }}/hf
          mkdir -p C:\\models\\${{ env.MODEL_NAME }}/onnx
          mkdir -p C:\\models\\${{ env.MODEL_NAME }}/cache

      - name: Download Hugging Face model
        id: download_hf
        if: steps.check_files.outputs.files_exists == 'false'
        run: |
          huggingface-cli login --token ${{ secrets.HUGGINGFACE_TOKEN_READ }} --add-to-git-credential
          huggingface-cli download --repo-type model --local-dir C:\\models\\${{ env.MODEL_NAME }}/hf ${{ env.SOURCE_MODEL_ID }}
          huggingface-cli logout

      # - name: Remove Failure Download
      #   if: steps.download_hf.outcome == 'failure'
      #   run: |
      #     Remove-Item -Recurse -Force -Path "$C:\\models\\{{ env.MODEL_NAME }}"

      - name: Convert to ONNX - DirectML - INT4
        shell: powershell
        run: |
          mkdir -p ${{ env.MODEL_NAME }}/onnx
          huggingface-cli login --token ${{ secrets.HUGGINGFACE_TOKEN_READ }} --add-to-git-credential
          python3 "onnxruntime-genai/src/python/py/models/builder.py" -i "c:/models/${{ env.MODEL_NAME }}/hf" -o "c:/models/${{ env.MODEL_NAME }}/onnx" -p ${{ env.PRECISION }} -e ${{ env.EXECUTOR }} -c "c:/models/${{ env.MODEL_NAME }}/cache"
          huggingface-cli logout

      - name: Upload to Hugging Face
        run: |
          Get-ChildItem -Path "C:\\models\\${{ env.MODEL_NAME }}/onnx" -Force
          huggingface-cli login --token ${{ secrets.HUGGINGFACE_TOKEN_WRITE }} --add-to-git-credential
          huggingface-cli upload "${{ env.USER_NAME }}/${{ env.TARGET_MODEL_ID }}" "c:/models/${{ env.MODEL_NAME }}/onnx" . --revision "${{ env.SOURCE_MODEL_SIZE }}-onnx"
          huggingface-cli logout

      - name: Cleanup
        if: always()
        run: |
          Remove-Item -Recurse -Force -Path "${{ env.MODEL_NAME }}"
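
The workflow runs only on manual dispatch. As a rough sketch of how it could be triggered once merged, assuming the GitHub CLI (gh) is installed and authenticated for this repository, and using illustrative input values taken from the descriptions above:

    # Illustrative values; requires the workflow file to be visible to gh (i.e. on the default branch)
    gh workflow run convert-model.yml `
      -f source_model_id="meta-llama/Meta-Llama-3.1-8B-Instruct" `
      -f source_model_size="8b" `
      -f target_model_id="llama3.1"

With those inputs, the upload step pushes the converted files to cortexso/llama3.1 under the 8b-onnx revision, which could later be fetched with something like huggingface-cli download cortexso/llama3.1 --revision 8b-onnx.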
2 changes: 1 addition & 1 deletion Makefile
@@ -20,7 +20,7 @@ endif

build-onnxruntime:
ifeq ($(OS),Windows_NT) # Windows
-	@powershell -Command "cmake -S .\onnxruntime-genai\ -B .\onnxruntime-genai\build -DUSE_DML=ON -DUSE_CUDA=OFF -DENABLE_PYTHON=OFF -DORT_HOME=\".\build_deps\ort\";"
+	@powershell -Command "cmake -S .\onnxruntime-genai\ -B .\onnxruntime-genai\build -DUSE_DML=ON -DUSE_CUDA=OFF -DUSE_ROCM=OFF -DENABLE_PYTHON=OFF -DORT_HOME=\".\build_deps\ort\";"
	@powershell -Command "cmake --build .\onnxruntime-genai\build --config Release -j4;"
else # Unix-like systems (Linux and MacOS)
	@echo "Skipping install dependencies"
2 changes: 1 addition & 1 deletion build_cortex_onnx.bat
@@ -1,5 +1,5 @@
cmake -S ./third-party -B ./build_deps/third-party
cmake --build ./build_deps/third-party --config Release -j4

-cmake -S .\onnxruntime-genai\ -B .\onnxruntime-genai\build -DUSE_DML=ON -DUSE_CUDA=OFF -DORT_HOME="./build_deps/ort" -DENABLE_PYTHON=OFF -DENABLE_TESTS=OFF -DENABLE_MODEL_BENCHMARK=OFF
+cmake -S .\onnxruntime-genai\ -B .\onnxruntime-genai\build -DUSE_DML=ON -DUSE_CUDA=OFF -DUSE_ROCM=OFF -DORT_HOME="./build_deps/ort" -DENABLE_PYTHON=OFF -DENABLE_TESTS=OFF -DENABLE_MODEL_BENCHMARK=OFF
cmake --build .\onnxruntime-genai\build --config Release -j4
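
Both the Makefile and build_cortex_onnx.bat now pass -DUSE_ROCM=OFF alongside -DUSE_CUDA=OFF, so only the DirectML backend is configured. A quick sanity check after a local configure, assuming the build directory created by build_cortex_onnx.bat exists, is to search the CMake cache from PowerShell:

    # Assumes a prior configure via build_cortex_onnx.bat; flags passed with -D are recorded in the cache
    Select-String -Path .\onnxruntime-genai\build\CMakeCache.txt -Pattern "USE_DML|USE_CUDA|USE_ROCM"

Each flag should appear with the value given on the command line (DML on, CUDA and ROCm off).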
2 changes: 1 addition & 1 deletion onnxruntime-genai
Submodule onnxruntime-genai updated 169 files