ci: convert med E2E CI job to L4 GPU #305

Merged 1 commit on Oct 25, 2024
6 changes: 3 additions & 3 deletions .github/mergify.yml
@@ -28,18 +28,18 @@ pull_request_rules:
 # e2e medium workflow
 - or:
 - and:
-# note this should match the triggering criteria in 'e2e-nvidia-a10g-x1.yml'
+# note this should match the triggering criteria in 'e2e-nvidia-l4-x1.yml'
 - check-success~=e2e-medium-workflow-complete
 - or:
 - files~=\.py$
 - files=pyproject.toml
 - files~=^requirements.*\.txt$
-- files=.github/workflows/e2e-nvidia-a10g-x1.yml
+- files=.github/workflows/e2e-nvidia-l4-x1.yml
 - and:
 - -files~=\.py$
 - -files=pyproject.toml
 - -files~=^requirements.*\.txt$
-- -files=.github/workflows/e2e-nvidia-a10g-x1.yml
+- -files=.github/workflows/e2e-nvidia-l4-x1.yml

 # code lint workflow
 - or:
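
Reviewer note: the positive branch of the file conditions in the `.github/mergify.yml` hunk can be sanity-checked locally. The following is a minimal sketch, assuming that Mergify's `files~=` behaves like a regex search over each changed path and that `files=` is an exact match. The function name `e2e_medium_required` is hypothetical and is not part of Mergify or this repository.

```python
import re

# Patterns copied from the mergify.yml hunk in this PR.
# Assumption: `files~=` is a regex search, `files=` is an exact match.
REGEX_CONDITIONS = [r"\.py$", r"^requirements.*\.txt$"]
EXACT_CONDITIONS = [
    "pyproject.toml",
    ".github/workflows/e2e-nvidia-l4-x1.yml",
]


def e2e_medium_required(changed_files):
    """Return True if any changed path matches one of the file conditions."""
    for path in changed_files:
        if path in EXACT_CONDITIONS:
            return True
        if any(re.search(pattern, path) for pattern in REGEX_CONDITIONS):
            return True
    return False
```

Under these assumptions, a change to `src/foo.py` or to the renamed workflow file would require the `e2e-medium-workflow-complete` check, while a README-only change would fall into the second `and` branch, where none of the file conditions match.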
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: Apache-2.0

-name: E2E (NVIDIA A10G x1)
+name: E2E (NVIDIA L4 x1)

 on:
 # run against every merge commit to 'main' and release branches
@@ -18,7 +18,7 @@ on:
 - '**.py'
 - 'pyproject.toml'
 - 'requirements**.txt'
-- '.github/workflows/e2e-nvidia-a10g-x1.yml' # This workflow
+- '.github/workflows/e2e-nvidia-l4-x1.yml' # This workflow

 concurrency:
 group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
@@ -55,7 +55,7 @@ jobs:
 mode: start
 github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
 ec2-image-id: ${{ vars.AWS_EC2_AMI }}
-ec2-instance-type: g5.4xlarge
+ec2-instance-type: g6.8xlarge
 subnet-id: subnet-02d230cffd9385bd4
 security-group-id: sg-06300447c4a5fbef3
 iam-role-name: instructlab-ci-runner
@@ -117,19 +117,19 @@ jobs:
 nvidia-smi
 python3.11 -m pip cache remove llama_cpp_python

-CMAKE_ARGS="-DLLAMA_CUDA=on" python3.11 -m pip install .
+CMAKE_ARGS="-DLLAMA_CUDA=on" python3.11 -m pip install -v .

 # https://github.com/instructlab/instructlab/issues/1821
 # install with Torch and build dependencies installed
-python3.11 -m pip install packaging wheel setuptools-scm
-python3.11 -m pip install .[cuda] -r requirements-vllm-cuda.txt
+python3.11 -m pip install -v packaging wheel setuptools-scm
+python3.11 -m pip install -v .[cuda] -r requirements-vllm-cuda.txt

 - name: Update instructlab-training library
 working-directory: ./training
 run: |
 . ../instructlab/venv/bin/activate
-pip install .
-pip install .[cuda]
+pip install -v .
+pip install -v .[cuda]

 - name: Check disk
 run: |
2 changes: 1 addition & 1 deletion README.md
@@ -5,7 +5,7 @@
 ![Release](https://img.shields.io/github/v/release/instructlab/training)
 ![License](https://img.shields.io/github/license/instructlab/training)

-![`e2e-nvidia-a10g-x1.yml` on `main`](https://github.com/instructlab/training/actions/workflows/e2e-nvidia-a10g-x1.yml/badge.svg?branch=main)
+![`e2e-nvidia-l4-x1.yml` on `main`](https://github.com/instructlab/training/actions/workflows/e2e-nvidia-l4-x1.yml/badge.svg?branch=main)
 ![`e2e-nvidia-l40s-x4.yml` on `main`](https://github.com/instructlab/training/actions/workflows/e2e-nvidia-l40s-x4.yml/badge.svg?branch=main)

 - [Installing](#installing-the-library)