MLPerf inference LLAMA 2 70B #16

Workflow file for this run

.github/workflows/test-mlperf-inference-llama2.yml at dfbf6c2

	# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
	# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

	name: MLPerf inference LLAMA 2 70B

	on:
	schedule:
	- cron: "59 04 * * *"

	jobs:
	build_reference:
	if: github.repository_owner == 'gateoverflow'
	runs-on: [ self-hosted, GO-spr, linux, x64 ]
	strategy:
	fail-fast: false
	matrix:
	python-version: [ "3.12" ]
	backend: [ "pytorch" ]
	device: [ "cpu" ]
	precision: [ "bfloat16" ]

	steps:
	- name: Test MLPerf Inference LLAMA 2 70B reference implementation
	run: \|
	source gh_action/bin/deactivate \|\| python3 -m venv gh_action
	source gh_action/bin/activate
	export CM_REPOS=$HOME/GH_CM
	pip install cm4mlops
	pip install tabulate
	cm pull repo
	pip install "huggingface_hub[cli]"
	git config --global credential.helper store
	huggingface-cli login --token ${{ secrets.HF_TOKEN }} --add-to-git-credential
	cm run script --tags=run-mlperf,inference,_submission,_short --submitter="MLCommons" --model=llama2-70b-99 --implementation=reference --backend=${{ matrix.backend }} --precision=${{ matrix.precision }} --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --docker --quiet --test_query_count=1 --target_qps=0.001 --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --env.CM_MLPERF_MODEL_LLAMA2_70B_DOWNLOAD_TO_HOST=yes --clean
	cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/gateoverflow/mlperf_inference_test_submissions_v5.0 --repo_branch=main --commit_message="Results from self hosted Github actions" --quiet --submission_dir=$HOME/gh_action_submissions

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MLPerf inference LLAMA 2 70B #16

Workflow file

MLPerf inference LLAMA 2 70B #16

Jobs

Run details

Workflow file for this run