AWS Neuron EKS Samples

This repository contains samples for Amazon Elastic Kubernetes Service (EKS) and AWS Neuron, the software development kit (SDK) that enables machine learning (ML) inference and training workloads on the AWS ML accelerator chips Inferentia and Trainium.

The samples demonstrate patterns for running inference and distributed training on EKS with Inferentia and Trainium. They can be used as-is or adapted to support additional models and use cases.

Samples are organized by use case below:

Training

Link	Description	Instance Type
BERT pretraining	End-to-end workflow for creating an EKS cluster with 2 trn1.32xlarge nodes and running BERT phase 1 pretraining (64-worker DataParallel)	Trn1
MLP training	Introductory workflow for creating an EKS cluster with 1 node and running a simple MLP training job	Trn1
Llama 3.1 8B finetuning with Ray+PTL	End-to-end workflow for creating a Ray cluster with 2 trn1.32xlarge nodes on EKS and running Llama 3.1 8B finetuning	Trn1
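
The 64-worker figure in the BERT pretraining row is consistent with running one data-parallel worker per NeuronCore on each node. The short sketch below shows that arithmetic. It assumes each trn1.32xlarge node exposes 32 NeuronCores (16 Trainium chips with 2 cores each); the function and constant names are hypothetical and do not come from the samples themselves.

```python
# Assumption: a trn1.32xlarge node exposes 32 NeuronCores
# (16 Trainium chips x 2 NeuronCores per chip).
NEURON_CORES_PER_TRN1_32XLARGE = 32


def data_parallel_workers(
    num_nodes: int,
    cores_per_node: int = NEURON_CORES_PER_TRN1_32XLARGE,
) -> int:
    """Return the worker count when one worker runs per NeuronCore."""
    if num_nodes < 1 or cores_per_node < 1:
        raise ValueError("num_nodes and cores_per_node must be positive")
    return num_nodes * cores_per_node


if __name__ == "__main__":
    # Two trn1.32xlarge nodes, as in the BERT pretraining sample.
    print(data_parallel_workers(2))
```

The same calculation can help size other jobs: a single-node run under the same assumption would use 32 workers.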

Inference

Link	Description	Instance Type
SD inference	Workflow for creating an SD inference endpoint that is exposed through an Application Load Balancer (ALB) and runs on nodes provisioned by a Karpenter NodePool	Inf2

Getting Help

If you encounter issues with any of the samples in this repository, please open an issue via the GitHub Issues feature.

Contributing

Please refer to the CONTRIBUTING document for details on contributing additional samples to this repository.

Release Notes

Please refer to the Change Log.
