Releases: kubeflow/arena
v0.8.1
Release 0.8.1
Added
- Support GPU topology scheduling for MPI jobs
Changed
- Support both containerd and Docker in the GPU exporter
- Update client-go to v0.18.5
Fixed
- Fix a bug in submitting Spark jobs
- Fix CVE-2020-8570
Please follow the Get started Guide to install.
v0.8.0
Release 0.8.0
Added
- Support using APIs to manage training or serving jobs from Python (arena-python-sdk)
- Support using APIs to manage training or serving jobs from Java (arena-java-sdk)
- Support submitting a Seldon serving job
- Support generating the kubeconfig file for a specified user
- Support specifying the starting sequence of a TFJob
Changed
- Refactor the documentation and move it to Read the Docs
- Reduce the execution time of arena
- Remove deprecated code
Fixed
- Fix a bug in submitting Spark jobs
- Fix a bug in viewing logs when the chief pod is missing
v0.7.1
- Run et-operator in the arena-system namespace
v0.7.0
- Support using APIs to manage training or serving jobs (arena-go-sdk)
- Support getting GPU metrics from Alibaba Cloud ARMS Prometheus
- Support getting node GPU metrics
- The "arena get" command supports the "-g" option
- Support an arena daemon mode, in which arena watches Kubernetes objects and so reduces pressure on the API server
- The "arena logs" command supports "-c" to specify a container
- Support attaching to a job container and executing commands in it ("arena attach")
- The "arena top node" command supports the "-r" option
v0.6.0
- Add support for elastic training, such as Elastic Horovod
- Support using private images
v0.5.0
- Add support for PyTorch
- Add tarball installation for Linux and Mac
- Support native gang scheduling in MPIJob
v0.4.0
- Add GPU support for PS
- Support Kubernetes 1.18 and above
- Fix a bug in deploying Prometheus
v0.3.3
- Support non-root installation
- Add train init framework
- Fix a bug in using Estimator
v0.3.2
- Fix evaluator & chief validation
- Fix the incorrect CPU resource variable, which should be psCPU
- Set the exit code to 2 when deleting a job fails
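
For scripts that wrap the CLI, the exit-code change above could be consumed roughly as follows. This is a minimal sketch, not part of arena: the helper names are hypothetical, the `arena delete <job>` invocation form is assumed, and treating exit code 0 as success is an assumption based on the usual shell convention. Only the value 2 for a failed delete comes from the note above.

```python
import subprocess

# From the v0.3.2 note: a failed job deletion exits with code 2.
DELETE_FAILED = 2


def classify_delete_exit(returncode: int) -> str:
    """Map an exit code from a delete call to a coarse status label."""
    if returncode == 0:
        # Assumption: 0 means success, as in the usual shell convention.
        return "deleted"
    if returncode == DELETE_FAILED:
        return "delete-failed"
    # Any other code is not described in the release notes.
    return "other-error"


def delete_job(job_name: str, run=subprocess.run) -> str:
    """Run the delete command (assumed form) and classify its exit code.

    `run` is injectable so the helper can be exercised without a cluster.
    """
    completed = run(["arena", "delete", job_name])
    return classify_delete_exit(completed.returncode)
```

Keeping the classification separate from the subprocess call lets a wrapper script be checked without arena installed.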
v0.3.1
- Upgrade Deployment version from extensions/v1beta1 to apps/v1
- Fix the issue of incorrect number of allocated GPUs
- Upgrade Helm to v2.14.1
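
For readers maintaining their own manifests, the Deployment API version upgrade listed above can be illustrated with a small sketch. It is not arena's code, and the function name `migrate_deployment` is hypothetical. It relies on one general Kubernetes fact: apps/v1 Deployments require an explicit spec.selector, whereas extensions/v1beta1 defaulted it from the pod template labels.

```python
import copy


def migrate_deployment(manifest: dict) -> dict:
    """Return a copy of a Deployment manifest converted to apps/v1.

    Manifests that are not extensions/v1beta1 Deployments are returned
    as unchanged copies.
    """
    migrated = copy.deepcopy(manifest)
    if migrated.get("kind") != "Deployment":
        return migrated
    if migrated.get("apiVersion") != "extensions/v1beta1":
        return migrated

    migrated["apiVersion"] = "apps/v1"
    spec = migrated.setdefault("spec", {})

    # apps/v1 requires spec.selector; derive it from the pod template
    # labels, which is what extensions/v1beta1 used as the default.
    if "selector" not in spec:
        labels = spec.get("template", {}).get("metadata", {}).get("labels", {})
        if not labels:
            raise ValueError("cannot derive spec.selector: pod template has no labels")
        spec["selector"] = {"matchLabels": dict(labels)}
    return migrated


old = {
    "apiVersion": "extensions/v1beta1",
    "kind": "Deployment",
    "metadata": {"name": "demo"},
    "spec": {"template": {"metadata": {"labels": {"app": "demo"}}}},
}
new = migrate_deployment(old)
print(new["apiVersion"], new["spec"]["selector"])
```

The input manifest is left untouched, so the old and new forms can be compared before anything is applied to a cluster.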