Releases: leaf-ai/studio-go-runner

0.14.3-main-aaaagseyvek

29 Sep 00:14
Pre-release
Merge branch 'main' of github.com:leaf-ai/studio-go-runner

0.14.3-main-aaaagsdvzdw

23 Sep 01:15
b3072e3
Pre-release
Upgraded cards

Introduced new guCount value as well

0.14.3-main-aaaagsbfdfr

16 Sep 21:56
Pre-release
Fix the template file so it gets rewound on each pass of the job gene…

0.14.2

02 Jul 19:33
Change history and release

0.14.1

23 Jun 00:36

IMPROVEMENTS:

  • The queue-status tool is now called queue-scaler, reflecting its extended functionality
  • cosign support for image verification on Docker Hub and AWS ECR

FIXES:

  • Provisioning of hosts with the queue-scaler tool could cause overly powerful machines to be allocated
  • The Docker Hub release images for this version have been signed. Please review the instructions in the README.md section "A note concerning security and privacy".

0.14.0

10 Jun 19:09

IMPROVEMENTS:

  • Upgrades to the AWS CLI and Prometheus common libraries
  • Introduce queue-status, a tool for use with Job dispatching deployments using AutoScaling
  • Ubuntu 18.04 migrated to Ubuntu 20.04
  • TensorFlow 1.x support removed, versions now supported are 2.3-2.5
  • Python support bumped to include 3.9, 3.8.10 is the default
  • gRPC and protobuf upgrades
  • Go 1.16.4 support
  • CUDA 11.2 Migration

FIXES:

  • GPU memory usage could incorrectly result in 2 cards being allocated, 1 for memory and 1 for compute

As a reminder, the Go module feature now in use provides module authentication using checksums against a database of modules hosted by Google. Please review the privacy notice for this feature at https://proxy.golang.org/privacy. A vendor directory is provided as a means of avoiding integrity checking by Go module proxies if you wish to run in an air-gapped configuration.

0.13.2

27 Apr 20:22

IMPROVEMENTS:

  • Storage limits are now enforced when downloading artifacts, based on the disk space requested by the StudioML client
  • Idle time limits added; the new options -limit-idle-duration duration and -limit-interval duration accept string values such as 10m for 10 minutes
  • Jobs completed limit option added, -limit-tasks
  • Documented auto scaling, including scaling down to 0, in docs/aws_k8s.md for the EKS use case
  • Go 1.16.3 support
  • A100 support: non-MIG mode only on AWS; mixed and single MIG modes are available for on-premises Kubernetes
  • RabbitMQ Rabbit Hole and many other dependency upgrades
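
The duration-valued limit options above take Go-style duration strings. The following is a minimal sketch of how such options can be parsed with the standard flag package; it is not the runner's actual code. The helper name parseLimits, the zero defaults, and the unsigned integer type chosen for -limit-tasks are assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// parseLimits is a hypothetical helper that parses the three limit
// options named in the release notes. The defaults (0) and the uint
// type used for -limit-tasks are assumptions, not the runner's code.
func parseLimits(args []string) (idle, interval time.Duration, tasks uint, err error) {
	fs := flag.NewFlagSet("runner", flag.ContinueOnError)
	fs.DurationVar(&idle, "limit-idle-duration", 0, "maximum idle time, e.g. 10m")
	fs.DurationVar(&interval, "limit-interval", 0, "interval between limit checks, e.g. 30s")
	fs.UintVar(&tasks, "limit-tasks", 0, "maximum number of completed tasks")
	err = fs.Parse(args)
	return idle, interval, tasks, err
}

func main() {
	idle, interval, tasks, err := parseLimits([]string{
		"-limit-idle-duration", "10m",
		"-limit-interval", "30s",
		"-limit-tasks", "5",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// "10m" is parsed by time.ParseDuration into ten minutes.
	fmt.Println(idle.Minutes(), interval.Seconds(), tasks)
}
```

Values follow the syntax accepted by time.ParseDuration, so 90s, 10m and 1h30m are all valid.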

FIXES:

  • Security changes made for file escape when unpacking artifact archives
  • When using multiple GPUs, CUDA_VISIBLE_DEVICES was being overwritten as new GPU devices were added
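
To illustrate the class of bug in the CUDA_VISIBLE_DEVICES fix, the sketch below appends each newly allocated device to the existing comma-separated value instead of replacing it. The function name addVisibleDevice and the device identifiers are hypothetical; this is not the runner's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// addVisibleDevice returns a CUDA_VISIBLE_DEVICES value with device
// appended to current, leaving current unchanged if the device is
// already listed. Hypothetical helper for illustration only.
func addVisibleDevice(current, device string) string {
	if current == "" {
		return device
	}
	for _, d := range strings.Split(current, ",") {
		if d == device {
			return current
		}
	}
	return current + "," + device
}

func main() {
	visible := ""
	// Each allocation extends the list rather than overwriting it.
	for _, dev := range []string{"0", "1", "1"} {
		visible = addVisibleDevice(visible, dev)
	}
	fmt.Println("CUDA_VISIBLE_DEVICES=" + visible) // prints CUDA_VISIBLE_DEVICES=0,1
}
```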

KNOWN BUGS:

  • AWS A100 (p4d.24xlarge) mixed and single MIG support is waiting on AWS fixes

0.13.1

25 Feb 05:18

IMPROVEMENTS:

  • Go 1.16 support
  • Docker file for the stack introduced to improve build times
  • AWS MMQ support for RabbitMQ; specific instructions can be found in docs/aws_k8s.md

FIXES:

  • The TestTFXCfgGenerator timeout was too small, causing the test to be flaky and time out
  • Prevent releases from overwriting older versions
  • Fix CWE-22 code blocks for symbolic links in tarfiles, https://cwe.mitre.org/data/definitions/22.html
  • CVE impacted package upgrades
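
The CWE-22 item above concerns archive entries, including symbolic links, that resolve outside the extraction directory. The following is a minimal sketch of the usual checks, not the project's actual fix. The helper names safeJoin and checkSymlink are hypothetical, and the sketch checks paths lexically only; it does not resolve symbolic links that already exist on disk.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// safeJoin joins an archive entry name onto destDir and rejects any
// result that would land outside destDir. Hypothetical helper.
func safeJoin(destDir, name string) (string, error) {
	target := filepath.Join(destDir, name) // Join also cleans ".." segments
	rel, err := filepath.Rel(destDir, target)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("entry %q escapes %q", name, destDir)
	}
	return target, nil
}

// checkSymlink rejects a symbolic link entry whose target is absolute
// or, resolved relative to the link's own directory, leaves destDir.
func checkSymlink(destDir, entryName, linkName string) error {
	if filepath.IsAbs(linkName) {
		return fmt.Errorf("link %q has absolute target %q", entryName, linkName)
	}
	_, err := safeJoin(destDir, filepath.Join(filepath.Dir(entryName), linkName))
	return err
}

func main() {
	for _, name := range []string{"data/model.bin", "../../etc/passwd"} {
		if p, err := safeJoin("/tmp/out", name); err != nil {
			fmt.Println("rejected:", err)
		} else {
			fmt.Println("ok:", p)
		}
	}
	if err := checkSymlink("/tmp/out", "data/link", "../../../etc"); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

When unpacking with archive/tar, checks of this kind would typically be applied to each header's Name and, for symbolic link entries, to its Linkname before anything is written to disk.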

0.13.0

10 Feb 22:14

IMPROVEMENTS:

  • Code base pkg components used by multiple projects refactored into a new repository, github.com/leaf-ai/go-service

  • Go 1.15.8 support with modules

  • Remove deprecated Google Cloud storage proprietary API and use S3 mode to interact with the Google Cloud Storage offering

  • S3 credentials are now specified per artifact; environment variables are no longer used unless --allow-env-secrets is specified

0.12.1

14 Jan 21:18

IMPROVEMENTS:

  • CUDA 11.0 migration
  • Go 1.15.6 support with modules
  • AWS Support stack refresh, with AWS Managed Rabbit MQ support