Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CreateQueue disabled by default #294

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 2 additions & 4 deletions charts/clearml-agent/Chart.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ apiVersion: v2
name: clearml-agent
description: MLOps platform Task running agent
type: application
version: "5.2.0"
version: "5.2.1"
appVersion: "1.24"
kubeVersion: ">= 1.21.0-0 < 1.31.0-0"
home: https://clear.ml
Expand All @@ -21,6 +21,4 @@ keywords:
annotations:
artifacthub.io/changes: |
- kind: fixed
description: Support for 1.30 kubernetes
- kind: added
description: Support for multiple queues
description: Create queue must be disabled by default to preserve backward compatibility
6 changes: 3 additions & 3 deletions charts/clearml-agent/README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# ClearML Kubernetes Agent

![Version: 5.2.0](https://img.shields.io/badge/Version-5.2.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)
![Version: 5.2.1](https://img.shields.io/badge/Version-5.2.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.24](https://img.shields.io/badge/AppVersion-1.24-informational?style=flat-square)

MLOps platform Task running agent

Expand Down Expand Up @@ -53,7 +53,7 @@ Kubernetes: `>= 1.21.0-0 < 1.31.0-0`

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| agentk8sglue | object | `{"additionalClusterRoleBindings":[],"additionalRoleBindings":[],"affinity":{},"annotations":{},"apiServerUrlReference":"https://api.clear.ml","basePodTemplate":{"affinity":{},"annotations":{},"containerSecurityContext":{},"env":[],"fileMounts":[],"hostAliases":[],"initContainers":[],"labels":{},"nodeSelector":{},"podSecurityContext":{},"priorityClassName":"","resources":{},"schedulerName":"","tolerations":[],"volumeMounts":[],"volumes":[]},"clearmlcheckCertificate":true,"containerSecurityContext":{},"createQueueIfNotExists":true,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileMounts":[],"fileServerUrlReference":"https://files.clear.ml","image":{"registry":"","repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-21"},"initContainers":{"resources":{}},"labels":{},"nodeSelector":{},"podSecurityContext":{},"queue":"default","replicaCount":1,"resources":{},"serviceExistingAccountName":"","tolerations":[],"volumeMounts":[],"volumes":[],"webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue | object | `{"additionalClusterRoleBindings":[],"additionalRoleBindings":[],"affinity":{},"annotations":{},"apiServerUrlReference":"https://api.clear.ml","basePodTemplate":{"affinity":{},"annotations":{},"containerSecurityContext":{},"env":[],"fileMounts":[],"hostAliases":[],"initContainers":[],"labels":{},"nodeSelector":{},"podSecurityContext":{},"priorityClassName":"","resources":{},"schedulerName":"","tolerations":[],"volumeMounts":[],"volumes":[]},"clearmlcheckCertificate":true,"containerSecurityContext":{},"createQueueIfNotExists":false,"defaultContainerImage":"ubuntu:18.04","extraEnvs":[],"fileMounts":[],"fileServerUrlReference":"https://files.clear.ml","image":{"registry":"","repository":"allegroai/clearml-agent-k8s-base","tag":"1.24-21"},"initContainers":{"resources":{}},"labels":{},"nodeSelector":{},"podSecurityContext":{},"queue":"default","replicaCount":1,"resources":{},"serviceExistingAccountName":"","tolerations":[],"volumeMounts":[],"volumes":[],"webServerUrlReference":"https://app.clear.ml"}` | This agent will spawn queued experiments in new pods, a good use case is to combine this with GPU autoscaling nodes. https://github.com/allegroai/clearml-agent/tree/master/docker/k8s-glue |
| agentk8sglue.additionalClusterRoleBindings | list | `[]` | additional existing ClusterRoleBindings |
| agentk8sglue.additionalRoleBindings | list | `[]` | additional existing RoleBindings |
| agentk8sglue.affinity | object | `{}` | affinity setup for Agent pod (example in values.yaml comments) |
Expand All @@ -78,7 +78,7 @@ Kubernetes: `>= 1.21.0-0 < 1.31.0-0`
| agentk8sglue.basePodTemplate.volumes | list | `[]` | volumes definition for pods spawned to consume ClearML Task (example in values.yaml comments) |
| agentk8sglue.clearmlcheckCertificate | bool | `true` | Check certificates validity for evefry UrlReference below. |
| agentk8sglue.containerSecurityContext | object | `{}` | container securityContext setup for Agent pod (example in values.yaml comments) |
| agentk8sglue.createQueueIfNotExists | bool | `true` | if ClearML queue does not exist, it will be create it if the value is set to true |
| agentk8sglue.createQueueIfNotExists | bool | `false` | if ClearML queue does not exist, it will be create it if the value is set to true |
| agentk8sglue.defaultContainerImage | string | `"ubuntu:18.04"` | default container image for ClearML Task pod |
| agentk8sglue.extraEnvs | list | `[]` | Extra Environment variables for Glue Agent |
| agentk8sglue.fileMounts | list | `[]` | file definition for Glue Agent (example in values.yaml comments) |
Expand Down
6 changes: 4 additions & 2 deletions charts/clearml-agent/templates/agentk8sglue-deployment.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -99,11 +99,13 @@ spec:
--ports-mode --num-of-services {{ .Values.sessions.maxServices }} \
--base-port {{ .Values.sessions.startingPort }} \
--gateway-address {{ .Values.sessions.externalIP }} \
{{- if .Values.agentk8sglue.createQueueIfNotExists }} --create-queue{{- end }}"
{{- if .Values.agentk8sglue.createQueueIfNotExists }} --create-queue{{- end }}
"
{{- else}}
- name: K8S_GLUE_EXTRA_ARGS
value: "--namespace {{ .Release.Namespace }} --template-yaml /root/template/template.yaml \
{{- if .Values.agentk8sglue.createQueueIfNotExists }} --create-queue{{- end }}"
{{- if .Values.agentk8sglue.createQueueIfNotExists }} --create-queue{{- end }}
"
{{- end }}
{{ if or (.Values.clearml.clearmlConfig) (.Values.clearml.existingClearmlConfigSecret) }}
- name: CLEARML_CONFIG_FILE
Expand Down
2 changes: 1 addition & 1 deletion charts/clearml-agent/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ agentk8sglue:
# -- ClearML queue this agent will consume. Multiple queues can be specified with the following format: queue1,queue2,queue3
queue: default
# -- if ClearML queue does not exist, it will be create it if the value is set to true
createQueueIfNotExists: true
createQueueIfNotExists: false
# -- labels setup for Agent pod (example in values.yaml comments)
labels: {}
# schedulerName: scheduler
Expand Down
Loading