Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate dask cluster name to contain only allowed characters #826

Open
dbalabka opened this issue Oct 3, 2023 · 2 comments
Open

Validate dask cluster name to contain only allowed characters #826

dbalabka opened this issue Oct 3, 2023 · 2 comments

Comments

@dbalabka
Copy link
Contributor

dbalabka commented Oct 3, 2023

Having a "." in the cluster name leads to a hung cluster in the "Created" state:

❯ kubectl get daskclusters
NAME                                     WORKERS   STATUS    AGE
dmitry.balabka-cluster              1         Created   17h

While it is described in official k8s docs about Object Names, it would be great to have a validation in the dask library that raises at least a warning.
https://kubernetes.io/docs/concepts/overview/working-with-objects/names/

Code example:

from dask_kubernetes.operator import KubeCluster
 
# Create a Dask cluster by calling Kubernetes API via kubectl command based on provided specificatoin
cluster = KubeCluster(
    name="dmitry.balakba-cluster",
    namespace="dask-operator",
)
 
client = cluster.get_client()
 
# Renders cluster configuration in the cell output
client
@dbalabka dbalabka changed the title Validate dask cluster name to contian only allowed characters Validate dask cluster name to contain only allowed characters Oct 3, 2023
@dbalabka
Copy link
Contributor Author

dbalabka commented Oct 3, 2023

It is possible that the Linux user name contains .. Basically, it might happen even w/o specifying a custom name because generated cluster name does not sanitize the name as well:

name = name.format(
user=getpass.getuser(), uuid=str(uuid.uuid4())[:10], **os.environ
)

Not sure should the validation happen in dask-kubernetes or kr8s/kopf

@jacobtomlinson
Copy link
Member

It's interesting that the resource gets created, but I'm guessing if you look at the controller logs there are errors there when it tried to create the child components. Ideally the Kubernetes API would reject this name and give us an error that we can propagate back to the user. Maybe this is something we can configure in the CRD?

Otherwise yeah we will need to add some client-side checking. Although maybe we should do that in kr8s so that more folks can benefit from this.

jo-migo pushed a commit to jo-migo/dask-kubernetes that referenced this issue Feb 27, 2024
This commit introduces cluster name validation in order to avoid the
invalid state in which a `DaskCluster` resource with a too-long or
RFC-1123-noncompliant name is created but cannot be
deleted while the operator retries infinitely to create a scheduler service (see dask#826 for more details on this bug).

Issues fixed:
dask#870
dask#826
jacobtomlinson pushed a commit that referenced this issue Mar 1, 2024
* fix(validation): Validate Dask Cluster Names

This commit introduces cluster name validation in order to avoid the
invalid state in which a `DaskCluster` resource with a too-long or
RFC-1123-noncompliant name is created but cannot be
deleted while the operator retries infinitely to create a scheduler service (see #826 for more details on this bug).

Issues fixed:
#870
#826

* Actually, stop removing the dask cluster automatically. It can be manually deleted.

* Move the cluster name validation into a common module, add it to KubeCluster init, and add tests

---------

Co-authored-by: Johanna Goergen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants