Describe the issue
We are trying to create a Dask cluster secured by TLS with KubeCluster (both classic and operator) in our K8s cluster, with limited success. A self-signed certificate and key are generated and stored in a Kubernetes secret as .pem files.
The secret is mounted into the client, scheduler, and worker pods.
Classic scenario
The following Python commands are executed (classic) in a pod based on dask:latest-py3.8, where the secret is mounted under /certs and the TLS-related environment variables are also set. The workers are defined in worker-spec.yaml.
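For reference, this is not the exact setup from the report, only a minimal sketch of what "environment variables related to tls" could look like. It relies on Dask's convention of mapping `DASK_`-prefixed variables with double underscores onto nested config keys such as `distributed.comm.tls.ca-file`. The file names `cert.pem` and `key.pem` and the helper name `tls_env` are assumptions; with a self-signed certificate the same file is used here as the CA file.

```python
import posixpath


def tls_env(cert_dir="/certs", cert_name="cert.pem", key_name="key.pem"):
    """Build Dask TLS environment variables for client, scheduler and worker.

    The file names are hypothetical; adjust them to the keys in the secret.
    """
    cert = posixpath.join(cert_dir, cert_name)
    key = posixpath.join(cert_dir, key_name)
    env = {
        # Refuse any unencrypted (tcp://) communication.
        "DASK_DISTRIBUTED__COMM__REQUIRE_ENCRYPTION": "true",
        # Self-signed setup: the certificate doubles as the CA file.
        "DASK_DISTRIBUTED__COMM__TLS__CA_FILE": cert,
    }
    for role in ("SCHEDULER", "WORKER", "CLIENT"):
        env[f"DASK_DISTRIBUTED__COMM__TLS__{role}__CERT"] = cert
        env[f"DASK_DISTRIBUTED__COMM__TLS__{role}__KEY"] = key
    return env


if __name__ == "__main__":
    for name, value in tls_env().items():
        print(f"{name}={value}")
```

The same dictionary can be turned into the `env` list of a pod spec, so that all three roles see identical settings.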
Current result
The dask_kubernetes.classic.Scheduler is created and appears to listen with TLS on port 8786, but dask_kubernetes.classic.KubeCluster throws the following exception:
RuntimeError: encryption required by Dask configuration, refusing communication from/to 'tcp://dask-root-36eed16d-9.cswopt-proto:8786'
This seems to be caused by a mismatch between connection_args and address (the address in the exception uses the tcp scheme). The scheduler itself can be reached with dask.distributed.Client when the address is given with the tls scheme.
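A minimal sketch of that direct connection, assuming the certificate and key live at `/certs/cert.pem` and `/certs/key.pem` (hypothetical file names). The helper `with_tls_scheme` is also hypothetical and only swaps the scheme; `Security` and `Client` are the regular `distributed` classes.

```python
import posixpath


def with_tls_scheme(address):
    """Return the address with a tls:// scheme, replacing tcp:// if present."""
    scheme, sep, rest = address.partition("://")
    if not sep:
        # No scheme given at all, e.g. "host:8786".
        return f"tls://{address}"
    return f"tls://{rest}"


def connect(address, cert_dir="/certs"):
    """Connect a Client to the scheduler over TLS (file names are assumptions)."""
    # Imported lazily so the helper above works without distributed installed.
    from distributed import Client
    from distributed.security import Security

    cert = posixpath.join(cert_dir, "cert.pem")
    key = posixpath.join(cert_dir, "key.pem")
    security = Security(
        tls_ca_file=cert,  # self-signed: the certificate is its own CA
        tls_client_cert=cert,
        tls_client_key=key,
        require_encryption=True,
    )
    return Client(with_tls_scheme(address), security=security)
```

For example, `connect("tcp://dask-root-36eed16d-9.cswopt-proto:8786")` would dial `tls://dask-root-36eed16d-9.cswopt-proto:8786` instead of the tcp address that appears in the exception.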
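NOTE: However, if the KubeCluster is started in local deploy mode, everything works: no exception is thrown, the cluster can be scaled, etc.
Operator scenario
With the operator, a Python script is executed that creates the cluster from daskcluster-spec.yaml.
Current result
The scheduler is created and listens with TLS on port 8786. Workers are created too, but they cannot connect to the scheduler.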
RuntimeError: encryption required by Dask configuration, refusing communication from/to 'tcp://10.240.116.72:0
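One thing that may be worth ruling out is that the scheduler and worker pod specs carry different TLS settings. The sketch below is not a confirmed fix; it only shows how the same secret mount and environment variables could be applied to both pod specs of a DaskCluster resource represented as a plain dictionary. The secret name `dask-tls`, the volume name `tls-certs`, and both helper names are assumptions.

```python
import copy


def add_tls_to_pod_spec(pod_spec, env, secret_name="dask-tls", mount_path="/certs"):
    """Return a copy of a pod spec with the TLS secret mounted and env vars set."""
    spec = copy.deepcopy(pod_spec)
    spec.setdefault("volumes", []).append(
        {"name": "tls-certs", "secret": {"secretName": secret_name}}
    )
    for container in spec.get("containers", []):
        container.setdefault("volumeMounts", []).append(
            {"name": "tls-certs", "mountPath": mount_path, "readOnly": True}
        )
        container.setdefault("env", []).extend(
            {"name": name, "value": value} for name, value in env.items()
        )
    return spec


def secure_cluster_spec(cluster_spec, env):
    """Apply identical TLS settings to the scheduler and worker pod specs."""
    secured = copy.deepcopy(cluster_spec)
    for role in ("scheduler", "worker"):
        secured["spec"][role]["spec"] = add_tls_to_pod_spec(
            secured["spec"][role]["spec"], env
        )
    return secured
```

The `env` argument is a plain mapping such as `{"DASK_DISTRIBUTED__COMM__REQUIRE_ENCRYPTION": "true"}`; the input dictionary is left unmodified.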
Additionally, the scheduler reports periodic TLS handshake issues with the client.
Listener on 'tls://0.0.0.0:8786': TLS handshake failed with remote 'tls://10.240.80.34:55512': TLS/SSL connection has been closed (EOF)
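To narrow down where such handshake failures come from, the scheduler port can be probed from another pod using only the standard library. This is a rough diagnostic sketch: the function name `probe_tls` is hypothetical, and hostname checking is disabled on the assumption that the self-signed certificate does not list pod IPs or service names.

```python
import socket
import ssl


def probe_tls(host, port, ca_file=None, cert_file=None, key_file=None, timeout=5.0):
    """Attempt a TLS handshake and return (ok, detail)."""
    ctx = ssl.create_default_context(cafile=ca_file)
    # Assumption: the self-signed certificate has no matching hostname entry.
    ctx.check_hostname = False
    if cert_file:
        # Present a client certificate if one is provided.
        ctx.load_cert_chain(cert_file, key_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock) as tls:
                return True, tls.version()
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        return False, f"{type(exc).__name__}: {exc}"
```

Calling `probe_tls("10.240.116.72", 8786, "/certs/cert.pem", "/certs/cert.pem", "/certs/key.pem")` (paths are assumptions) from the worker or client pod shows whether the handshake succeeds with the mounted certificates.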
Expected result
No exceptions are thrown and workers as well as clients are communicating properly with TLS.
Any suggestions to get it working? I am aware that the classic KubeCluster is no longer supported and will be phased out, so no fix is expected there, but how about the operator-based KubeCluster?
Anything else we need to know?
The same happens with the latest dask and dask-kubernetes (2024.1.0).
Environment
Dask versions: 2023.5.0, 2024.1.0
Dask-kubernetes versions: 2023.3.2, 2024.1.0
Python versions: 3.8, 3.10
Operating System: Linux
Install method (conda, pip, source): pip