Unable to deploy a 'Compute Instance' User Resource to a Workspace AML Service #4151
Comments
Hi @dram1964, can you create an AML in the portal manually?
Hi @tim-allen-ck - logged in as the Global Admin for the tenant, I've created an AML workspace with basic settings (public access) in the UK South region and added a compute, which completed in 5 minutes or so. My efforts via the TRE usually take around 30 minutes before they report a failure. I could try to repeat the exercise using an adjusted version of the terraform code from the AML workspace service if that would be useful. Should I use the same credentials as I have in the TRE code?
Interesting development - Decided to re-deploy the AML Service into a workspace, this time with |
Could potentially be something to do with private endpoints within the vnet?
@dram1964 did you get any further with this? If it is compute size, it doesn't really make sense that it works in one network configuration but not the other. As @tim-allen-ck says, if you can deploy the instance through the AML studio, it would be useful for identifying whether the issue is the templates in this project or a subscription/quota issue.
@marrobi, @tim-allen-ck: I couldn't see any quota issues with private endpoints in the subscription/region (25/65,000). I've destroyed my original TRE deployment and created a new one (without my custom templates) in the same subscription/region, but I'm still having the same issue.
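For anyone repeating the quota check beyond private endpoints, here is a minimal sketch of how compute quota usage could be scanned for families that are close to their limit. It assumes the input is JSON in the shape produced by `az vm list-usage --location <region> -o json` (a list of entries with `currentValue`, `limit` and `localName`); those field names should be verified against your CLI version, and the sample values below are purely illustrative, not taken from this subscription. The helper name `near_limit` and the 80% threshold are hypothetical choices. This does not establish that quota is the cause of the error; it only helps rule it in or out.

```python
import json


def near_limit(usages, threshold=0.8):
    """Return (name, current, limit) for quota entries at or above the threshold.

    `usages` is assumed to be a list of dicts with `currentValue`, `limit`
    and `localName` keys (values may be strings or numbers).
    """
    flagged = []
    for entry in usages:
        current = int(entry["currentValue"])
        limit = int(entry["limit"])
        # Skip entries with no usable limit to avoid division by zero.
        if limit > 0 and current / limit >= threshold:
            flagged.append((entry["localName"], current, limit))
    return flagged


# Illustrative sample only: made-up numbers in the assumed JSON shape.
sample = json.loads("""
[
  {"currentValue": "4",  "limit": "100", "localName": "Total Regional vCPUs"},
  {"currentValue": "10", "limit": "10",  "localName": "Standard DSv3 Family vCPUs"},
  {"currentValue": "0",  "limit": "0",   "localName": "Standard NC Family vCPUs"}
]
""")

for name, current, limit in near_limit(sample):
    print(f"{name}: {current}/{limit}")
```

In practice the output of the CLI command would be saved to a file and loaded with `json.load` in place of the inline sample.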
Have you tried to create the compute instance via the AML studio?
Only on a Workspace that had public access. A private AML workspace looked a bit complicated on the Portal - it seems I need to create a VNet beforehand and then create private endpoints. I can give it a go though.
Just a quick update: I tried re-deploying the TRE in the westeurope region, but the same error occurs when trying to deploy the compute instance. So I'm going to manually deploy a private AML instance and see how that goes. I'm going to use the terraform quickstart template rather than attempt this in the portal.
I've setup a AML workspace with |
Deployment of AML Compute Instance fails
When adding a Compute Instance to a TRE Workspace AML Service, the deployment fails with the following error: "desired number of dedicated nodes could not be allocated". This error has been happening consistently for the past two days. I have not tried it before then with this version of the TRE.
This error occurs when deploying via:
Steps to reproduce
Additional Steps taken
Additional Info
Azure TRE release version: v0.19.1
tre-workspace-base: 1.5.7
tre-service-azureml: 0.8.11
tre-user-resource-aml-compute-instance: 0.5.7
deployment location: UKSouth