Failure to migrate to Porter v1 #3737

Closed
JaimieWi opened this issue Oct 10, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@JaimieWi
Contributor

I can see an error in the resource processor that indicates a failure in the Porter migration process.

We haven't upgraded our TRE in a while and have only just discovered this error. We have managed to upgrade the TRE in every other respect to v0.14.1, and our DEV pipeline is all green.

Describe the bug
In our DEV TRE, new resources are created successfully, but updates to existing resources fail. For example, the upgrade to the newer version of the firewall fails. We're unable to see any operations in the UI, and in Cosmos DB the error indicates that it is unable to find the existing resource.

Screenshot 2023-10-10 at 15 23 34
"resourceVersion": 22,
"status": "updating_failed",
"action": "upgrade",
"message": "db4cced8-7840-4a24-a381-bcd7971d7044: Error message: could not find installation /db4cced8-7840-4a24-a381-bcd7971d7044: Installation not found could not find installation /db4cced8-7840-4a24-a381-bcd7971d7044: Installation not found ; Command executed: az cloud set --name AzureCloud && az login --identity -u 99b4123b-5955-4ecf-b6dc-1198e4b83b41 && az acr login --name acrdevci && porter upgrade \"db4cced8-7840-4a24-a381-bcd7971d7044\" --reference acrdevci.azurecr.io/tre-shared-service-firewall:v1.1.5 

When checking the resource processor, I can see that this is because the run.sh script failed at the porter migrate step:
porter storage migrate --old-home "${PORTER_HOME_V0}" --old-account "azurestorage"

Further investigation shows an error stating:
error reading the schema from the old PORTER_HOME: no file found for schema: File does not exist
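
A minimal pre-flight sketch for the migrate step above, assuming the old home path is passed in the same way run.sh passes `${PORTER_HOME_V0}`. The helper name `old_home_ready` is hypothetical and not part of the TRE or Porter. It only rules out the simplest causes (path unset, missing, or empty); it does not check the schema record itself, whose location is not shown in this issue.

```shell
# Hypothetical helper: report whether an old Porter home directory looks usable.
# Returns 0 if the path is set, exists, and contains at least one entry.
old_home_ready() {
  old_home="$1"
  if [ -z "${old_home}" ]; then
    echo "old Porter home is not set" >&2
    return 1
  fi
  if [ ! -d "${old_home}" ]; then
    echo "old Porter home does not exist: ${old_home}" >&2
    return 1
  fi
  # An empty directory cannot hold any v0 configuration to migrate.
  if [ -z "$(ls -A "${old_home}")" ]; then
    echo "old Porter home is empty: ${old_home}" >&2
    return 1
  fi
  return 0
}

# Example usage (left commented out so the sketch has no side effects):
# if old_home_ready "${PORTER_HOME_V0}"; then
#   porter storage migrate --old-home "${PORTER_HOME_V0}" --old-account "azurestorage"
# fi
```

If the helper fails, the message on stderr may help narrow down why the migrate command cannot read from the old home.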

Screenshot 2023-10-10 at 15 09 37

Steps to reproduce

  1. Use Azure Pipelines
  2. Have v0.7.0 of the TRE
  3. Upgrade to v0.14.1, editing the .devcontainer files appropriately
  4. Try to update the firewall or any other resource and it fails.
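
The failure in step 4 could be narrowed down by asking Porter v1 directly whether it knows about the installation. This is only a sketch: it assumes the Porter v1 CLI is on the PATH inside the resource processor, that the installation lives in the namespace Porter uses by default, and that `porter installations show` exits non-zero when the installation is missing. The helper name `installation_exists` is hypothetical.

```shell
# Hypothetical helper: succeeds only if Porter v1 can find the named installation.
# Assumes `porter installations show` exits non-zero when nothing is found.
installation_exists() {
  porter installations show "$1" >/dev/null 2>&1
}

# Example usage with the installation ID from the error message
# (left commented out so the sketch has no side effects):
# if ! installation_exists "db4cced8-7840-4a24-a381-bcd7971d7044"; then
#   echo "installation not found in Porter v1 storage; the migration may not have run" >&2
# fi
```

A missing installation here would be consistent with the v0 data never having been migrated into the v1 store.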

Porter section of our dockerfile:
Screenshot 2023-10-10 at 14 53 04

Notes:

  • Porter v1 is successfully set up in MongoDB, files can be found.
  • The error is exactly the same as the one mentioned in this issue: Resource Processor does not start on clean install #3498
    • But the workaround mentioned there doesn't help, as it seems to skip the migration.
  • We haven't run our prod pipeline yet

Please could you advise on what our next steps should be?

@JaimieWi JaimieWi added the bug Something isn't working label Oct 10, 2023
@marrobi
Member

marrobi commented Oct 10, 2023

Hi @JaimieWi, v0.7.0 is going back in time a bit; I didn't think anyone would be left on Porter v0.

Have you gone straight from v0.7.0 to v0.14.1? Have you looked at all the breaking changes listed for the intermediate releases at https://github.com/microsoft/AzureTRE/releases and dealt with them as required?

Afraid you might be in uncharted territory.

@JaimieWi
Contributor Author

Hi @marrobi, Thank you for the quick reply!

I know, sorry it's such a big leap!

I did go through each release, one at a time, in our dev pipeline and managed to bypass the other errors that occurred. This was all done within the temp TRE in our pipeline. It seems everything else is successful.

I then merged into our DEVCI TRE and needed to update the firewall, which is where I noticed this issue.

  • Have many people had to do the porter migration?
  • Just wondering if there are any other troubleshooting steps I can run through?

@marrobi
Member

marrobi commented Oct 10, 2023

At the time, I believe there were only a small number of production deployments, so maybe a handful. Most were in PoC/MVP and hence did fresh installs. We, as a dev team, did a number of migrations on our own deployments.

It might be worth raising an issue with the Porter-specific error here: https://github.com/getporter/porter/issues

@JaimieWi
Contributor Author

@marrobi Thank you, I will try raising the issue over there as well!

@marrobi
Member

marrobi commented Oct 16, 2023

@JaimieWi have you had any luck?

@JaimieWi
Contributor Author

JaimieWi commented Oct 17, 2023

@marrobi We discussed it internally and have made the decision to redeploy the TRE based on the newest version (and to keep on top of upgrades!)

Thank you for your help

@JaimieWi
Contributor Author

Closing, issue abandoned
