
chore: update Python runtime and AWS CDK #38

Merged: 38 commits into main on Oct 30, 2024

Conversation

@ceholden (Collaborator) commented Oct 17, 2024:

What I am changing

This PR addresses issue #37 by updating the Python runtime and the AWS CDK.

I'm also hoping that some of the version bumps will cut down on the 82 Dependabot security warnings.

How I did it

Bumping Python versions

I initially bumped to the latest Python runtime (3.12), but the CDK has a bug with that version, so we're stuck on 3.11 for now. This still required updating a few package dependencies. Most of the Pipfiles hard-pin dependency versions even though the versions are already pinned in the lockfile, so I kept to that convention; it would be easy to change if we want to.

  • Updated psycopg2 to >=2.9.9 to support Python 3.12.
  • Removed the now-deprecated third-party psycopg2-layer from the AWS Lambda layers. The db layer installs the same psycopg2-binary as the psycopg2-layer, so we should be able to get that dependency from our own db layer.
  • Updated boto3 to >=1.29.0, which adds support for Python 3.12 and the Python 3.12 Lambda runtime.
  • One of the packages didn't pin moto and I hit a breaking change in v5, so I went ahead and updated all of the packages that use Moto to the latest v5 (see the sketch after this list).
    • The breaking change from v4 to v5 migrated all of the mock_[service] decorator functions to a single mock_aws decorator.
    • There was one other hangup: one of the link_fetcher tests failed because the mocked SQS queues apparently share state across tests. I resolved this by naming each queue after the test it belongs to.
    • See the Moto CHANGELOG for v5.
  • Switched from black/isort/flake8 to ruff, a single, very fast linter and formatter.
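
As a reference for what the Moto v5 change looks like in a test, here is a minimal sketch of the decorator migration; the S3 example, bucket name, and test function are purely illustrative and not from this repo:

```python
import boto3
from moto import mock_aws  # Moto v5: single decorator replaces mock_s3, mock_sqs, etc.


@mock_aws  # in Moto v4 this would have been a per-service decorator such as @mock_s3
def test_create_bucket():
    # Every boto3 call inside the decorated test hits Moto's in-memory backend.
    client = boto3.client("s3", region_name="us-east-1")
    client.create_bucket(Bucket="example-bucket")
    names = [bucket["Name"] for bucket in client.list_buckets()["Buckets"]]
    assert "example-bucket" in names
```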

Bumping AWS CDK

See the migration guide for reference:
https://docs.aws.amazon.com/cdk/v2/guide/migrating-v2.html

The changes and related cleanup include:

  • AWS consolidated all the non-beta functionality into a single package (aws-cdk-lib), simplifying our management of CDK dependencies.
  • Updated imports from the previously separate aws_cdk.core (its contents now live in the aws_cdk module); see the sketch after this list.
  • Removed CDK feature flags that were dropped in v2. In v2 these behave as if they were set to "true" in v1:
    • core:enableStackNameDuplicates
    • aws-secretsmanager:parseOwnedSecretName
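
As a reference for the import change, here is a minimal sketch of a CDK v2 app in Python; the stack name is hypothetical and not one of this repo's actual stacks:

```python
# CDK v1 (before): `from aws_cdk import core` and subclass `core.Stack`.
# CDK v2 (after): everything non-experimental ships in the single aws-cdk-lib package.
from aws_cdk import App, Stack
from constructs import Construct


class ExampleStack(Stack):  # hypothetical stack, for illustration only
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)


app = App()
ExampleStack(app, "example-stack")
app.synth()
```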

How you can test it

Run the very nice test suite!

Running from the repo root:

```bash
# make sure the latest dependencies are installed
$ make install
# run unit tests with dummy Postgres variables
$ PG_DB=foo PG_PASSWORD=foo PG_USER=foo make unit-tests
```

We can also check that the CDK update works without deploying by looking at the CDK diff:

```bash
npx cdk diff
```

@ciaransweet (Contributor) commented:
@ceholden gimme a ping when this is ready for review!

Always happy this little project is still alive and kicking 😍

@ceholden (Collaborator, Author) replied:

> @ceholden gimme a ping when this is ready for review!
>
> Always happy this little project is still alive and kicking 😍

Thanks @ciaransweet! Appreciate the good work you did especially with docs, tests, and dev tooling. This sort of maintenance work feels a lot safer with all of that in place 💯 🚀

I'm hoping to wrap this up ahead of adding the code I've written to feed the granule downloader from ESA's notification system. From preliminary testing, moving to an event-driven "link fetcher" should reduce download latency to roughly 3 minutes from new data publication and also let us catch old granules that ESA has reprocessed. Not sure if you can access it, but I have a writeup of the experiment findings and the plan for next steps in this ticket: https://github.com/NASA-IMPACT/hls_development/issues/300

I think I'll be fighting with some of the package version bumps some more, but would love your help reviewing when it's in a good place 😸

Comment on lines -4 to -8:

```json
"@aws-cdk/core:enableStackNameDuplicates": "true",
"aws-cdk:enableDiffNoFail": "true",
"@aws-cdk/core:stackRelativeExports": "true",
"@aws-cdk/aws-ecr-assets:dockerIgnoreSupport": true,
"@aws-cdk/aws-secretsmanager:parseOwnedSecretName": true,
```
@ceholden (Collaborator, Author):
These are removed in CDK v2 and act as if they were all true

Comment on lines +132 to +133:

```python
def mock_sqs_queue(request, sqs_resource, monkeysession, sqs_client):
    queue = sqs_resource.create_queue(QueueName=f"mock-queue-{request.node.name}"[:80])
```
@ceholden (Collaborator, Author):
In Moto v5 these queues apparently share state, so tests were failing because the queues created in one test were affecting other tests. This change resolves the issue by giving each queue a name unique to its test.
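
For reference, here is a minimal sketch of the fixture pattern described above; the teardown and the exact fixture arguments are assumptions, since only two lines are shown in the diff:

```python
import pytest


@pytest.fixture
def mock_sqs_queue(request, sqs_resource):
    # Moto v5 shares SQS state across tests, so derive a unique queue name from
    # the test's node name; SQS queue names are limited to 80 characters.
    queue = sqs_resource.create_queue(QueueName=f"mock-queue-{request.node.name}"[:80])
    yield queue
    queue.delete()  # assumed cleanup so one test's queue never leaks into another
```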

Comment on lines +4 to +7:

```python
aws_cdk_version = "2.162.1"
inst_reqs = [
    *[f"aws_cdk.{x}=={aws_cdk_version}" for x in aws_cdk_reqs],
    f"aws-cdk-lib=={aws_cdk_version}",
    f"aws-cdk.aws-lambda-python-alpha=={aws_cdk_version}a0",
```
@ceholden (Collaborator, Author):
CDK v2 has a single main package but splits anything "alpha" or "beta" off into separate packages. We're using PythonFunction from aws-cdk.aws-lambda-python-alpha, so we need this additional package.
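
For reference, a minimal sketch of pulling PythonFunction from the alpha package in CDK v2; the stack, construct ID, and handler path are hypothetical, not taken from this repo:

```python
from aws_cdk import App, Stack, aws_lambda as lambda_
from aws_cdk.aws_lambda_python_alpha import PythonFunction  # separate "alpha" package in v2
from constructs import Construct


class ExampleLambdaStack(Stack):  # illustrative stack, for the sake of the example
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # PythonFunction bundles the handler directory (and its requirements) into the asset.
        PythonFunction(
            self,
            "DownloaderFunction",        # hypothetical construct ID
            entry="lambdas/downloader",  # hypothetical path to the handler package
            index="handler.py",
            handler="handler",
            runtime=lambda_.Runtime.PYTHON_3_11,
        )


app = App()
ExampleLambdaStack(app, "example-lambda-stack")
app.synth()
```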

@ceholden changed the title from "[DRAFT] chore: update Python runtime and AWS CDK" to "chore: update Python runtime and AWS CDK" on Oct 21, 2024
README.md:

```diff
@@ -21,9 +21,9 @@ This project aims to provide a serverless implementation of the current [HLS S2

 To develop on this project, you should install:

-* NVM [Node Version Manager](https://github.com/nvm-sh/nvm) / Node 12
+* NVM [Node Version Manager](https://github.com/nvm-sh/nvm) / Node 18
```
A contributor commented:
Weeeew 6 node versions!

@ciaransweet (Contributor) left a comment:

Awesome job here @ceholden, just a minor comment but looks gucci to me otherwise!

@ceholden (Collaborator, Author) commented:
Thanks to @chuckwondo's help, we're now past the hurdle of setting up the CDK v2 bootstrap in us-west-2 for the staging account 🎉. The deploy user we're using for dev required some additional IAM permission grants before we could deploy the integration test stack, and I think I've just gotten past that hurdle.

It looks like we also need to add a permissions boundary to the stacks, because our deploy user's permission to do things (e.g., create roles) depends on the permissions boundary being set on those deployed resources. Chuck's most recent PR has a nice example of what we need to do here.

@chuckwondo (Collaborator) left a comment:

Looks good to me! Fingers crossed on getting integration tests to deploy and run.

@ceholden (Collaborator, Author) commented:
@chuckwondo thanks for your review and help on Friday! I did get the integration tests to run and succeed, but had a lot of trouble with misconfiguration in the staging account permissions boundaries. I'll ask around on Slack about next steps to fix this issue, but no code change should be required on this end

@chuckwondo (Collaborator) replied:
> @chuckwondo thanks for your review and help on Friday! I did get the integration tests to run and succeed, but had a lot of trouble with misconfiguration in the staging account permissions boundaries. I'll ask around on Slack about next steps to fix this issue, but no code change should be required on this end

Can you elaborate on the misconfiguration? I'm not following, especially since the integration tests succeeded.

@ceholden (Collaborator, Author) commented:
@chuckwondo I'll follow up on Slack with more info, but FYI I re-ran the integration tests to reproduce the issue after updating the GitHub environment. I'm going to edit the "dev" GitHub environment to remove the permissions boundary and re-run to make sure the integration tests still pass.

@ceholden merged commit 97096cf into main on Oct 30, 2024
3 checks passed
@ceholden deleted the ceh/issue37 branch on October 30, 2024 at 20:26