-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pathogen-repo-build: support assuming AWS role for runtime permissions #81
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
1fae489
to
59027ff
Compare
Resolves: <#4> Related-to: <nextstrain/.github#81>
Resolves: <#4> Related-to: <nextstrain/.github#81>
Assumes the role `GitHubActionsRoleNextstrainS3Access` (just a placeholder name, final name TDB in nextstrain/infra#4) and scopes repo specific permissions with an inline session policy. The runtime credentials are then saved in an envdir that is passed to the build command via `NEXTSTRAIN_RUNTIME_ENVDIRS`. Namespaced `NEXTSTRAIN_RUNTIME_ENVDIR` with `./git/nextstrain` as suggested by @tsibley in review¹ ¹ <#81 (comment)> pathogen-repo-build:
30fd636
to
b2c38a0
Compare
Static role permissions policies which are expected to be narrowed/scoped-down by an inline session policy set by the pathogen-repo-build workflow. Problems with this approach noted in review commentary: - <nextstrain/.github#81 (comment)> - <nextstrain/.github#81 (comment)> Resolves: <#4> Related-to: <nextstrain/.github#81>
Let's us map GitHub token claims to principal tags, which is great. No need to customize the "sub" claim any more, and granular matching. But ultimately not workable because GitHub OIDC doesn't include the repository name _sans_ owner in its token claims (e.g. "zika" for "nextstrain/zika"), so we don't have any policy variable to interpolate into policy conditions. I thought maybe we could (ab)use role session names for this and pass the unqualified repo name via it ourselves, but that hit a wall because while sts:RoleSessionName is available in the AssumeRoleWithWebIdentity request, it's not available on subsequent requests which we need to be able to use in policy conditions. Then I thought maybe we could use sts:SourceIdentity instead in the role assumption request which is available as aws:SourceIdentity in subsequent requests, except that GitHub's token would have to include an "https://aws.amazonaws.com/source_identity" claim, and it ofc does not provide any way to do that. If it did, it would likely also let us provide a custom "https://aws.amazonaws.com/tags" claim at which point all of this nonsense would be moot anyway. Finally, I thought maybe we could use a custom audience ("aud" claim) which GitHub _does_ let you specify. This is available in AWS requests as the "token.actions.githubusercontent.com:aud" key and we can use it as a policy variable. \o/ But to use it we'd need to cut out Cognito Identity (e.g. go back to where we started without it) and lose the token claims → principal tags mapping. So we'd need to customize the GitHub token's "sub" claim and match against it. We'd also need to explicitly list out all possible valid "aud" on the AWS OIDC IdP configuration (as "client_id_list"), and that would be frustrating. This all feels more complex than it's worth at this point. Resolves: <#4> Related-to: <nextstrain/.github#81>
For posterity, two false starts (failed prototypes/ideas) on the AWS side of this: |
Assumes repo-specific roles, `GitHubActionsRoleNextstrainBuild@zika` for example, which are managed by the Terraform configuration in the nextstrain/infra repo. The runtime credentials are then saved in an envdir that is passed to the build command via `NEXTSTRAIN_RUNTIME_ENVDIRS`. Namespaced `NEXTSTRAIN_RUNTIME_ENVDIR` with `./git/nextstrain` as suggested by @tsibley in review¹ ¹ <#81 (comment)> Related-to: <nextstrain/infra#4> Co-authored-by: Thomas Sibley <[email protected]>
b2c38a0
to
58ba6a0
Compare
Re-pushed with the (not yet working) code to do what I'm planning on supporting over on the AWS side. |
TKTK Resolves: <#4> Related-to: <nextstrain/.github#81>
TKTK Resolves: <#4> Related-to: <nextstrain/.github#81> Supersedes: <#6>, <#7>
TKTK Resolves: <#4> Related-to: <nextstrain/private#96>, <nextstrain/.github#81> Supersedes: <#6>, <#7>
Assumes repo-specific roles, `GitHubActionsRoleNextstrainBuild@zika` for example, which are managed by the Terraform configuration in the nextstrain/infra repo. The runtime credentials are then saved in an envdir that is passed to the build command via `NEXTSTRAIN_RUNTIME_ENVDIRS`. Namespaced `NEXTSTRAIN_RUNTIME_ENVDIR` with `./git/nextstrain` as suggested by @tsibley in review¹ ¹ <#81 (comment)> Related-to: <nextstrain/infra#4> Co-authored-by: Thomas Sibley <[email protected]>
58ba6a0
to
619606b
Compare
nextstrain/infra#8 is where I ended up. |
TKTK Resolves: <#4> Related-to: <nextstrain/private#96>, <nextstrain/.github#81> Supersedes: <#6>, <#7>
TKTK Resolves: <#4> Related-to: <nextstrain/private#96>, <nextstrain/.github#81> Supersedes: <#6>, <#7>
Assumes repo-specific roles, `GitHubActionsRoleNextstrainRepo@zika` for example, which are managed by the Terraform configuration in the nextstrain/infra repo. The repo here is always the _calling_ repository, regardless of if the "repo" input was provided to use workflows from another place. The runtime credentials are then saved in an envdir that is passed to the build command via `NEXTSTRAIN_RUNTIME_ENVDIRS`. Namespaced `NEXTSTRAIN_RUNTIME_ENVDIR` with `./git/nextstrain` as suggested by @tsibley in review¹ ¹ <#81 (comment)> Related-to: <nextstrain/infra#4> Co-authored-by: Thomas Sibley <[email protected]>
619606b
to
83abe0d
Compare
Assumes repo-specific roles, `GitHubActionsRoleNextstrainRepo@zika` for example, which are managed by the Terraform configuration in the nextstrain/infra repo. The repo here is always the _calling_ repository, regardless of if the "repo" input was provided to use workflows from another place. The runtime credentials are then saved in an envdir that is passed to the build command via `NEXTSTRAIN_RUNTIME_ENVDIRS`. Namespaced `NEXTSTRAIN_RUNTIME_ENVDIR` with `./git/nextstrain` as suggested by @tsibley in review¹ ¹ <#81 (comment)> Related-to: <nextstrain/infra#4> Co-authored-by: Thomas Sibley <[email protected]>
83abe0d
to
eec1d61
Compare
The script was copied from the ncov-ingest repo <https://github.com/nextstrain/ncov-ingest/blob/20b432624ad2c4bf24c85756bd04bae1e067bde8/bin/write-envdir> Adding in preparation for configuring separate AWS credentials for the build runtime in pathogen-repo-build.
Renamed to `setup-aws-batch-credentials` in preparation for adding another step that sets up AWS credentials for the runtime. The new step for setting up runtime AWS credentials probably won't need an anchor as it should only run once before the build starts. I just wanted to be specific with the existing anchor name to prevent any potential confusion.
Assumes repo-specific roles, `GitHubActionsRoleNextstrainRepo@zika` for example, which are managed by the Terraform configuration in the nextstrain/infra repo. The repo here is always the _calling_ repository, regardless of if the "repo" input was provided to use workflows from another place. The runtime credentials are then saved in an envdir that is passed to the build command via `NEXTSTRAIN_RUNTIME_ENVDIRS`. Namespaced `NEXTSTRAIN_RUNTIME_ENVDIR` with `./git/nextstrain` as suggested by @tsibley in review¹ ¹ <#81 (comment)> Related-to: <nextstrain/infra#4> Co-authored-by: Thomas Sibley <[email protected]>
eec1d61
to
009b8d8
Compare
I've reworked this a bit to successfully integrate with nextstrain/infra#8. |
A collection of templated repo-specific roles for inside a Nextstrain runtime and one role for the GitHub Actions job itself, i.e. outside the runtime. The inside-the-runtime roles are given pathogen-specific permissions necessary for the ingest and phylogenetic workflows of a pathogen repo. The outside-the-runtime role is only necessary/used at the moment for access to AWS Batch. The roles can only be assumed by specific repos when performed by our centralized pathogen-repo-build workflow. While this doesn't completely prevent off-label use by other GitHub Actions workflows launched from a pathogen repo, it does make it more involved to do so, hopefully to the point of discouragement. The associated GitHub repository configuration is managed by Terraform now as well since the customization of the "sub" claim in GitHub Action's OIDC token is tightly coupled to our AWS role trust policies. Resolves: <#4> Related-to: <nextstrain/private#96>, <nextstrain/.github#81> Supersedes: <#6>, <#7>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pushing this over the finish line @tsibley 🙌
Assumes repo-specific roles, `GitHubActionsRoleNextstrainRepo@zika` for example, which are managed by the Terraform configuration in the nextstrain/infra repo. The repo here is always the _calling_ repository, regardless of if the "repo" input was provided to use workflows from another place. The runtime credentials are then saved in an envdir that is passed to the build command via `NEXTSTRAIN_RUNTIME_ENVDIRS`. Namespaced `NEXTSTRAIN_RUNTIME_ENVDIR` with `./git/nextstrain` as suggested by @tsibley in review¹ ¹ <#81 (comment)> Related-to: <nextstrain/infra#4> Co-authored-by: Thomas Sibley <[email protected]>
…g credentials Very helpful for troubleshooting when looking at logs. Note that when using the AWS Batch runtime the credential identity will only be emitted for the initial job submission, not subsequent wait-N jobs. This seems fine.
…ials steps So they are more easily differentiated/recognized in the job logs. I found myself easily mixing them up when trying to find the right one.
cf5c523
to
3a2d597
Compare
I think I'll plan to merge this in the morning... so I can deal with the fallout. |
Merged. I believe the new work will really only start being used once we remove existing static credentials from repos. |
Test run of zika's ingest-to-phylogenetic.yaml failed during the
The out-of-order stdout/stderr merge to logs makes it a little confusing, but the access keys were successfully written and it failed on I'll push a fix. |
Assumes repo-specific roles,
GitHubActionsRoleNextstrainRepo@zika
for example, which are managed by the Terraform configuration in the nextstrain/infra repo. The repo here is always the calling repository, regardless of if the "repo" input was provided to use workflows from another place.The runtime credentials are then saved in an envdir that is passed to the build command via
NEXTSTRAIN_RUNTIME_ENVDIRS
.Namespaced
NEXTSTRAIN_RUNTIME_ENVDIR
with./git/nextstrain
as suggested by @tsibley in review¹¹ #81 (comment)
Related-to: nextstrain/infra#4
Co-authored-by: Thomas Sibley [email protected]
Checklist