Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] - Project isolation #5204

Conversation

robert-ulbrich-mercedes-benz
Copy link
Contributor

Tracking issue

#5189

Why are the changes needed?

It proposes a way how to isolate single tenants in Flyte from each other.

What changes were proposed in this pull request?

Suggesting an RFC.

How was this patch tested?

No tests applied.

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Copy link

welcome bot commented Apr 9, 2024

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels Apr 9, 2024
@robert-ulbrich-mercedes-benz robert-ulbrich-mercedes-benz changed the title RFC - Project isolation [RFC] - Project isolation Apr 11, 2024
@davidmirror-ops davidmirror-ops added the rfc A label for RFC issues label Apr 11, 2024

Since the API endpoints are very different from one another, the check needs to be implemented in each endpoint. Because of the different formats of the endpoint holding the project it will be hard to define a middleware centrally, which performs the check.

The mapping between claim value and project to authorize can be configured in the following way and put in the Helm chart of the Flyte deployment:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does this work for dynamically registered projects via the CLI? is there a catch-all or default mapping or do deployments now need to be updated when a new project is registered?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For newly registered projects per RFC we will currently need to adapt the configuration in order to map the project to a certain claim value. We can also introduce a default mapping - if you think it is a good idea, I will adapt the RFC

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of a default in that it reduces the friction for creating a new project, but then I guess it breaks from your proposed paradigm of explicitly white-listing access to projects.

My only concern is that since all it currently takes to create a project is a simple api call that any user can make, the change as it's currently proposed introduces more friction for self-service project registration (which is maybe undesirable in a context where you're already managing per-project access!)

I'm curious, in practice what has the overhead been like for your organization when it comes to provisioning new projects? Is that an infrequent occurrence?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Flyte instance in our organization is centrally managed. So it is not desired to have users creating projects on their own. In our organization project creation is put in the hands of the team managing the Flyte instance. So the Flyte projects we have are well defined and there are not too many projects being added or removed.

In companies I assume the Flyte instances will usually be managed by a dedicated team.

return nil, status.Errorf(codes.InvalidArgument, "Incorrect request, nil requests not allowed")
}

if ProjectAccessPermitted(ctx, request.GetId().Project) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an opt-in mechanism, right? how do you ensure that new endpoints also get the same treatment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is an opt-in mechanism. We have meanwhile implemented it in a private fork of the project. We moved away from changing every single endpoint, but added a central middleware. But since there are no common fields for all of the request objects, actually each and every endpoint and its request object need to be explicitly mentioned, which I consider a weakness because if newly added endpoints are forgotten to be added in that middleware, then those endpoints will not benefit from the project isolation. Unfortunately we could not find a better solution to this issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you like to, I can also show you how we actually implemented it in the fork.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robert-ulbrich-mercedes-benz that all makes sense, I can see where maybe the middleware rejects unhandled endpoints by default but then that still requires manual code changes per new endpoint (and for newer or infrequent endpoints, it may not be obvious that the authorization check was missed)

How do you keep up with new service definition changes in your fork? Or is the process manual currently?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We manually went over the GRPC service definitions that are available. We found out, that only Admin Services are of concern. But I share your concerns regarding new endpoints that might be missed in that central middleware.

@davidmirror-ops
Copy link
Contributor

davidmirror-ops commented May 23, 2024

Contributors meetup notes: Eduardo to have a sync with Robert to better review this.

@davidmirror-ops
Copy link
Contributor

06/06/2024 Contributors meeting notes: the core challenge is how to add an opt-in middleware to flyteadmin. We couldn't find a great solution to solve this, and just relying on developers being careful with listing potential new endpoints is not enough, it's too risky, with the risks outweighing the potential benefits. We tried to use reflection on flyteadmin but it failed completely so, if someone more versed in Go wants to try it, we'd be happy to support them. While we're glad it solves the author's use case on their fork, there are multiple technical objections that lead us to not accept this proposal as an upstream change to Flyte. cc @robert-ulbrich-mercedes-benz Thanks and please let us know your thoughts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request rfc A label for RFC issues size:L This PR changes 100-499 lines, ignoring generated files.
Projects
Status: Not Accepted
Development

Successfully merging this pull request may close these issues.

3 participants