[#6054] feat(core): add more GCS permission to support fileset operations #6141
base: main
Conversation
@jerryshao @yuqi1129, PTAL
String.format(
    "resource.name.startsWith('projects/_/buckets/%s/objects/%s')",
    bucketName, resourcePath));
getAllResources(resourcePath)
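As a rough sketch of how such per-prefix conditions could be combined into a single CEL expression for a downscoped credential (the class and method names here are hypothetical, not the PR's actual code):

```java
import java.util.List;
import java.util.stream.Collectors;

public class ReadConditionSketch {
  // Hypothetical helper: build one startsWith condition per resource prefix
  // and OR them together, so a downscoped credential can read the fileset
  // path and every parent prefix the Hadoop GCS connector probes.
  static String buildReadCondition(String bucketName, List<String> resources) {
    return resources.stream()
        .map(
            r ->
                String.format(
                    "resource.name.startsWith('projects/_/buckets/%s/objects/%s')",
                    bucketName, r))
        .collect(Collectors.joining(" || "));
  }
}
```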
Why do we only need to handle readExpressions? What about writeExpressions?
The Hadoop GCS connector checks subpath info to determine whether a path exists, so there is no need to add write permission.
@VisibleForTesting
// "a/b/c" will get ["a", "a/", "a/b", "a/b/", "a/b/c"]
static List<String> getAllResources(String resourcePath) {
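A minimal sketch of what getAllResources could look like, assuming it returns every ancestor prefix both with and without a trailing slash, plus the path itself; the PR's actual implementation may differ:

```java
import java.util.ArrayList;
import java.util.List;

public class AllResourcesSketch {
  // For "a/b/c", returns ["a", "a/", "a/b", "a/b/", "a/b/c"]: each ancestor
  // directory without and with a trailing slash, then the full path.
  static List<String> getAllResources(String resourcePath) {
    List<String> resources = new ArrayList<>();
    String[] parts = resourcePath.split("/");
    StringBuilder current = new StringBuilder();
    for (int i = 0; i < parts.length; i++) {
      if (i > 0) {
        current.append('/');
      }
      current.append(parts[i]);
      resources.add(current.toString());
      if (i < parts.length - 1) {
        // Intermediate directories also need the trailing-slash form,
        // because the connector probes both object names.
        resources.add(current + "/");
      }
    }
    return resources;
  }
}
```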
What's the real cause of this issue? I recall that you used the prefix-match mode to control permissions. Why can't it work?
When I worked on GCS IAM authorization, I tested using the prefix condition on a role and it worked, so I wonder if there is any difference with credentials.
Take a/b/c for example: without this change, we only grant permission for paths starting with a/b/c, but the Hadoop GCS connector needs storage read permission for a/ and a/b/ too.
I gave the user list-object privileges for the bucket and read permission for the exact path; the user then got the following privileges:
Can't list objects in the bucket root
Can't list objects in the xiaoyu123/gcs_catalog_with_organization
Can't list objects in the xiaoyu123/gcs_catalog_with_organization/schema
Can list objects in the xiaoyu123/gcs_catalog_with_organization/schema/example
Can list objects in the xiaoyu123/gcs_catalog_with_organization/schema/example/people
Sorry, I couldn't get your point. What's the problem here?
What I'm trying to say is: is this a proper way to solve the problem? If the path is very long, the number of generated items may be very large.
In the example above, I can use a prefix condition on the path to control permission in IAM, so I wonder if we can do a similar operation for credentials.
What changes were proposed in this pull request?
For a fileset path like a/b, add "a", "a/", "a/b" read permission for the GCS connector. Replace storage.legacyBucketReader with storage.insightsCollectorService, because storage.legacyBucketReader provides extra list permission for the bucket.
Why are the changes needed?
Fix: #6054
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Iceberg GCS IT and fileset GCS credential IT