Will Comet support closed-source forks of Apache Spark (e.g. CSP versions)? #414
Comments
Thanks @andygrove for creating this. I don't think we claim that Comet supports closed-source forks of Spark right now. It would be impossible to make such a claim, as we don't have the resources to ensure it holds. As for #412, although it is proposed as support for AWS Spark, the patch can really be seen as a guard against picking up unexpected constructors that have different parameters. I think it makes sense to do.
There is also a reported compatibility issue with Databricks Spark: #190
I plan on creating a PR to update our documentation to make it clear that we only support Apache Spark and not other Spark implementations.
I agree that we cannot support (i.e. guarantee compatibility with) proprietary forks, but I guess it is OK to accept PRs like #412, since it doesn't break anything and can only increase adoption.
Based on the discussion in #190, it sounds a bit like Databricks might be shipping patches that are not in line with https://www.apache.org/foundation/marks/downstream.html (or at least my reading of it). Not that following it guarantees compatibility, but it sounds like it should make issues less likely.
What is the problem the feature request solves?
We have our first PR up that works around an issue with Comet working with AWS Spark (#412).
I think we need to carefully consider our stance on supporting closed-source forks of Spark from the cloud service providers.
Supporting closed-source Spark versions is challenging for many reasons:
If the community desires to maintain Comet versions that can work with CSP Spark versions, then I think we would need to find an approach that allows those contributors to extend the "core" Comet project and add CSP support without adding maintenance burden for the core project.
One idea, for example, would be to keep the core datafusion-comet project compatible with OSS Apache Spark, and then have specific downstream repositories such as datafusion-comet-aws that extend the project to support a specific CSP.
Describe the potential solution
No response
Additional context
No response