In the GH mvn actions we validate that the JVM bytecode builds against the supported published Apache Spark builds.
The cache key for the dependencies in https://github.com/NVIDIA/spark-rapids/actions/caches uses only the current date, at day granularity. In the worst case the cache can lag ~24h behind the most recent spark-rapids-jni and spark-rapids-private artifacts, which the current spark-rapids PR may be trying to pick up for a new API.
We can use a REST API to determine the latest available artifact timestamp and make it part of the cache key.
This guarantees that the active cache holds the latest internal spark-rapids* artifacts when the user re-runs the GH action after the nightly artifact is published.
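A minimal sketch of the idea, assuming the nightly artifacts are published to a Maven repository whose `maven-metadata.xml` carries a `<versioning>/<lastUpdated>` element. The function names, the sample metadata values, and the key layout are hypothetical and are not taken from the actual workflow:

```python
import xml.etree.ElementTree as ET
from datetime import date

# Made-up sample of a maven-metadata.xml document, used only for illustration.
SAMPLE_METADATA = """<metadata>
  <groupId>com.nvidia</groupId>
  <artifactId>spark-rapids-jni</artifactId>
  <versioning>
    <lastUpdated>20241122051234</lastUpdated>
  </versioning>
</metadata>"""


def latest_timestamp(metadata_xml: str) -> str:
    """Return the <lastUpdated> value from a maven-metadata.xml document."""
    root = ET.fromstring(metadata_xml)
    value = root.findtext("versioning/lastUpdated")
    if not value:
        raise ValueError("metadata has no <versioning>/<lastUpdated> element")
    return value.strip()


def cache_key(day: date, timestamps: list[str]) -> str:
    """Join the day-granularity date with the artifact timestamps.

    The layout (date first, then timestamps, separated by '-') is an
    assumption; the real workflow may use a different key format.
    """
    return "-".join([day.strftime("%Y-%m-%d"), *timestamps])


if __name__ == "__main__":
    ts = latest_timestamp(SAMPLE_METADATA)
    print(cache_key(date(2024, 11, 22), [ts]))
```

With a key built this way, a newly published nightly changes the timestamp and therefore the key, so a re-run of the action misses the stale cache and resolves the dependencies again.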
Do we want to do this for other internal dependencies too? What about the private jar? I don't think we depend on anything else that is a SNAPSHOT release besides spark when we are trying to work on a new shim. And even then waiting 24 hours is probably not a big deal.
gerashegalov changed the title from "[BUG] Invalidate GH action dependency cache when spark-rapids-jni nightly is updated" to "[BUG] Invalidate GH action dependency cache when internal nightly dependencies are updated" on Nov 22, 2024
@revans2 good point: I reworded the issue to generalize it so that it also covers spark-rapids-private. I think 24h staleness for the Spark SNAPSHOT is acceptable, but we should implement this in a way that makes it easy to modify the list of must-be-up-to-date dependencies.
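One way to keep that list easy to modify, sketched under assumptions: the must-be-up-to-date dependencies live in a single tuple, and only their timestamps feed a short digest that is appended to the cache key. `MUST_BE_FRESH`, `combined_cache_key`, and the key prefix are hypothetical names, not part of the existing workflow:

```python
import hashlib

# Hypothetical list of dependencies whose nightly updates must invalidate
# the cache. Adding or removing an entry is the only change needed.
MUST_BE_FRESH = ("spark-rapids-jni", "spark-rapids-private")


def combined_cache_key(prefix: str,
                       last_updated: dict[str, str],
                       required: tuple[str, ...] = MUST_BE_FRESH) -> str:
    """Append a digest of the required dependencies' timestamps to prefix.

    Timestamps of dependencies that are not in `required` are ignored, so
    their updates do not invalidate the cache.
    """
    missing = [name for name in required if name not in last_updated]
    if missing:
        raise KeyError(f"no timestamp for: {', '.join(missing)}")
    # Sort by name so the key does not depend on the order of the list.
    payload = "\n".join(f"{name}={last_updated[name]}" for name in sorted(required))
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    return f"{prefix}-{digest}"


if __name__ == "__main__":
    stamps = {
        "spark-rapids-jni": "20241122051234",      # made-up values
        "spark-rapids-private": "20241122040000",
        "spark": "20241121230000",                 # present but not required
    }
    print(combined_cache_key("2024-11-22", stamps))
```

In this sketch a new Spark SNAPSHOT leaves the key unchanged, so its staleness stays bounded by the date prefix, while a new spark-rapids-jni or spark-rapids-private nightly produces a different key.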