[Bug]: Bundling beam job into jar for SparkRunner requires local execution #30214
Closed
Labels: awaiting triage, bug, done & done, P3, python
What happened?
Hi all,
When attempting to bundle a Beam job (like the wordcount example) into a jar for execution on Spark, it appears that this step requires local execution. I am following https://beam.apache.org/documentation/runners/spark/#kubernetes. When running:
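(something along these lines; the exact flags and bucket paths below are illustrative placeholders, following the linked docs)

```
python -m apache_beam.examples.wordcount \
  --input=s3://bucket/path \
  --output=s3://bucket/counts \
  --runner=SparkRunner \
  --environment_type=DOCKER \
  --output_executable_path=wordcount.jar
```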
I get an error:

```
OSError: No files found based on the file pattern s3://bucket/path
```

My local machine running this command does not have access to the input files on s3, but Spark does. The comment on `output_executable_path` states that it builds the jar rather than running the pipeline. If that's the case, I'm confused about why I get an access error: if the Beam code isn't run locally, my local machine shouldn't need to talk to s3, right?
Any help would be appreciated! Thanks!
PS: I'm able to build the jar and run it on Spark successfully if I use an input path on s3 that my local machine has access to.
Issue Priority
Priority: 3 (minor)
Issue Components