
definition of 'task run time' #4

Open
cometta opened this issue May 16, 2024 · 3 comments

cometta commented May 16, 2024

I triggered a Spark job that ran for only a few minutes, but the Grafana dashboard shows a 'task run time' of at least one hour. Is the information on Grafana correct?

[Screenshot: Grafana dashboard panel showing 'task run time']

LucaCanali (Member) commented:

It seems accurate and aligns with my own experience as well. The metric you're observing is cumulative across all executed tasks and all executors. The main idea behind using Apache Spark is to parallelize execution, allowing multiple CPUs/tasks to work simultaneously: for example, 20 tasks each running for 3 minutes in parallel accumulate about 60 minutes of task run time even though the job finishes in a few minutes of wall-clock time. Additionally, the metric "Number of Active Tasks" indicates how many tasks are being executed in parallel.
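
For illustration, here is a minimal PySpark sketch of how the two views relate, assuming the sparkMeasure library (a separate instrumentation tool, not part of this dashboard) is available; the metric names printed in its report, such as `elapsedTime` and `executorRunTime`, may vary slightly between versions:

```python
from pyspark.sql import SparkSession
from sparkmeasure import StageMetrics

# Assumes the sparkMeasure jar is on the classpath, e.g. the session was started with
# --packages ch.cern.sparkmeasure:spark-measure_2.12:<version>
spark = SparkSession.builder.appName("task-run-time-demo").getOrCreate()

stage_metrics = StageMetrics(spark)
stage_metrics.begin()
# A query that fans out over many parallel tasks.
spark.sql("SELECT count(*) FROM range(1000 * 1000 * 1000)").show()
stage_metrics.end()

# The printed report includes elapsedTime (wall-clock duration) and executorRunTime
# (summed over all tasks and executors); the latter is usually much larger.
stage_metrics.print_report()
```

On a multi-core cluster the reported `executorRunTime` is typically several times larger than `elapsedTime`, which is the same effect you are seeing in the 'task run time' panel.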

cometta (Author) commented May 17, 2024

For the case where I only need to know the total duration of the Spark job (end time minus start time), is there any widget I can refer to?

LucaCanali (Member) commented:

That's the kind of basic information you can easily get from the Spark Web UI; see https://spark.apache.org/docs/latest/web-ui.html
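
If you want the same number programmatically rather than from the UI, the application start/end times are also exposed by Spark's monitoring REST API (https://spark.apache.org/docs/latest/monitoring.html#rest-api). A minimal sketch, assuming a history server (or driver UI) is reachable; the host and port below are placeholders:

```python
import requests

# Placeholder endpoint: the driver UI usually listens on :4040, the history server on :18080.
BASE_URL = "http://localhost:18080/api/v1"

apps = requests.get(f"{BASE_URL}/applications", timeout=10).json()
for app in apps:
    for attempt in app.get("attempts", []):
        # Each attempt reports startTime, endTime and duration (in milliseconds).
        print(app["id"], app["name"],
              attempt.get("startTime"), attempt.get("endTime"),
              f"{attempt.get('duration', 0) / 1000.0:.1f} s")
```

The `duration` field here (end time minus start time) should match the elapsed time shown for the application in the Web UI.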
