Add history and workflow invocation carbon emissions reporting #17542

Open
wants to merge 38 commits into
base: dev

Contributor

@Renni771 Renni771 commented Feb 25, 2024

Addresses #17044, #15046 and usegalaxy-eu/environmental_impact/#2.

This PR implements carbon emissions reporting for histories and workflow invocations.

Backend changes:

  • The job metrics core plugin has been extended to add CPU and memory energy usage as a job metric. This allows energy usage to be calculated on both the front- and backend and allows carbon emissions reporting to be generalised to more than a single job.
  • Adds HTTP endpoints to query the accumulated metrics of all jobs in a history and workflow invocation. The new routes are /api/histories/{history_id}/metrics and /api/invocations/{invocation_id}/metrics.
  • Adds AWS EC2 reference data to the backend, so that the core plugin can guess the CPU hardware information used for power consumption calculations. The model used for carbon emissions reporting takes a CPU's total core count and its thermal design power (TDP) into account. Currently, Galaxy has no way of obtaining these values, so we make a rough guess at which CPU could have run a job, given the job execution's allocated core count and memory usage.
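
As a rough illustration of the estimate described in the last bullet, here is a minimal sketch. It is not the code in this PR: `CpuReference`, `guess_cpu`, the reference entries and the memory power constant are all assumptions made up for the example, and the "smallest entry that fits" rule is only one plausible way to make the guess.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumed placeholder constant (not taken from this PR): watts per GiB of memory.
MEMORY_POWER_W_PER_GIB = 0.3725


@dataclass(frozen=True)
class CpuReference:
    """Hypothetical shape of one reference-data entry."""

    name: str
    total_cores: int
    memory_gib: float
    tdp_watts: float


def guess_cpu(
    reference: Sequence[CpuReference], allocated_cores: int, memory_gib: float
) -> Optional[CpuReference]:
    """Return the smallest entry that could have hosted the job, or None."""
    candidates = [
        entry
        for entry in reference
        if entry.total_cores >= allocated_cores and entry.memory_gib >= memory_gib
    ]
    return min(candidates, key=lambda e: (e.total_cores, e.memory_gib), default=None)


def cpu_energy_kwh(cpu: CpuReference, allocated_cores: int, runtime_seconds: float) -> float:
    """Scale the TDP by the share of cores the job was allocated."""
    tdp_per_core = cpu.tdp_watts / cpu.total_cores
    return tdp_per_core * allocated_cores * (runtime_seconds / 3600) / 1000


def memory_energy_kwh(memory_gib: float, runtime_seconds: float) -> float:
    """Energy drawn by the allocated memory over the job's runtime."""
    return MEMORY_POWER_W_PER_GIB * memory_gib * (runtime_seconds / 3600) / 1000


# Illustrative entries only; these are not real EC2 figures.
REFERENCE = [
    CpuReference("small", total_cores=2, memory_gib=4.0, tdp_watts=20.0),
    CpuReference("medium", total_cores=8, memory_gib=32.0, tdp_watts=80.0),
]

cpu = guess_cpu(REFERENCE, allocated_cores=4, memory_gib=8.0)
if cpu is not None:
    total = cpu_energy_kwh(cpu, 4, 7200) + memory_energy_kwh(8.0, 7200)
    print(f"{cpu.name}: {total:.5f} kWh")
```

Returning `None` when nothing fits keeps the "no guess possible" case explicit, so a caller can skip the energy metric instead of reporting a misleading number.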

Frontend changes:

  • Setting the carbon_emission_estimates flag in the Galaxy config to false prevents all carbon emissions reporting from being shown in the frontend UI (job details page, history statistics page, workflow invocation summary).
  • Unifies the CarbonEmissions.vue component so it can be used in more places.
  • Adds a carbon emissions section to the workflow invocation summary tab (see screenshots below).
  • Replaces the "history storage overview" route with a new "history statistics" route. Previously this was accessible by clicking the "history size" button in the history panel. This new screen now includes the storage overview data previously found in the old route and includes a new carbon emissions report for the current history (see screenshots below).
  • Updates the frontend API schema to type the new energy usage HTTP routes added to the backend.
  • Adds component tests where necessary.
  • Refactors old carbon emissions reporting code.

Screenshots of new UI changes:

History statistics screen with history carbon emissions and storage overview:
[Two screenshots: history statistics screen]

Workflow invocation carbon emissions:
[Screenshot: workflow invocation carbon emissions]

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

* Read awsEc2ReferenceData directly in AwsEstimate component
* Read config values directly in JobMetrics components
* Cleanup props for AwsEstimate and JobMetrics components
@Renni771 Renni771 marked this pull request as ready for review February 27, 2024 11:43
@github-actions github-actions bot added this to the 24.1 milestone Feb 27, 2024
Member

@mvdbeek mvdbeek left a comment

Can you make this all disabled by default, please? The API route should only be included in the router if carbon emission reporting is enabled, and the cards should move out of the summary tab and into their own tab if carbon emission reporting is enabled.

    invocation_id: DecodedDatabaseIdField,
) -> MetricsSummaryCumulative:
    workflow_invocation = self._workflows_manager.get_invocation(trans, invocation_id, eager=True)
    job_ids = [step.job_id for step in workflow_invocation.steps if step.job_id is not None]
Member

That's excluding all subworkflow jobs and all map-over jobs.

Contributor Author

I've got a rudimentary solution to accessing subworkflow jobs. Am I on the right track here, or is there a recommended way to access subworkflow jobs?

job_ids = [step.job_id for step in workflow_invocation.steps if step.job_id is not None]

for subworkflow in workflow_invocation.subworkflow_invocations:
    job_ids.extend([step.job_id for step in subworkflow.steps if step.job_id is not None])

I'm actually unfamiliar with map-over jobs. Could you please clarify what they are? :)
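
For illustration: the loop above descends only one level, so jobs in a subworkflow of a subworkflow would still be missed. A recursive variant might look like the sketch below. `Step` and `Invocation` are hypothetical stand-ins that only mirror the attribute names used in the snippet, not Galaxy's model classes, and map-over jobs are not handled here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    """Stand-in for an invocation step; job_id is None when no job ran."""

    job_id: Optional[int] = None


@dataclass
class Invocation:
    """Stand-in for a workflow invocation with nested subworkflow invocations."""

    steps: List[Step] = field(default_factory=list)
    subworkflow_invocations: List["Invocation"] = field(default_factory=list)


def collect_job_ids(invocation: Invocation) -> List[int]:
    """Collect job ids from this invocation and, recursively, from all nested ones."""
    job_ids = [step.job_id for step in invocation.steps if step.job_id is not None]
    for subworkflow in invocation.subworkflow_invocations:
        job_ids.extend(collect_job_ids(subworkflow))
    return job_ids


# Three levels of nesting: the innermost job (3) is found as well.
nested = Invocation(
    steps=[Step(1), Step(None)],
    subworkflow_invocations=[
        Invocation(
            steps=[Step(2)],
            subworkflow_invocations=[Invocation(steps=[Step(3)])],
        )
    ],
)
print(collect_job_ids(nested))  # [1, 2, 3]
```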

Contributor Author

Renni771 commented Mar 5, 2024

> The API route should only be included in the router if carbon emission reporting is enabled

Given that they return the accumulated job metrics of all jobs in a history or workflow invocation (plus any further metrics that could be added in the future), is it really necessary to exclude the /api/histories/{history_id}/metrics and /api/invocations/{invocation_id}/metrics routes from the router when carbon emission reporting is disabled? The routes only concern themselves with jobs and their job metrics data, and simply report whatever job metric data is available. I believe they shouldn't be excluded when carbon emissions reporting is disabled. Once we are able to share config data in the job_metrics package, we can simply skip adding the energy usage metrics to the job, and they would therefore be absent from the /api/<route-name>/metrics API responses.

Comment on lines +8980 to +8985
/** Total Energy Needed Cpu Kwh */
total_energy_needed_cpu_kwh: number;
/** Total Energy Needed Kwh */
total_energy_needed_kwh: number;
/** Total Energy Needed Memory Kwh */
total_energy_needed_memory_kwh: number;
Member

Can you move that into another model?

Contributor Author

@Renni771 Renni771 Mar 5, 2024

Do you have something like this in mind?

# lib/galaxy/schema/schema.py

class MetricsSummaryEnergyCumulative(Model):
    total_energy_needed_cpu_kwh: float
    total_energy_needed_memory_kwh: float
    total_energy_needed_kwh: float

class MetricsSummaryCoreCumulative(Model):
    total_allocated_cores_cpu: int
    total_allocated_memory_mebibyte: int
    total_runtime_seconds: int
    energy_usage_metrics: Optional[MetricsSummaryEnergyCumulative]
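
To show how such a nested model could be populated, here is a minimal sketch. It uses plain dataclasses as stand-ins for the schema models so that it runs on its own; `JobMetrics`, `summarize` and the `include_energy` switch are hypothetical names, not part of Galaxy. The energy block is left as `None` when energy reporting is switched off, which matches the idea of keeping the routes while omitting the energy metrics.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class JobMetrics:
    """Hypothetical per-job input; energy values may be missing."""

    allocated_cores: int
    allocated_memory_mebibyte: int
    runtime_seconds: int
    energy_cpu_kwh: Optional[float] = None
    energy_memory_kwh: Optional[float] = None


@dataclass
class EnergySummary:
    total_energy_needed_cpu_kwh: float
    total_energy_needed_memory_kwh: float
    total_energy_needed_kwh: float


@dataclass
class CoreSummary:
    total_allocated_cores_cpu: int
    total_allocated_memory_mebibyte: int
    total_runtime_seconds: int
    energy_usage_metrics: Optional[EnergySummary] = None


def summarize(jobs: Iterable[JobMetrics], include_energy: bool) -> CoreSummary:
    """Accumulate core metrics; attach energy totals only when requested."""
    jobs = list(jobs)
    summary = CoreSummary(
        total_allocated_cores_cpu=sum(j.allocated_cores for j in jobs),
        total_allocated_memory_mebibyte=sum(j.allocated_memory_mebibyte for j in jobs),
        total_runtime_seconds=sum(j.runtime_seconds for j in jobs),
    )
    if include_energy:
        # Jobs without an energy value contribute nothing to the totals.
        cpu = sum(j.energy_cpu_kwh or 0.0 for j in jobs)
        memory = sum(j.energy_memory_kwh or 0.0 for j in jobs)
        summary.energy_usage_metrics = EnergySummary(cpu, memory, cpu + memory)
    return summary


jobs = [
    JobMetrics(2, 1024, 60, energy_cpu_kwh=0.5, energy_memory_kwh=0.25),
    JobMetrics(4, 2048, 120, energy_cpu_kwh=1.5, energy_memory_kwh=0.25),
]
with_energy = summarize(jobs, include_energy=True)
without_energy = summarize(jobs, include_energy=False)
```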

Member

mvdbeek commented Mar 5, 2024

> The routes only concern themselves with jobs and their job metrics data and are simply there to report whatever job metric data is available. I believe they shouldn't be excluded when carbon emissions reporting is disabled.

ok, that sounds good
