Resource optimisation #92

muffato · 2023-11-16T21:42:46Z

I merged #90 by accident, so reopening a new PR.

Like in sanger-tol/readmapping#82 the goal is to stop using the process_* labels and instead optimise the resource requests of every process.

I'm using the same dataset: 10 genomes of increasing size, with 1 Hi-C library and 1 PacBio library each.

Assembly	Fasta size (bytes)	PacBio size (# reads)	Hi-C size (# reads)
GCA_939531405.1	13,824,461	1,546,435	955,654,834
GCA_937625935.1	26,683,271	189,202	980,890,138
GCA_951394315.1	58,010,196	1,965,084	704,258,466
GCA_947172415.1	118,858,594	799,796	87,833,110
GCA_910589235.2	232,212,321	1,586,931	727,465,652
GCA_949987625.1	417,566,504	2,211,570	705,705,280
GCA_946406115.1	810,357,340	1,872,695	842,629,084
GCA_963513935.1	1,803,897,959	7,338,871	3,305,634,916
GCA_951213105.1	3,609,437,155	1,121,856	3,127,898,040
GCA_946902985.2	9,152,113,672	1,537,548	886,707,886

I found much less correlation than in the read-mapping pipeline. The only input size that I found useful was the genome size, now collected at the beginning of the pipeline and added to the meta map. There is some correlation between the number of Hi-C reads and the runtime of some processes, but not their memory usage. Since runtime estimates don't need to be very accurate (really, only the normal/long/week distinction matters), I don't even pull that input size.
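To illustrate why a rough runtime estimate is enough, here is a minimal Python sketch that maps an estimate to a queue. The queue names come from the comment above, but the time limits and the `pick_queue` function are hypothetical and only serve the illustration; real limits are site-specific.

```python
from datetime import timedelta

# Hypothetical queue limits, for illustration only (real limits are site-specific).
QUEUE_LIMITS = [
    ("normal", timedelta(hours=12)),
    ("long", timedelta(hours=48)),
    ("week", timedelta(days=7)),
]


def pick_queue(estimated_runtime: timedelta) -> str:
    """Return the first queue whose limit covers the estimated runtime."""
    for name, limit in QUEUE_LIMITS:
        if estimated_runtime <= limit:
            return name
    raise ValueError("estimated runtime exceeds the longest queue limit")


if __name__ == "__main__":
    # Doubling a 1-hour estimate does not change the queue.
    print(pick_queue(timedelta(hours=1)), pick_queue(timedelta(hours=2)))
```

Under these assumed limits, an estimate that is off by a factor of two usually lands in the same queue, which is why a coarse estimate can be good enough.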

I am using helper functions to grow values (like the number of CPUs) in a logarithmic fashion. In effect, this limits the growth in the number of CPUs, which makes sense because the benefit of multi-threading tends to diminish as the number of threads increases.
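As a rough illustration of logarithmic growth, the Python sketch below adds a fixed number of CPUs each time the genome size grows by another factor of `base`. It is not the pipeline's actual helper: the function names, the default values (`start`, `step`, `reference`, `base`, `max_cpus`) and the cap are all assumptions made for the example.

```python
def orders_of_magnitude(value: int, reference: int, base: int) -> int:
    """Count how many factors of `base` separate `value` from `reference`.

    Uses integer arithmetic rather than math.log to avoid floating-point
    rounding at exact powers of the base.
    """
    count = 0
    threshold = reference * base
    while value >= threshold:
        count += 1
        threshold *= base
    return count


def log_scaled_cpus(genome_size: int, start: int = 2, step: int = 2,
                    reference: int = 10_000_000, base: int = 10,
                    max_cpus: int = 16) -> int:
    """Hypothetical helper: `start` CPUs, plus `step` per factor of `base`."""
    if genome_size <= 0 or reference <= 0 or base < 2:
        raise ValueError("sizes must be positive and base must be at least 2")
    extra = step * orders_of_magnitude(genome_size, reference, base)
    return min(max_cpus, start + extra)


if __name__ == "__main__":
    # Fasta sizes (bytes) of four of the test genomes listed in the table.
    for size in (13_824_461, 118_858_594, 1_803_897_959, 9_152_113_672):
        print(size, log_scaled_cpus(size))
```

With these assumed defaults, going from the smallest to the largest test genome (roughly a 660-fold increase in size) only raises the request from 2 to 6 CPUs, whereas a linear rule would scale the request by the same factor as the input.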

Also:

I was able to replace the GrabFiles process with some Groovy magic, as per https://community.seqera.io/t/is-it-bad-practice-to-try-and-pluck-files-from-an-element-in-a-channel-that-is-a-directory-with-channel-manipulation/224/2 . This saves 1 LSF job.
I adjusted the GNU_SORT parameters to fix #91 (Leftover files in /tmp from the sort commands).

In this PR, the new resource requirements make every process succeed at the first attempt. The formulas are the lowest legible-ish correlations I could find.

Metric	Before	After	Improvement
Total memory requested (GB)	3,660.0	997.1	÷3.7
Memory efficiency (used/requested, %)	21.2	78.0
Total memory reservation (GB-hours)	2,792.4	2,405.4	÷1.2
Memory reservation efficiency (used/requested, %)	86.0	89.4
Total CPUs requested (n)	610.0	510.0	÷1.2
CPU efficiency (used/requested, %)	47.6	60.5
Total CPU reservation (CPU-hours)	465.4	332.0	÷1.4
CPU reservation efficiency (used/requested, %)	62.0	86.3
Job failures (%)	0.5	0.0
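
The "Improvement" column can be re-derived from the "Before" and "After" columns as a simple ratio. The short Python sketch below does that arithmetic; the `reduction_factor` name and the dictionary layout are only illustrative.

```python
# Before/after totals copied from the table above.
TOTALS = {
    "Total memory requested (GB)": (3660.0, 997.1),
    "Total memory reservation (GB-hours)": (2792.4, 2405.4),
    "Total CPUs requested (n)": (610.0, 510.0),
    "Total CPU reservation (CPU-hours)": (465.4, 332.0),
}


def reduction_factor(before: float, after: float) -> float:
    """How many times smaller the 'after' value is than the 'before' value."""
    if after <= 0:
        raise ValueError("'after' must be positive")
    return before / after


if __name__ == "__main__":
    for metric, (before, after) in TOTALS.items():
        print(f"{metric}: ÷{reduction_factor(before, after):.1f}")
```

Rounded to one decimal place, the ratios match the ÷3.7, ÷1.2, ÷1.2 and ÷1.4 reported in the table.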

Detailed charts showing the memory/CPU/time used/requested for every process: before (PDF), after (PDF)

If we want to tolerate processes failing at the first attempt and being resubmitted once or twice before they finally complete, I'm sure some requirements could be lowered even more. We would have to make sure that the resources wasted on those first attempts don't outweigh the savings we'd make on other processes. Something to investigate later...
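
One way to frame that trade-off is to compare the expected memory reservation of a "safe" request with that of a lower request that sometimes fails and is retried with more memory. The Python sketch below is a toy model only: the 6 GB and 12 GB requests, the one-hour runtime and the failure probabilities are made-up numbers, and it pessimistically assumes that a failed attempt holds its reservation for the full runtime.

```python
def expected_gb_hours(attempts: list[tuple[float, float, float]]) -> float:
    """Expected memory reservation over a chain of attempts.

    Each attempt is (memory_gb, hours, failure_probability). An attempt only
    runs if all the previous ones failed; the last one is expected to succeed.
    """
    expected = 0.0
    reach_probability = 1.0  # probability that this attempt is run at all
    for memory_gb, hours, failure_probability in attempts:
        expected += reach_probability * memory_gb * hours
        reach_probability *= failure_probability
    return expected


if __name__ == "__main__":
    # Hypothetical numbers: always request 12 GB, or try 6 GB first.
    safe = expected_gb_hours([(12, 1.0, 0.0)])
    rarely_fails = expected_gb_hours([(6, 1.0, 0.2), (12, 1.0, 0.0)])
    often_fails = expected_gb_hours([(6, 1.0, 0.6), (12, 1.0, 0.0)])
    print(safe, rarely_fails, often_fails)
```

In this toy model the lower first request pays off only while fewer than half of the first attempts fail: at a 20% failure rate the expected reservation drops from 12 to about 8.4 GB-hours, but at 60% it rises to about 13.2 GB-hours.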

PR checklist

This comment contains a description of changes (with reason).
If you've fixed a bug or added code that should be tested, add tests!
If you've added a new tool - have you followed the pipeline conventions in the contribution docs?
Make sure your code lints (nf-core lint).
Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
Usage Documentation in docs/usage.md is updated.
Output Documentation in docs/output.md is updated.
CHANGELOG.md is updated.
README.md is updated (including new tool citations and authors/contributors).

github-actions · 2023-11-16T21:44:12Z

`nf-core lint` overall result: Passed ✅ ⚠️

Posted for pipeline commit a5d21f4

+| ✅ 132 tests passed       |+
#| ❔  20 tests were ignored |#
!| ❗   1 tests had warnings |!

❗ Test warnings:

nextflow_config - Config manifest.version should end in dev: '1.1.0'

❔ Tests ignored:

files_exist - File is ignored: assets/nf-core-genomenote_logo_light.png
files_exist - File is ignored: docs/images/nf-core-genomenote_logo_light.png
files_exist - File is ignored: docs/images/nf-core-genomenote_logo_dark.png
files_exist - File is ignored: .github/ISSUE_TEMPLATE/config.yml
files_exist - File is ignored: .github/workflows/awstest.yml
files_exist - File is ignored: .github/workflows/awsfulltest.yml
files_exist - File is ignored: conf/igenomes.config
nextflow_config - Config variable ignored: manifest.name
nextflow_config - Config variable ignored: manifest.homePage
files_unchanged - File ignored due to lint config: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_unchanged - File ignored due to lint config: .github/CONTRIBUTING.md
files_unchanged - File ignored due to lint config: .github/ISSUE_TEMPLATE/bug_report.yml
files_unchanged - File does not exist: .github/ISSUE_TEMPLATE/config.yml
files_unchanged - File ignored due to lint config: .github/workflows/linting.yml
files_unchanged - File does not exist: assets/nf-core-genomenote_logo_light.png
files_unchanged - File does not exist: docs/images/nf-core-genomenote_logo_light.png
files_unchanged - File does not exist: docs/images/nf-core-genomenote_logo_dark.png
files_unchanged - File ignored due to lint config: lib/NfcoreTemplate.groovy
actions_ci - actions_ci
actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/genomenote/genomenote/.github/workflows/awstest.yml

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: lib/nfcore_external_java_deps.jar
files_exist - File found: lib/NfcoreSchema.groovy
files_exist - File found: lib/NfcoreTemplate.groovy
files_exist - File found: lib/Utils.groovy
files_exist - File found: lib/WorkflowMain.groovy
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: lib/WorkflowGenomenote.groovy
files_exist - File found: modules.json
files_exist - File found: pyproject.toml
files_exist - File not found check: Singularity
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: docs/images/nf-core-genomenote_logo.png
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.show_hidden_params
nextflow_config - Config variable found: params.schema_ignore_params
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - assets/email_template.html matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - lib/nfcore_external_java_deps.jar matches the template
files_unchanged - lib/NfcoreSchema.groovy matches the template
files_unchanged - .gitignore matches the template
files_unchanged - .prettierignore matches the template
files_unchanged - pyproject.toml matches the template
readme - README Nextflow minimum version badge matched config. Badge: 22.10.1, Config: 22.10.1
readme - README Zenodo placeholder was replaced with DOI.
pipeline_todos - No TODO strings found
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (105 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: sanger_test_full.yml
actions_schema_validation - Workflow validation passed: sanger_test.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: ci.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - 'assets/multiqc_config.yml' follows the ordering scheme of the minimally required plugins.
multiqc_config - 'assets/multiqc_config.yml' contains 'export_plots: true'.
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'

Run details

nf-core/tools version 2.8
Run at 2023-11-23 09:25:44

muffato · 2023-11-17T15:24:29Z

Now that I've generated all the charts, I realise that some resource requirements are actually too low! I should be asking for 150 MB for MultiQC, not 50 MB. I guess it worked because the jobs are too fast for MEMLIMIT to have time to kill them.

Some COOLER_ZOOMIFY processes also use more than the 12 GB I'm requesting. These processes take 10 min, so I would have expected MEMLIMIT to kick in. Anyway, I'll sort all those things out in another commit.

BethYates

All changes look good to me. The Groovy code that replaces the GrabFiles process is a nice improvement.

muffato · 2023-11-23T09:30:28Z

I've made the few changes I mentioned in #92: I just adjusted some requirements up or down. I reran the pipeline on all species and it worked fine.

I also merged the dev branch in to resolve the conflict coming from #93.

@BethYates: this PR just needs an approval and then I can merge it.

muffato added 13 commits November 9, 2023 09:20

Don't need a complete process for that

f8d5965

Take 100M out for the various overheads

57df724

Make sure the temporary files are created locally rather than in /tmp

0ca1e3e

First stab at defining optimised resource requirements for the pipeline [not tested !]

f85341f

Indentation should be a multiple of 4

6dc2da7

typo

ec978ee

Another typo

3e40be1

Removed trailing whitespace

7372b6e

Another round of resource updates

381536e

Everything should be increasing the number of CPUs with a log function

ca96b15

The default multiplication rule is too greedy

Wrongly placed ifEmpty

e8e093f

Comment

0878456

Both FastK and MerquryFK need -P. to avoid polluting /tmp

00168f5

muffato added the enhancement Improvement of the existing features label Nov 16, 2023

muffato added this to the 1.1.0 milestone Nov 16, 2023

muffato self-assigned this Nov 16, 2023

muffato marked this pull request as ready for review November 16, 2023 21:43

typo

0c13d22

muffato added 3 commits November 17, 2023 16:07

Increased a few more memory requirements that were too low

7063fb4

Reduced a few CPU requirements that were unnecessary high

1c561eb

Decreased some time requirements

84ae909

muffato linked an issue Nov 20, 2023 that may be closed by this pull request

Leftover files in /tmp from the sort commands #91

Closed

More trace fields to be able to optimise the parameters later

b5b11f0

muffato modified the milestones: 1.1.0, 2.0.0, 1.2.0 Nov 20, 2023

muffato linked an issue Nov 20, 2023 that may be closed by this pull request

Small test #16

Closed

This was linked to issues Nov 20, 2023

Medium test #18

Closed

Large test #20

Closed

BethYates reviewed Nov 22, 2023

Merge branch 'dev' into resource_optimisation

a5d21f4

muffato added a commit that referenced this pull request Nov 23, 2023

#92 got merged in

ac668ea

muffato mentioned this pull request Nov 23, 2023

Documentation updates for the release #95

Merged

9 tasks

BethYates approved these changes Nov 24, 2023

muffato merged commit 48f7738 into dev Nov 24, 2023
6 checks passed

muffato deleted the resource_optimisation branch November 24, 2023 10:44

muffato mentioned this pull request Dec 8, 2023

Resource optimisation sanger-tol/readmapping#82

Merged

9 tasks

muffato modified the milestones: 1.2.0, 1.1.0 Apr 27, 2024
