
Increase the time for Galaxy cleanup again #1195

Merged
2 commits merged into master on May 7, 2024

Conversation

@bgruening (Member)

I don't know why we decreased it 5 years ago.

If we only purge datasets older than 60 days, I think we can run this less frequently and perhaps avoid the long-running, possibly overlapping transactions.

A counterargument would be that, running only once every 2 days, more datasets need to be deleted in a single pass, producing bigger IO spikes? I don't know.

@sanjaysrikakulam this job should run on the maintenance node.
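
Purely as an illustration of the intended change (the actual schedule is templated from the infrastructure's Ansible group vars, as noted in the follow-up comment after the merge; the time, user and paths here are assumptions), going from a daily run to an every-other-day run would look roughly like this in cron syntax, with the environment variables from the invocation below omitted for brevity:

# m  h  dom mon dow  user    command
# before: once per day
# 30 2  *   *   *    galaxy  /usr/bin/gxadmin galaxy cleanup 60
# after: only every second day
30   2  */2 *   *    galaxy  /usr/bin/gxadmin galaxy cleanup 60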

@bgruening (Member, Author) commented May 6, 2024

I assume it takes even longer than 1h.

root@sn06:~$ /usr/bin/env GDPR_MODE=1 PGUSER=galaxy PGHOST=sn05.galaxyproject.eu \
    GALAXY_ROOT=/opt/galaxy/server GALAXY_CONFIG_FILE=/opt/galaxy/config/galaxy.yml \
    GALAXY_LOG_DIR=/var/log/galaxy GXADMIN_PYTHON=/opt/galaxy/venv/bin/python \
    /usr/bin/gxadmin galaxy cleanup 60

cleanup_datasets,group=delete_userless_histories success=1,runtime=7
cleanup_datasets,group=delete_exported_histories success=1,runtime=7
cleanup_datasets,group=purge_deleted_users success=1,runtime=4
cleanup_datasets,group=purge_deleted_histories success=1,runtime=1440
cleanup_datasets,group=purge_deleted_hdas success=1,runtime=3510
cleanup_datasets,group=purge_historyless_hdas success=1,runtime=8003
cleanup_datasets,group=purge_hdas_of_purged_histories success=1,runtime=2472
cleanup_datasets,group=delete_datasets success=1,runtime=612
cleanup_datasets,group=purge_datasets success=1,runtime=197
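
The runtime fields are in seconds (the timed run in the next comment sums to its wall-clock figure), so this first run already adds up to roughly 16,250 s, i.e. about 4.5 hours. A quick way to total them, assuming the output above was captured to a file (cleanup.log is just a placeholder name):

awk -F'runtime=' '/^cleanup_datasets/ { sum += $2 }
    END { printf "total: %d s (%.1f h)\n", sum, sum/3600 }' cleanup.log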

@bgruening (Member, Author)

…GALAXY_LOG_DIR=/var/log/galaxy GXADMIN_PYTHON=/opt/galaxy/venv/bin/python /usr/bin/gxadmin galaxy cleanup 60
cleanup_datasets,group=delete_userless_histories success=1,runtime=9
cleanup_datasets,group=delete_exported_histories success=1,runtime=6
cleanup_datasets,group=purge_deleted_users success=1,runtime=5
cleanup_datasets,group=purge_deleted_histories success=1,runtime=7971
cleanup_datasets,group=purge_deleted_hdas success=1,runtime=3147
cleanup_datasets,group=purge_historyless_hdas success=1,runtime=8126
cleanup_datasets,group=purge_hdas_of_purged_histories success=1,runtime=1334
cleanup_datasets,group=delete_datasets success=1,runtime=552
cleanup_datasets,group=purge_datasets success=1,runtime=57

real    353m26.402s
user    0m42.088s
sys     0m28.040s

We need to increase the timeout ... ouch.
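
For reference, the per-step runtimes of this run add up almost exactly to the wall-clock figure, so the nearly six hours are spent in the cleanup steps themselves (mostly the purge_* groups), not in anything around them:

# 9 + 6 + 5 + 7971 + 3147 + 8126 + 1334 + 552 + 57 = 21207 s
echo $(( 21207 / 60 ))   # 353 minutes, matching "real 353m26.402s"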

@sanjaysrikakulam (Member)

Yes, we can migrate this to the maintenance node and update the bashrc, if necessary, to export GALAXY_LOG_DIR; the gxadmin command seems to create logs.

This also means we must configure logrotate to clean up those log files on the maintenance node.

I will get back to this once the maintenance node is back online.
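
A minimal sketch of those two pieces on the maintenance node; the log directory matches the invocation above, while the file names and rotation policy are placeholders:

# Export the directory gxadmin writes its cleanup logs to (same path as above):
echo 'export GALAXY_LOG_DIR=/var/log/galaxy' >> ~galaxy/.bashrc

# Rotate those logs so they do not pile up; all policy values are placeholders:
cat > /etc/logrotate.d/galaxy-cleanup <<'EOF'
/var/log/galaxy/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}
EOF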

@hexylena (Member) commented May 7, 2024

> The gxadmin command seems to create logs.

I would be open to it logging to stderr / journald instead, but it works this way because that's how the script in Galaxy works. PRs welcome!

@bgruening (Member, Author)

Please merge if you think it's fine. I'm still debugging our poster problems.

@sanjaysrikakulam (Member)

Cool, I am trying to catch up on things. I think this is fine: we can test it, and if we find any side effects we can reduce the interval again, e.g. to once a day or so.

@mira-miracoli merged commit ee40ef0 into master on May 7, 2024 (2 checks passed)
@sanjaysrikakulam (Member)

I just found out that the task was deployed via this, so the interval must be updated in the group vars instead.

I will create a PR reflecting yours shortly.
