Optimise Postgres post processing to speedup process #75

Open
wants to merge 4 commits into
base: develop

anilthanki

Splitting up 05-post_processing.sql.template into 05-01-post_processing.sql.template and 05-02-post_processing.sql.template, on advice from the DBA team, to speed up post-processing.

@anilthanki anilthanki self-assigned this Sep 9, 2024
@anilthanki anilthanki force-pushed the feature/optimise_post_processing branch from fab259e to b8e10b0 Compare October 7, 2024 14:17
Member

@pmb59 pmb59 left a comment

I've proposed increasing maintenance_work_mem (to be confirmed by the DBA).

PostgreSQL supports parallel query execution, e.g.

SET max_parallel_workers_per_gather = 4;  

The current value of max_parallel_workers_per_gather in the SCXA db is 2. Increasing it could also help with long, data-intensive db operations.
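
A minimal sketch of how the setting could be inspected and changed for a single session rather than server-wide. The value 4 is the one proposed above; whether it helps depends on the workload and on the worker limits configured for the server, so treat it as something to test rather than a recommendation.

```sql
-- Show the value currently in effect for this session (reported above as 2).
SHOW max_parallel_workers_per_gather;

-- Raise it for the current session only; other sessions are unaffected.
SET max_parallel_workers_per_gather = 4;

-- ... run the long, data-intensive statements here ...

-- Return to the server's configured value.
RESET max_parallel_workers_per_gather;
```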

- re-enable autovacuum (mandatory)
- add a check constraint on the partition value (mandatory)
*/

Member

Increase the memory available for maintenance operations; the value currently in use is 256MB.

Suggested change
SET maintenance_work_mem = '16GB';

- cluster the partition tables on primary key (optional) -> blocking op
- collect statistics on the partition tables (mandatory)
*/

Member

Suggested change
SET maintenance_work_mem = '16GB';

Author

The DBA confirmed that the memory allocated to the VM is 16GB, so using this value together with SET max_parallel_workers_per_gather = 4; is not possible.

*/

CLUSTER scxa_analytics_<EXP-ACCESSION> USING scxa_analytics_<EXP-ACCESSION>_pk;
ANALYZE scxa_analytics_<EXP-ACCESSION>;
Member

Suggested change
- ANALYZE scxa_analytics_<EXP-ACCESSION>;
+ ANALYZE scxa_analytics_<EXP-ACCESSION>;
+ RESET maintenance_work_mem;
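
Taken together, the suggestions in this review amount to roughly the following sequence for one partition table. This is only a sketch: <EXP-ACCESSION> is the template placeholder and must be substituted before the statements can run, and the memory value is an assumption for illustration (2GB is the figure mentioned later in this thread, not a recommendation).

```sql
-- Assumed value for illustration; size it to the memory available on the VM.
SET maintenance_work_mem = '2GB';

-- Physically reorder the table by its primary key index.
-- CLUSTER takes an exclusive lock on the table, so it blocks other access.
CLUSTER scxa_analytics_<EXP-ACCESSION> USING scxa_analytics_<EXP-ACCESSION>_pk;

-- Refresh planner statistics after the table has been rewritten.
ANALYZE scxa_analytics_<EXP-ACCESSION>;

-- Restore the session default so later statements are unaffected.
RESET maintenance_work_mem;
```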

@pmb59
Member

pmb59 commented Dec 5, 2024

@irisdianauy @anilthanki Summary of the sequential actions recommended by the DBA, to be carried out after the VM memory increase:

  • Remove SET maintenance_work_mem = '2GB'; and leave the parameter at its default after the VM memory increase.
  • If there is no significant improvement, pin SET maintenance_work_mem explicitly and check performance.
  • Increase max_parallel_workers_per_gather (default 2).
  • Consider using bind variables in SQL statements to improve performance, with CLUSTER.
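
Before working through these steps, it may help to record the current values of both parameters and where they come from. A sketch using the pg_settings view, assuming a session connected to the SCXA database:

```sql
-- List the effective value, its unit, and the origin of each setting.
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('maintenance_work_mem', 'max_parallel_workers_per_gather');
```

A source of default means the parameter has not been overridden in a configuration file or in the session, which is the state the first step aims for.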
