
Create desc index #98

Open
peterdudfield opened this issue Jun 16, 2023 · 15 comments

Comments

@peterdudfield
Contributor

peterdudfield commented Jun 16, 2023

It might be useful to add an index which is descending in time, because we (often) load the most recent data.

I've tried adding

create index ix_generation_start_utc_desc
on generation (start_utc desc);

It took 24 seconds to make
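A quick way to check whether the planner actually picks the new index is EXPLAIN; a sketch (the column list and LIMIT are just placeholders):

-- does the plan show an index scan on ix_generation_start_utc_desc?
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM generation
ORDER BY start_utc DESC
LIMIT 100;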

@peterdudfield
Contributor Author

The above didn't really help.

I also tried
create index ix_forecasts_timestamp_utc_desc on forecasts (timestamp_utc desc);

but it didn't really help either.

@peterdudfield
Contributor Author

Trying: create index ix_forecast_values_start_utc_horizon_minutes on forecast_values (start_utc, horizon_minutes);

It took 1.26 minutes to make

@peterdudfield
Contributor Author

Want to try clustering on horizon_minutes:

create index ix_forecast_values_horizon_minutes
on forecast_values (horizon_minutes);

ALTER TABLE forecast_values
CLUSTER ON ix_forecast_values_horizon_minutes;

This took 1.11 seconds to make
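(Note: ALTER TABLE ... CLUSTER ON only records which index to cluster on, which is probably why it returned in about a second; the table isn't physically reordered until CLUSTER is actually run:)

-- the actual rewrite; takes an exclusive lock and can be slow on a big table
CLUSTER forecast_values;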

@simlmx
Collaborator

simlmx commented Jun 16, 2023

Can you share the query you are trying to optimize?

@peterdudfield
Contributor Author

The past forecasts one

@peterdudfield
Contributor Author

I thought we could partition the table into horizon_minutes=0 and horizon_minutes>0

@peterdudfield
Contributor Author

peterdudfield commented Jun 16, 2023

SELECT DISTINCT ON (forecasts.site_uuid, forecasts.timestamp_utc) *
FROM forecasts JOIN forecast_values ON forecasts.forecast_uuid = forecast_values.forecast_uuid
WHERE forecasts.site_uuid IN ('04d67e7f-7bcf-4b55-8ced-14d311e0a5f5', 
							  '0bd3dc11-8160-46a0-9ff7-36e125750aeb') 
AND forecasts.timestamp_utc >= '2023-06-15'
AND forecasts.timestamp_utc < '2023-06-16 15:45'
AND forecast_values.horizon_minutes = 0
AND forecast_values.start_utc >= '2023-06-15'
AND forecast_values.start_utc < '2023-06-16 15:45'
ORDER BY forecasts.site_uuid, forecasts.timestamp_utc
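Since this query filters horizon_minutes with an equality and start_utc with a range, a composite index with the equality column first would usually suit it better than (start_utc, horizon_minutes); a sketch (index name assumed):

-- equality column first, range column second
create index ix_forecast_values_horizon_minutes_start_utc
on forecast_values (horizon_minutes, start_utc);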

@simlmx
Collaborator

simlmx commented Jun 16, 2023

You mean the 2 queries called when this is called?

def get_forecasts_by_sites(

If so, I guess the query on past forecasts is the slow one? It's worth noting that the data returned by this query never changes (because it's in the past). As time goes by, the only thing that changes is the latest forecast (which is faster to query; that's the second of our two queries).

My point is that instead of calling both queries systematically, we could call the first query once (slow, but it can be done async, and that's typically not what the user is looking for) and then update it with the latest forecast (which is fast) as it becomes available. We could even keep the data in the browser cache so that if they refresh it's already there. And if they close their window and come back 30 minutes later, we could just query the missing bit, which would be faster.

I'm not sure it's an easy problem to fix otherwise.
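A rough sketch of the incremental fetch described above (:site_uuids and :last_seen are placeholder parameters, not existing code):

-- only pull forecasts made since the client's last poll
SELECT *
FROM forecasts
JOIN forecast_values ON forecasts.forecast_uuid = forecast_values.forecast_uuid
WHERE forecasts.site_uuid = ANY(:site_uuids)
  AND forecasts.timestamp_utc > :last_seen
ORDER BY forecasts.site_uuid, forecasts.timestamp_utc;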

@peterdudfield
Contributor Author

I agree, they might not want that immediately.

I'm going to try some partitioning, and see if it helps

@simlmx
Collaborator

simlmx commented Jun 16, 2023

Cool! It could be a good idea to make a separate toy database filled with a lot of (random) data and do tests there, to remove possible external factors, and probably iterate faster.
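For the toy database, generate_series can fill a table quickly. A minimal sketch (the power_kw column, row count, and horizon range are made up; gen_random_uuid() assumes Postgres 13+ or the pgcrypto extension):

CREATE TABLE forecast_values_toy (
    forecast_value_uuid uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    start_utc timestamp NOT NULL,
    horizon_minutes int NOT NULL,
    power_kw float
);

-- 10 million rows, one per minute, with random horizons and values
INSERT INTO forecast_values_toy (start_utc, horizon_minutes, power_kw)
SELECT timestamp '2023-01-01' + i * interval '1 minute',
       (random() * 2880)::int,
       random() * 100
FROM generate_series(1, 10000000) AS i;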

@peterdudfield
Contributor Author

peterdudfield commented Jun 16, 2023

Partitioning the table could look like this:

-- drop table forecast_values_new2;

CREATE TABLE forecast_values_temp
(like forecast_values including all);

alter table forecast_values_temp
    drop constraint forecast_values_temp_pkey,
    add primary key (forecast_value_uuid, horizon_minutes);
	
create table forecast_values_new 
(like forecast_values_temp including all)
partition by range (horizon_minutes);

CREATE TABLE forecast_values_0 PARTITION OF forecast_values_new
    FOR VALUES FROM (0) TO (1);
	
CREATE TABLE forecast_values_g0 PARTITION OF forecast_values_new
    FOR VALUES FROM (1) TO (10000000);

-- This can take some time > 10
INSERT INTO forecast_values_new
SELECT * FROM forecast_values;

ALTER TABLE forecast_values RENAME TO forecast_values_old;
ALTER TABLE forecast_values_new RENAME TO forecast_values;

And then make the same change in the datamodel.

I want to try

alter table forecast_values
    drop constraint forecast_values_pkey,
    add primary key (forecast_value_uuid, horizon_minutes);

first
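Once the partitioned table is in place, it's worth confirming that partition pruning actually kicks in; the plan should only touch forecast_values_0 for a horizon_minutes = 0 query:

EXPLAIN
SELECT *
FROM forecast_values
WHERE horizon_minutes = 0
  AND start_utc >= '2023-06-15';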

@simlmx
Collaborator

simlmx commented Jun 16, 2023

It sounds dangerous to try those in the Dev/Prod databases. You could end up in a state that Alembic doesn't recognize by mistake, no?

@peterdudfield
Contributor Author

alter table forecast_values
    drop constraint forecast_values_pkey,
    add primary key (forecast_value_uuid, horizon_minutes);

This didn't work; I'll roll it back.

I'm doing it in 'dev', and the above means I can always roll back, never deleting anything.

@peterdudfield
Contributor Author

I made a partition table and did the renaming; I'll have to leave it over the weekend to see if it speeds things up or not.

@peterdudfield
Contributor Author

I rolled the partition back
