docs: ADR for results storage with discrete metrics #2304

mikewilli · 2024-11-08T21:41:07Z

This lays out the options for storing results produced by the new discrete metric execution.

I am leaning towards Option 2 because I think it will be the most straightforward, with the main complexity lying in creating the views to match the current structure. Open to discussion though!
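To make the view idea concrete, here is a minimal sketch in plain Python (not the actual view SQL) of what "matching the current structure" would involve: pivoting row-per-metric records back into one record per client with a column per metric. Only `metric_slug` comes from the ADR; the `client_id`, `branch`, and `metric_value` names and the `pivot_to_wide` helper are assumptions made for illustration.

```python
def pivot_to_wide(rows, key_cols=("client_id", "branch")):
    """Pivot row-per-metric records into one wide record per client.

    Each input row is a dict holding the key columns plus a
    `metric_slug` and a `metric_value` (hypothetical column names).
    """
    wide = {}
    for row in rows:
        key = tuple(row[col] for col in key_cols)
        # Start a new wide record with the shared client info columns.
        record = wide.setdefault(key, dict(zip(key_cols, key)))
        # Each metric becomes a column named after its slug.
        record[row["metric_slug"]] = row["metric_value"]
    return list(wide.values())


long_rows = [
    {"client_id": "a", "branch": "control", "metric_slug": "m1", "metric_value": 1},
    {"client_id": "a", "branch": "control", "metric_slug": "m2", "metric_value": 2},
    {"client_id": "b", "branch": "treatment", "metric_slug": "m1", "metric_value": 3},
]
wide_rows = pivot_to_wide(long_rows)
```

A client with no row for a given metric simply lacks that column in this sketch; a real view would presumably surface that as NULL.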

scholtzan · 2024-11-12T18:15:01Z

docs/adr-0003-discrete_metrics_table_structure.md

+* **+** Low complexity of logic
+* **-** This breaks backwards compatibility with the current tables schemas (mitigated by view schema remaining the same)
+* **-** Redundancy in table for repeated client info columns
+


I believe even this will basically be billed as if the entire table is recreated: https://cloud.google.com/bigquery/docs/reference/standard-sql/dml-syntax

Yeah, you're right, at least for the case where we need to delete rows. That should only apply to rerun scenarios, though. In normal execution we wouldn't need to delete any records, so it should only process the new data:

> q = The sum of bytes processed by the DML statement itself, including any columns referenced in tables scanned by the DML statement.
>
> t = The sum of bytes for all columns in the table being updated by the DML statement, as they are at the time the query starts. All columns are included, regardless of whether those columns are referenced in or modified by the DML statement.
>
> Bytes processed:
>
> * If there are only INSERT clauses: q.
> * If there is an UPDATE or DELETE clause: q + t.

I can clarify this in the pros/cons.
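As a rough illustration of the rule quoted above, the difference between a normal run and a rerun can be sketched like this. The function name and the example byte counts are hypothetical, and the sketch only encodes the two cases quoted from the docs; it ignores anything else that may affect billing.

```python
def merge_bytes_processed(q, t, has_update_or_delete):
    """Bytes processed by a MERGE, per the rule quoted above.

    q: bytes processed by the DML statement itself.
    t: bytes for all columns of the table being updated.
    """
    if has_update_or_delete:
        # Rerun scenario: existing rows are deleted or updated.
        return q + t
    # Normal execution: only INSERT clauses, so only new data counts.
    return q


# Hypothetical sizes: 5 GB of new data merged into a 200 GB table.
GB = 10**9
normal_run = merge_bytes_processed(5 * GB, 200 * GB, has_update_or_delete=False)
rerun = merge_bytes_processed(5 * GB, 200 * GB, has_update_or_delete=True)
```

Under these assumed sizes the rerun is dominated by `t`, which is why the cost concern grows with the size of the results table.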

scholtzan · 2024-11-12T18:15:18Z

docs/adr-0003-discrete_metrics_table_structure.md

+### 3. Table per Metric
+
+* This option is almost identical to the [Row per Metric](#2.-Row-per-Metric) option, however instead of adding rows to an existing table, we will create a new table for each metric
+* The `metric_slug` column from Option 2 is not necessary, so we can retain the current column-named-with-metric-slug, but each table will only ever 


I think there's something missing here.

scholtzan · 2024-11-12T18:17:41Z

docs/adr-0003-discrete_metrics_table_structure.md

+* **-** Added cost (BigQuery treats each MERGE as basically a full delete/recreate of the table)
+
+
+### 2. Row per Metric


I don't "love" this option, but of the ones we've come up with so far I think this might be the best one.

Yeah, I feel the same way. I guess the question is whether we're happy enough with this to proceed, or whether we dislike all the options enough to go back to the drawing board. I'd vote to move forward with this approach, obviously, but I'm open to discussion.

docs: ADR for results storage with discrete metrics

99935af

mikewilli requested review from jaredlockhart, scholtzan and danielkberry November 8, 2024 21:41

scholtzan reviewed Nov 12, 2024

clarify cost implications

890c6b3

mikewilli mentioned this pull request Nov 13, 2024

Discrete metrics table structure #2307

Open
