Is this your first time submitting a feature request?
I have searched the existing issues, and I could not find an existing issue for this feature.
I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion.
Describe the feature
Here's a typical use case for a dbt model:
SELECT
    ...
FROM table1
JOIN table2 ON ...
JOIN table3 ON ...
WHERE 1=1
    -- defining the batch
    AND table1.event_date >= date('2024-01-01')
    AND table1.event_date < date('2024-01-02')
    -- filtering a dependent table to read enough data to cover the batch (events in table2 can occur +/- 1 day from events in table1)
    AND table2.event_date >= date('2024-01-01') - interval 1 day
    AND table2.event_date < date('2024-01-02') + interval 1 day
    -- filtering a dependent table to read enough data to cover the batch (events in table3 can occur + 2 days from events in table1)
    AND table3.event_date >= date('2024-01-01')
    AND table3.event_date < date('2024-01-02') + interval 2 day
If I understand the current implementation correctly, the microbatch lookback parameter will let me define one lookback value (in units of batches) that will filter data from table1, table2 and table3, only on the "left" side, i.e. before the batch.
In practice, there will always be records that are tricky on the edges (e.g. fall into the next day by 1 second). Therefore it's especially important to have a buffer on both sides of the batch. Being able to configure the buffer table by table would also be a performance gain, especially when several tables need only a small buffer and one table needs a very large one.
My example would be processing order events and reading the order table, where most orders are recent, but some orders are scheduled orders, created up to 90 days before any events happen.
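To make the arithmetic concrete, here is a minimal Python sketch of per-table, two-sided buffers around a batch. It is not dbt's API: `Buffer`, `input_window`, and the `orders` table name are hypothetical names chosen for illustration, and the buffer sizes are the ones from the query and the scheduled-orders example above.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Buffer:
    """Extra days to read before and after the batch (hypothetical setting)."""
    before_days: int = 0
    after_days: int = 0


def input_window(batch_start: date, batch_end: date, buffer: Buffer) -> tuple[date, date]:
    """Return the half-open [start, end) date range to read from one input table."""
    return (
        batch_start - timedelta(days=buffer.before_days),
        batch_end + timedelta(days=buffer.after_days),
    )


# One buffer per input table, instead of a single lookback shared by all inputs.
buffers = {
    "table1": Buffer(),                             # the batch itself, no buffer
    "table2": Buffer(before_days=1, after_days=1),  # events +/- 1 day
    "table3": Buffer(after_days=2),                 # events up to 2 days later
    "orders": Buffer(before_days=90),               # hypothetical: scheduled orders
}

batch_start, batch_end = date(2024, 1, 1), date(2024, 1, 2)
windows = {
    name: input_window(batch_start, batch_end, buf) for name, buf in buffers.items()
}

# Print the filter each input table would get for this batch.
for name, (start, end) in windows.items():
    print(f"{name}: event_date >= '{start}' AND event_date < '{end}'")
```

With a single shared lookback, every input would have to read the widest range (90 days back here); with per-table buffers only `orders` does, which is the performance gain described above.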
Describe alternatives you've considered
No response
Who will this benefit?
Users of the microbatch incremental loading logic.
Are you interested in contributing this feature?
No response
Anything else?
#10640 might be describing the same request, but I'm not sure