bugfix/is-incremental-compatible #4
Conversation
@fivetran-joemarkiewicz Everything is running well locally! I ran full refresh and incremental runs on bigquery, databricks all purpose, and databricks sql, and confirm all run the expected incremental strategy.
I have some comments/questions for the integrations tests, so just let me know if you want to chat about those!
.buildkite/scripts/run_models.sh
Outdated
@@ -16,12 +16,21 @@ db=$1
echo `pwd`
cd integration_tests
dbt deps
if [ "$db" = "databricks-sql" ]; then
  dbt seed --vars '{fivetran_platform_schema: sqlw_tests}' --target "$db" --full-refresh
- dbt seed --vars '{fivetran_platform_schema: sqlw_tests}' --target "$db" --full-refresh
+ dbt seed --vars '{salesforce_marketing_cloud_schema: sfmc_sqlw_tests}' --target "$db" --full-refresh
The var here just needs to be updated. I think this will fix the "table not found" messages in the last Buildkite run.
I would also make these vars specific to SFMC to keep them separate from other packages' SQL warehouse schemas.
Ah yes 🤦 that was a copy/paste error on my end. Thanks!
Updated
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_SQL_DBT_HTTP_PATH') }}"
schema: sqlw_tests
Update the schema name here too!
Updated
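For reference, the integration-test target being discussed in this thread might look roughly like the following profiles.yml entry once the schema is renamed. This is a sketch: the profile/output names, the token env var, and the thread count are assumptions; only the catalog, host, and http_path lines come from the snippet above.

```yaml
integration_tests:
  target: databricks-sql
  outputs:
    databricks-sql:
      type: databricks
      catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
      host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
      http_path: "{{ env_var('CI_DATABRICKS_SQL_DBT_HTTP_PATH') }}"
      # package-specific schema so SFMC CI runs don't collide with other packages
      schema: sfmc_sqlw_tests
      token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"  # assumed env var name
      threads: 8
```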
integration_tests/dbt_project.yml
Outdated
@@ -21,6 +21,10 @@ vars:
  salesforce_marketing_cloud_link_send_identifier: "link_send_data"
  salesforce_marketing_cloud_triggered_send_identifier: "triggered_send_data"

models:
+  schema: "{{ 'sqlw_tests' if target.name == 'databricks-sql' else 'salesforce_marketing_cloud' }}"
Update the schema name here too!
Updated
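Applying the rename discussed in this thread, the models block in integration_tests/dbt_project.yml would end up along these lines. A sketch only; the schema names are the ones proposed above, and everything else is assumed boilerplate.

```yaml
models:
  # use a dedicated schema for Databricks SQL Warehouse CI runs,
  # and the default package schema everywhere else
  +schema: "{{ 'sfmc_sqlw_tests' if target.name == 'databricks-sql' else 'salesforce_marketing_cloud' }}"
```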
integration_tests/tests/consistency/consistency_events_enhanced.sql
Outdated
dbt seed --target "$db" --full-refresh
dbt run --target "$db" --full-refresh
dbt test --target "$db"
dbt run --target "$db"
dbt test --target "$db"
dbt run --vars '{salesforce_marketing_cloud__link_enabled: false, salesforce_marketing_cloud__list_enabled: false}' --full-refresh --target "$db"
dbt test --target "$db"

fi
dbt run-operation fivetran_utils.drop_schemas_automation --target "$db"
In dbt_fivetran_log, do you recall why these lines were included in run_models.sh? It seems like everything is functioning here without them, so just curious!
Those were only necessary for Azure SQL Server support, which is different from Databricks SQL Warehouse. There's no need for them here since we are not supporting Azure SQL Server in this package yet.
Makes sense--thank you!
Co-authored-by: fivetran-catfritz <[email protected]>
…d.sql Co-authored-by: fivetran-catfritz <[email protected]>
Thanks @fivetran-catfritz for the review and comments/suggestions! I just applied the updates and this is good for re-review. Let me know if you have any other comments.
So nice to have staging and transform all in one place. Thanks for the responses, and all lgtm!
@fivetran-joemarkiewicz Looking good! Just a few questions/suggestions before approving.
@@ -16,12 +16,21 @@ db=$1
echo `pwd`
cd integration_tests
dbt deps
if [ "$db" = "databricks-sql" ]; then
  dbt seed --vars '{salesforce_marketing_cloud_schema: sfmc_sqlw_tests}' --target "$db" --full-refresh
Should there be a dbt compile statement after this line, similar to dbt_fivetran_log (link)?
A compile statement is not necessary; a compile already runs as part of the run operation. The compile in Fivetran Log may have been there for a particular reason, but I don't imagine we need to include it here.
dbt test --vars '{salesforce_marketing_cloud_schema: sfmc_sqlw_tests}' --target "$db"
dbt run --vars '{salesforce_marketing_cloud_schema: sfmc_sqlw_tests}' --target "$db"
dbt test --vars '{salesforce_marketing_cloud_schema: sfmc_sqlw_tests}' --target "$db"
dbt run --vars '{salesforce_marketing_cloud_schema: sfmc_sqlw_tests, salesforce_marketing_cloud__link_enabled: false, salesforce_marketing_cloud__list_enabled: false}' --full-refresh --target "$db"
Should there be a dbt run (incremental) after the full-refresh, similar to fivetran_log (link)?
Yeah, including an incremental run after a full refresh would be best here. Added to both the SQL warehouse and standard sections of run_models.sh.
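With that change, the databricks-sql branch of run_models.sh would follow a seed → full-refresh run → test → incremental run → test cycle, roughly like the sketch below. This is reconstructed from the snippets quoted in this PR, not the final committed script.

```shell
if [ "$db" = "databricks-sql" ]; then
    dbt seed --vars '{salesforce_marketing_cloud_schema: sfmc_sqlw_tests}' --target "$db" --full-refresh
    dbt run --vars '{salesforce_marketing_cloud_schema: sfmc_sqlw_tests}' --target "$db" --full-refresh
    dbt test --vars '{salesforce_marketing_cloud_schema: sfmc_sqlw_tests}' --target "$db"
    # incremental (non --full-refresh) run so the incremental code path is exercised
    dbt run --vars '{salesforce_marketing_cloud_schema: sfmc_sqlw_tests}' --target "$db"
    dbt test --vars '{salesforce_marketing_cloud_schema: sfmc_sqlw_tests}' --target "$db"
fi
```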
Co-authored-by: Avinash Kunnath <[email protected]>
Approved!
PR Overview
This PR will address the following Issue/Feature: Internal Ticket T-739367
This PR will result in the following new package version:
v0.2.0
This will be a breaking change: the table format will change for all Databricks incremental models, and non-Databricks runtimes will no longer leverage an incremental strategy for the impacted models.
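For context, the compatibility gating this branch is named after typically looks something like the following dbt model config sketch. The macro name, variable names, and column names below are illustrative assumptions based on the branch name and PR description, not confirmed package code.

```sql
{{
    config(
        -- fall back to a plain table when the destination can't use the
        -- chosen incremental strategy (e.g. Databricks SQL Warehouse)
        materialized='incremental' if is_incremental_compatible() else 'table',
        -- Databricks Runtime uses insert_overwrite on subsequent runs
        incremental_strategy='insert_overwrite' if target.type in ('databricks', 'spark', 'bigquery') else 'delete+insert',
        partition_by=['date_day'] if target.type in ('databricks', 'spark') else none,
        unique_key='event_key'
    )
}}
```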
Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:
PR Checklist
Basic Validation
Please acknowledge that you have successfully performed the following commands locally:
Before marking this PR as "ready for review" the following have been applied:
Detailed Validation
Please share any and all of your validation steps:
Please see the validation tests passing below using the following variables for BigQuery:
Additionally, we can see the Databricks incremental logic is working as expected.
✅ Databricks Runtime is using the insert_overwrite incremental logic on subsequent runs, as expected.
✅ Databricks SQL Warehouse (which also represents the behavior of all non-Databricks runtimes) is not using the incremental logic on subsequent runs, as expected.
Ensuring the other warehouses are not impacted, I can confirm that Snowflake, BigQuery, Redshift, and Postgres are all still using the appropriate incremental logic on subsequent runs.
If you had to summarize this PR in an emoji, which would it be?
🧱