Merge branch 'current' into dbeatty/builtins-context-variable

dbt-labs · Sep 25, 2023 · 47fd6c7 · 47fd6c7
2 parents 8b379be + 6f24892
commit 47fd6c7

Showing 4 changed files with 100 additions and 7 deletions.
diff --git a/website/docs/docs/cloud/billing.md b/website/docs/docs/cloud/billing.md
@@ -94,6 +94,90 @@ There are 2 options to disable models from being built and charged:
 2. Alternatively, you can delete some or all of your dbt Cloud jobs. This will ensure that no runs are kicked off, but you will permanently lose your job(s). 
 
 
+## Optimize costs in dbt Cloud
+
+dbt Cloud offers ways to optimize your successful models built usage and your warehouse costs.
+
+### Best practices for optimizing successful models built
+
+You can reduce the cost of successful models built while still adhering to best practices. To ensure that you still run tests and rebuild views when their logic changes, implement the combination of the following practices that fits your needs. In particular, if you decide to exclude views from your regularly scheduled dbt Cloud job runs, it's imperative that you set up a [merge job](#build-only-changed-views) to deploy updated view logic when changes are detected.
+
+#### Exclude views in a dbt Cloud job
+
+Many dbt Cloud users utilize views, which don’t always need to be rebuilt every time you run a job. For any jobs that contain views that _do not_ include macros that dynamically generate code (for example, case statements) based on upstream tables and also _do not_ have tests, you can implement these steps:
+
+1. Go to your current production deployment job in dbt Cloud.
+2. Modify your command to include: `--exclude config.materialized:view`.
+3. Save your job changes.
+
+If you have views that contain macros with case statements based on upstream tables, these will need to be run each time to account for new values. If you still need to test your views with each run, follow the [Exclude views while running tests](#exclude-views-while-running-tests) best practice to create a custom selector.
+
+#### Exclude views while running tests
+
+Running tests for views in every job run can help keep data quality intact and save you from the need to rerun failed jobs. To exclude views from your job run while running tests, you can follow these steps to create a custom [selector](https://docs.getdbt.com/reference/node-selection/yaml-selectors) for your job command. 
+
+1. Open your dbt project in the dbt Cloud IDE.
+2. Add a file called `selectors.yml` in your top-level project folder.
+3. In the file, add the following code:
+
+   ```yaml
+   selectors:
+     - name: skip_views_but_test_views
+       description: >
+         A default selector that will exclude materializing views
+         without skipping tests on views.
+       default: true
+       definition:
+         union:
+           - union:
+               - method: path
+                 value: "*"
+               - exclude:
+                   - method: config.materialized
+                     value: view
+           - method: resource_type
+             value: test
+   ```
+
+4. Save the file and commit it to your project.
+5. Modify your dbt Cloud jobs to include `--selector skip_views_but_test_views`.
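To make the selector's set logic concrete, here is a minimal Python sketch of how the definition above can be read: everything in the project, minus views, unioned with all tests. This is a toy model and not dbt's implementation; the `Node` class, the example node names, and the assumption that tests carry a `materialized` value of `test` are all illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical stand-in for a dbt node; not dbt's real node class.
    name: str
    resource_type: str  # "model" or "test"
    materialized: str   # "view", "table", "incremental", or "test"


def skip_views_but_test_views(nodes):
    """Toy evaluation of the selector: (everything minus views) union (all tests)."""
    everything = set(nodes)                                  # method: path, value: "*"
    views = {n for n in nodes if n.materialized == "view"}   # exclude: config.materialized = view
    tests = {n for n in nodes if n.resource_type == "test"}  # method: resource_type, value: test
    return (everything - views) | tests


# Illustrative project: one view, one table, and a test on each.
nodes = [
    Node("stg_orders", "model", "view"),
    Node("fct_orders", "model", "table"),
    Node("not_null_stg_orders_id", "test", "test"),
    Node("unique_fct_orders_id", "test", "test"),
]

selected = sorted(n.name for n in skip_views_but_test_views(nodes))
print(selected)
```

In this toy project the view `stg_orders` is left out of the selection, while the test defined on it is still selected, which is the behavior the selector's description promises.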
+
+#### Build only changed views
+
+If you want to ensure that you're building views whenever the logic is changed, create a merge job that gets triggered when code is merged into main: 
+
+1. Ensure you have a [CI job set up](/docs/deploy/ci-jobs) in your environment.
+2. Create a new [deploy job](/docs/deploy/deploy-jobs#create-and-schedule-jobs) and call it “Merge Job”.
+3. Set the **Environment** to your CI environment. Refer to [Types of environments](/docs/deploy/deploy-environments#types-of-environments) for more details.
+4. Set **Commands** to: `dbt run -s state:modified+`.
+    Executing `dbt build` in this context is unnecessary because the CI job was used to both run and test the code that just got merged into main.
+5. Under the **Execution Settings**, select the default production job to compare changes against:
+    - **Defer to a previous run state** &mdash; Select the “Merge Job” you created so the job compares and identifies what has changed since the last merge.
+6. In your dbt project, follow the steps in [Run a dbt Cloud job on merge](/guides/orchestration/custom-cicd-pipelines/3-dbt-cloud-job-on-merge) to create a script that calls the dbt Cloud API to run your job after a merge happens in your Git repository, or watch this [video](https://www.loom.com/share/e7035c61dbed47d2b9b36b5effd5ee78?sid=bcf4dd2e-b249-4e5d-b173-8ca204d9becb).
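As a rough illustration of the last step, the following Python sketch builds the request that such a script might send to the dbt Cloud API v2 "trigger job run" endpoint. It is not the script from the linked guide. The function name `build_trigger_request`, the account ID, job ID, and token values are placeholders, and the base URL is an assumption that may differ for your region or plan.

```python
import json
import urllib.request


def build_trigger_request(account_id, job_id, api_token,
                          cause="Triggered by merge to main",
                          base_url="https://cloud.getdbt.com"):
    """Build (but do not send) a POST request that triggers a dbt Cloud job run."""
    url = f"{base_url}/api/v2/accounts/{account_id}/jobs/{job_id}/run/"
    # The request body carries a human-readable reason for the run.
    body = json.dumps({"cause": cause}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; in CI these would come from secrets or environment variables.
request = build_trigger_request(
    account_id=12345,
    job_id=67890,
    api_token="dbt-cloud-token-placeholder",
)
print(request.get_method(), request.full_url)

# To actually trigger the job from your pipeline, send the request:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it keeps the sketch safe to run locally and makes it easy to check the URL and headers before wiring it into a pipeline.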
+
+The purpose of the merge job is to:
+
+- Immediately deploy any changes from PRs to production.
+- Ensure your production views remain up-to-date with how they’re defined in your codebase while remaining cost-efficient when running jobs in production.
+
+The merge action will optimize your cloud data platform spend and shorten job times, but you’ll need to decide if making the change is right for your dbt project.
+
+### Rework inefficient models
+
+#### Job Insights tab
+
+To reduce your warehouse spend, you can identify which models, on average, take the longest to build on the **Job** page under the **Insights** tab. This chart shows the average run time for each model based on its last 20 runs. Any models that take longer than anticipated to build might be prime candidates for optimization, which will ultimately reduce cloud warehouse spending.
+
+#### Model Timing tab
+
+To understand better how long each model takes to run within the context of a specific run, you can look at the **Model Timing** tab. Select the run of interest on the **Run History** page to find the tab. On that **Run** page, click **Model Timing**. 
+
+Once you've identified which models could be optimized, check out these other resources that walk through how to optimize your work: 
+* [Build scalable and trustworthy data pipelines with dbt and BigQuery](https://services.google.com/fh/files/misc/dbt_bigquery_whitepaper.pdf) 
+* [Best Practices for Optimizing Your dbt and Snowflake Deployment](https://www.snowflake.com/wp-content/uploads/2021/10/Best-Practices-for-Optimizing-Your-dbt-and-Snowflake-Deployment.pdf) 
+* [How to optimize and troubleshoot dbt models on Databricks](/guides/dbt-ecosystem/databricks-guides/how_to_optimize_dbt_models_on_databricks)
+
 ## FAQs
 
 * What happens if I need more than 8 seats on the Team plan? 

diff --git a/website/docs/guides/best-practices/how-we-style/2-how-we-style-our-sql.md b/website/docs/guides/best-practices/how-we-style/2-how-we-style-our-sql.md
@@ -25,7 +25,7 @@ id: 2-how-we-style-our-sql
 
 - 🔙 Fields should be stated before aggregates and window functions.
 - 🤏🏻 Aggregations should be executed as early as possible (on the smallest data set possible) before joining to another table to improve performance.
-- 🔢 Ordering and grouping by a number (eg. group by 1, 2) is preferred over listing the column names (see [this classic rant](https://blog.getdbt.com/write-better-sql-a-defense-of-group-by-1/) for why). Note that if you are grouping by more than a few columns, it may be worth revisiting your model design.
+- 🔢 Ordering and grouping by a number (eg. group by 1, 2) is preferred over listing the column names (see [this classic rant](https://www.getdbt.com/blog/write-better-sql-a-defense-of-group-by-1) for why). Note that if you are grouping by more than a few columns, it may be worth revisiting your model design.
 
 ## Joins
 

diff --git a/website/docs/reference/dbt-classes.md b/website/docs/reference/dbt-classes.md
@@ -86,6 +86,7 @@ col = Column('name', 'varchar', 255)
 col.is_string() # True
 col.is_numeric() # False
 col.is_number() # False
+col.is_integer() # False
 col.is_float() # False
 col.string_type() # character varying(255)
 col.numeric_type('numeric', 12, 4) # numeric(12,4)
@@ -112,6 +113,7 @@ col.numeric_type('numeric', 12, 4) # numeric(12,4)
 - **is_string()**: Returns True if the column is a String type (eg. text, varchar), else False
 - **is_numeric()**: Returns True if the column is a fixed-precision Numeric type (eg. `numeric`), else False
 - **is_number()**: Returns True if the column is a number-y type (eg. `numeric`, `int`, `float`, or similar), else False
+- **is_integer()**: Returns True if the column is an integer (eg. `int`, `bigint`, `serial` or similar), else False
 - **is_float()**: Returns True if the column is a float type (eg. `float`, `float64`, or similar), else False
 - **string_size()**: Returns the width of the column if it is a string type, else, an exception is raised
 
@@ -136,6 +138,9 @@ col.numeric_type('numeric', 12, 4) # numeric(12,4)
 -- Return true if the column is a number
 {{ string_column.is_number() }}
 
+-- Return true if the column is an integer
+{{ string_column.is_integer() }}
+
 -- Return true if the column is a float
 {{ string_column.is_float() }}
 
@@ -151,6 +156,9 @@ col.numeric_type('numeric', 12, 4) # numeric(12,4)
 -- Return true if the column is a number
 {{ numeric_column.is_number() }}
 
+-- Return true if the column is an integer
+{{ numeric_column.is_integer() }}
+
 -- Return true if the column is a float
 {{ numeric_column.is_float() }}
 

diff --git a/website/docs/reference/dbt-jinja-functions/dbt-project-yml-context.md b/website/docs/reference/dbt-jinja-functions/dbt-project-yml-context.md
@@ -1,22 +1,23 @@
 ---
-title: " About dbt_project.yml context variables"
+title: " About dbt_project.yml context"
 sidebar_label: "dbt_project.yml context"
 id: "dbt-project-yml-context"
-description: "The context variables and methods are available when configuring resources in the dbt_project.yml file."
+description: "The context methods and variables available when configuring resources in the dbt_project.yml file."
 ---
 
-The following context variables and methods are available when configuring
+The following context methods and variables are available when configuring
 resources in the `dbt_project.yml` file. This applies to the `models:`, `seeds:`,
 and `snapshots:` keys in the `dbt_project.yml` file.
 
+**Available context methods:**
+- [env_var](/reference/dbt-jinja-functions/env_var)
+- [var](/reference/dbt-jinja-functions/var) (_Note: only variables defined with `--vars` are available_)
+
 **Available context variables:**
 - [target](/reference/dbt-jinja-functions/target)
-- [env_var](/reference/dbt-jinja-functions/env_var)
-- [vars](/reference/dbt-jinja-functions/var) (_Note: only variables defined with `--vars` are available_)
 - [builtins](/reference/dbt-jinja-functions/builtins)
 - [dbt_version](/reference/dbt-jinja-functions/dbt_version)
 
-
 ### Example configuration
 
 <File name='dbt_project.yml'>