---
subcategory: "Compute"
---
Use `databricks_pipeline` to deploy Delta Live Tables.
resource "databricks_notebook" "dlt_demo" {
...
}
resource "databricks_pipeline" "this" {
name = "Pipeline Name"
storage = "/test/first-pipeline"
configuration = {
key1 = "value1"
key2 = "value2"
}
cluster {
label = "default"
num_workers = 2
custom_tags = {
cluster_type = "default"
}
}
cluster {
label = "maintenance"
num_workers = 1
custom_tags = {
cluster_type = "maintenance"
}
}
library {
notebook {
path = databricks_notebook.dlt_demo.id
}
}
filters {
include = ["com.databricks.include"]
exclude = ["com.databricks.exclude"]
}
continuous = false
}
The following arguments are supported:
* `name` - A user-friendly name for this pipeline. The name can be used to identify pipeline jobs in the UI.
* `storage` - A location on DBFS or cloud storage where output data and metadata required for pipeline execution are stored. By default, tables are stored in a subdirectory of this location.
* `configuration` - An optional list of values to apply to the entire pipeline. Elements must be formatted as key:value pairs.
* `library` blocks - Specifies pipeline code and required artifacts. The syntax resembles the library configuration block, with the addition of a special `notebook` type of library that should have the `path` attribute.
* `cluster` blocks - Clusters to run the pipeline. If none is specified, pipelines will automatically select a default cluster configuration for the pipeline.
* `continuous` - A flag indicating whether to run the pipeline continuously. The default value is `false`.
* `target` - The name of a database for persisting pipeline output data. Configuring the target setting allows you to view and query the pipeline output data from the Databricks UI.
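The example above does not use `target` or `continuous`. As a minimal sketch of how they fit together, the snippet below publishes pipeline output to a named database and keeps the pipeline running continuously; the resource name, pipeline name, storage path, and database name are placeholders, and the notebook reference assumes the `dlt_demo` notebook from the earlier example:

```hcl
resource "databricks_pipeline" "continuous_example" {
  name    = "Continuous Pipeline"        # placeholder name
  storage = "/test/continuous-pipeline"  # placeholder storage location
  target  = "pipeline_output"            # database for persisting pipeline output data

  library {
    notebook {
      path = databricks_notebook.dlt_demo.id
    }
  }

  # run the pipeline continuously instead of on demand
  continuous = true
}
```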
The resource pipeline can be imported using the id of the pipeline:
```bash
$ terraform import databricks_pipeline.this <pipeline-id>
```