Add DbtDocsGCSOperator
#616
Thank you very much @jbandoro; this is looking great!
I particularly enjoyed the refactoring you did. It makes the code easier to maintain.
There are some comments inline, the most important being the backward compatibility.
Please also fix the broken integration test:
ERROR tests/test_example_dags_no_connections.py - assert not {'/home/runner/work/astronomer-cosmos/astronomer-cosmos/dev/dags/dbt_docs.py': 'Traceback (most recent call last):\n ...G")\nairflow.exceptions.DuplicateTaskIdFound: Task id \'generate_dbt_docs_azure\' has already been added to the DAG\n'}
edit: nvm I think I fixed the issue, I think the integration should pass now
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

| Coverage Diff | main | #616 | +/- |
|---|---|---|---|
| Coverage | 93.29% | 93.37% | +0.07% |
| Files | 53 | 53 | |
| Lines | 2089 | 2113 | +24 |
| Hits | 1949 | 1973 | +24 |
| Misses | 140 | 140 | |
Thank you very much, @jbandoro ! Looks great!
@jbandoro, this change is available in the 1.3.0a1 pre-release of Cosmos:
Thank you very much for your contribution!
Features

* Add ProfileMapping for Vertica by @perttus in astronomer#540 and astronomer#688
* Add ProfileMapping for Snowflake encrypted private key path by @ivanstillfront in astronomer#608
* Add support for Snowflake encrypted private key environment variable by @DanMawdsleyBA in astronomer#649
* Add DbtDocsGCSOperator for uploading dbt docs to GCS by @jbandoro in astronomer#616

Others

* Rebased on changes released on 1.2.5 (1.3.0a1 was based on 1.2.4)
[Justin Bandoro](https://www.linkedin.com/in/justin-bandoro-592b14a7/) (@jbandoro) is a Data Engineer at Kevala Inc. He's based in San Francisco (USA) and has been an early adopter of Cosmos, using it regularly at his company. Not only has he been using Cosmos since the early stages, but he has consistently improved Cosmos since January 2023:

![Screenshot 2023-12-04 at 16 28 29](https://github.com/astronomer/astronomer-cosmos/assets/272048/43197938-d1ab-431f-b101-b6026e5cd3ab)

His contributions include new features, code quality, documentation and overall improvements. Some examples:

* Speed up integration tests by 67% in #732
* Prevent override of dbt profile fields in #702
* Add support for env vars in `RenderConfig` in #690
* Use symbolic links to run local tasks, avoiding copying potentially huge dbt project folders, in #660
* Improve documentation in #638
* Automated and improved the code complexity checks in #629
* Added `DbtDocsGCSOperator` in #616
* Added support for Python 3.7 in #88 and #214

Additionally, he has been interacting with users in the #airflow-dbt Slack channel in a very collaborative and supportive way.

We want to promote him as a Cosmos committer and maintainer for all these, recognising his constant efforts and achievements towards our community.

Thank you very much, @jbandoro !
**Features**

* Add `ProfileMapping` for Snowflake encrypted private key path by @ivanstillfront in #608
* Add support for Snowflake encrypted private key environment variable by @DanMawdsleyBA in #649
* Add `DbtDocsGCSOperator` for uploading dbt docs to GCS by @jbandoro in #616
* Add support to select using (some) graph operators when using `LoadMode.CUSTOM` and `LoadMode.DBT_MANIFEST` by @tatiana in #728
* Add cosmos/propagate_logs Airflow config support for disabling log propagation by @agreenburg in #648
* Add operator_args ``full_refresh`` as a templated field by @joppevos in #623
* Expose environment variables and dbt variables in ``ProjectConfig`` by @jbandoro in #735

**Enhancements**

* Make Pydantic an optional dependency by @pixie79 in #736
* Create a symbolic link to `dbt_packages` when `dbt_deps` is False when using `LoadMode.DBT_LS` by @DanMawdsleyBA in #730
* Support no `profile_config` for `ExecutionMode.KUBERNETES` and `ExecutionMode.DOCKER` by @MrBones757 and @tatiana in #681 and #731
* Add `aws_session_token` for Athena mapping by @benjamin-awd in #663

**Others**

* Replace flake8 for Ruff by @joppevos in #743
* Reduce code complexity to 8 by @joppevos in #738
* Update conflict matrix between Airflow and dbt versions by @tatiana in #731
* Speed up integration tests by @jbandoro in #732
**Features**

* Add new parsing method ``LoadMode.DBT_LS_FILE`` by @woogakoki in #733 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/parsing-methods.html#dbt-ls-file)).
* Add support to select using (some) graph operators when using ``LoadMode.CUSTOM`` and ``LoadMode.DBT_MANIFEST`` by @tatiana in #728 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/selecting-excluding.html#using-select-and-exclude)).
* Add support for dbt ``selector`` arg for DAG parsing by @jbandoro in #755 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html#render-config)).
* Add ``ProfileMapping`` for Vertica by @perttus in #540, #688 and #741 ([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/VerticaUserPassword.html)).
* Add ``ProfileMapping`` for Snowflake encrypted private key path by @ivanstillfront in #608 ([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/SnowflakeEncryptedPrivateKeyFilePem.html)).
* Add support for Snowflake encrypted private key environment variable by @DanMawdsleyBA in #649
* Add ``DbtDocsGCSOperator`` for uploading dbt docs to GCS by @jbandoro in #616 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/generating-docs.html#upload-to-gcs)).
* Add cosmos/propagate_logs Airflow config support for disabling log propagation by @agreenburg in #648 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/logging.html)).
* Add operator_args ``full_refresh`` as a templated field by @joppevos in #623
* Expose environment variables and dbt variables in ``ProjectConfig`` by @jbandoro in #735 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/project-config.html#project-config-example)).
* Support disabling event tracking when using Cosmos profile mapping by @jbandoro in #768 ([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/index.html#disabling-dbt-event-tracking)).

**Enhancements**

* Make Pydantic an optional dependency by @pixie79 in #736
* Create a symbolic link to ``dbt_packages`` when ``dbt_deps`` is False when using ``LoadMode.DBT_LS`` by @DanMawdsleyBA in #730
* Add ``aws_session_token`` for Athena mapping by @benjamin-awd in #663
* Retrieve temporary credentials from ``conn_id`` for Athena by @octiva in #758
* Extend ``DbtDocsLocalOperator`` with static flag by @joppevos in #759

**Bug fixes**

* Remove Pydantic upper version restriction so Cosmos can be used with Airflow 2.8 by @jlaneve in #772

**Others**

* Replace flake8 for Ruff by @joppevos in #743
* Reduce code complexity to 8 by @joppevos in #738
* Speed up integration tests by @jbandoro in #732
* Fix README quickstart link by @RNHTTR in #776
* Add package location to work with hatchling 1.19.0 by @jbandoro in #761
* Fix type check error in ``DbtKubernetesBaseOperator.build_env_args`` by @jbandoro in #766
* Improve ``DBT_MANIFEST`` documentation by @dwreeves in #757
* Update conflict matrix between Airflow and dbt versions by @tatiana in #731 and #779
* pre-commit updates in #775, #770, #762
Description

Adds `DbtDocsGCSOperator` so dbt docs can be uploaded to GCS.

Related Issue(s)

#541

Breaking Change?

Yes, I decided to create an abstract base class `DbtDocsCloudLocalOperator` since the construction of the S3, Azure and GCS operators is so similar; however, this breaks `DbtDocsAzureStorageLocalOperator` and `DbtDocsS3LocalOperator`, since instead of `aws_conn_id` and `azure_conn_id` it is now `connection_id`. Also, `DbtDocsAzureStorageLocalOperator` now uses `bucket_name` like the others instead of `container_name`.

No breaking changes, but standardized `DbtDocsS3LocalOperator` and `DbtDocsAzureStorageLocalOperator` to accept args for `connection_id` and `bucket_name`. The current args of `aws_conn_id` (S3), `azure_conn_id` and `container_name` (Azure) will still work, with warnings to switch to `connection_id` and `bucket_name`.
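The backward-compatible argument handling described above (old names still accepted, with a warning pointing to `connection_id` and `bucket_name`) could look roughly like the sketch below. This is only an illustration of the pattern, not the code merged in this PR: the class `DocsUploadSketch` is a hypothetical stand-in that does not subclass any Cosmos or Airflow operator, and the precedence rule (a new-style argument wins over a legacy one) is an assumption.

```python
import warnings
from typing import Any, Optional


class DocsUploadSketch:
    """Hypothetical stand-in for a docs-upload operator; not the Cosmos class."""

    def __init__(
        self,
        connection_id: Optional[str] = None,
        bucket_name: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        # Legacy keyword -> standardized keyword, as listed in the PR description.
        legacy_to_new = {
            "aws_conn_id": "connection_id",
            "azure_conn_id": "connection_id",
            "container_name": "bucket_name",
        }
        resolved = {"connection_id": connection_id, "bucket_name": bucket_name}

        for old, new in legacy_to_new.items():
            if old in kwargs:
                warnings.warn(
                    f"`{old}` is deprecated; use `{new}` instead.",
                    DeprecationWarning,
                    stacklevel=2,
                )
                value = kwargs.pop(old)
                # Assumption: an explicitly passed new-style argument takes precedence.
                if resolved[new] is None:
                    resolved[new] = value

        if kwargs:
            raise TypeError(f"Unexpected arguments: {sorted(kwargs)}")
        if resolved["connection_id"] is None or resolved["bucket_name"] is None:
            raise ValueError("connection_id and bucket_name are required")

        self.connection_id = resolved["connection_id"]
        self.bucket_name = resolved["bucket_name"]
```

With this shape, a call such as `DocsUploadSketch(azure_conn_id="my_azure", container_name="docs")` keeps working and emits one `DeprecationWarning` per legacy argument, while `DocsUploadSketch(connection_id="my_azure", bucket_name="docs")` is silent.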