Commit

tf conditional apply
lalelisealstad committed Sep 29, 2024
1 parent dc6d1d3 commit b8fe21c
Showing 4 changed files with 43 additions and 10 deletions.
20 changes: 17 additions & 3 deletions .github/workflows/terraform-pyspark.yml
@@ -17,7 +17,7 @@ jobs:
       - name: Authenticate with Google Cloud
         uses: google-github-actions/auth@v1
         with:
-          credentials_json: ${{ secrets.GCP_SA_KEY }}
+          credentials_json: ${{ secrets.GCP_SA_KEY }}
 
       # Set up Google Cloud SDK
       - name: Set up Google Cloud SDK
@@ -35,17 +35,31 @@ jobs:
         terraform init
         terraform validate
-      - name: Terraform apply
+      - name: Terraform Plan
+        id: plan
+        run: terraform plan -out=plan.out
+        continue-on-error: true # Ensure the job continues even if no changes
+
+      # Conditional Terraform apply step based on the plan output
+      - name: Check if changes need to be applied
+        if: ${{ steps.plan.outcome == 'success' && steps.plan.conclusion == 'success' }}
         run: |
-          terraform apply -auto-approve
+          terraform apply -auto-approve plan.out
+      # Skip apply if no changes (plan success with no diffs)
+      - name: Skip apply if no changes
+        if: ${{ steps.plan.outcome != 'success' }}
+        run: echo "No changes to apply."
 
       # Step to upload the PySpark job script from GitHub repo to the GCS bucket
       - name: Upload PySpark job to GCS
+        if: ${{ steps.plan.conclusion == 'success' }}
         run: |
           gsutil cp main.py gs://liquor-store-bucket/main.py
+      # Instantiate the Dataproc workflow template
       - name: Run Dataproc Workflow Template
+        if: ${{ steps.plan.conclusion == 'success' }}
         run: |
           gcloud dataproc workflow-templates instantiate liquor-store-etl-workflow \
             --region us-central1
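The conditional apply above gates on the plan step's `outcome`. A common way to distinguish "no changes" from "changes pending" is `terraform plan -detailed-exitcode`, which exits 0 for no changes, 1 for a plan error, and 2 for pending changes. A minimal Python sketch of that decision logic (the function name is illustrative, not part of this workflow):

```python
# Sketch of plan/apply gating, assuming `terraform plan -detailed-exitcode`
# semantics: 0 = no changes, 1 = plan error, 2 = changes pending.
# This mirrors the workflow's intent; it is not code from the repo.

def next_action(plan_exit_code: int) -> str:
    """Map a plan exit code to the pipeline step that should run next."""
    if plan_exit_code == 0:
        return "skip"   # nothing to apply
    if plan_exit_code == 2:
        return "apply"  # run `terraform apply plan.out`
    return "fail"       # exit code 1: the plan itself errored

print(next_action(2))  # prints "apply"
```

With this scheme the "skip apply" branch keys off a concrete exit code rather than a step outcome, which also distinguishes a clean no-op plan from a failed one.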
9 changes: 6 additions & 3 deletions README.md
@@ -1,13 +1,13 @@
 # liqour-sales-spark-etl
 
 Develop to do:
-- clean up repo
-- describe in readme file
-
 - add a linter to yml
 - add big query to tf, github actions and script
 - big query analyse results examples
 
+- clean up repo
+- describe in readme file
+
 
 take data from:
 https://console.cloud.google.com/bigquery?_ga=2.8611638.288326788.1726990215-459272757.1699721835&project=mybookdashboard&ws=!1m5!1m4!4m3!1sbigquery-public-data!2siowa_liquor_sales!3ssales
@@ -34,6 +34,9 @@ pip install -r requirements.txt





#
gcloud dataproc workflow-templates set-managed-cluster liquor-etl-workflow \
--region us-central1 \
--cluster-name liquor-etl-workflow-jobs \
16 changes: 12 additions & 4 deletions main.py
@@ -85,12 +85,20 @@ def main():


-    df_transformed = transform(df)
+    try:
+        df_transformed = transform(df)
+
+        print('Dataframe transformed')
+
+        df_transformed.show()
+    except Exception as e:
+
+        print("Transformation failed", e)
+
+        raise
 
-    print('Dataframe transformed')
 
-    # load to big query
-    df_transformed.show()
+    # load to big query.....
 
     spark.stop()

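The new try/except keeps the transform from failing silently: the exception is logged and then re-raised, so the Dataproc job still exits nonzero. The same pattern, isolated as a plain-Python sketch (the helper name is made up, not from the repo):

```python
def run_transform(transform, df):
    """Run a transform, log the outcome, and re-raise on failure so the
    calling job still exits nonzero. Mirrors the pattern added in main();
    the helper name is illustrative, not from the repo."""
    try:
        out = transform(df)
        print('Dataframe transformed')
        return out
    except Exception as e:
        print("Transformation failed", e)
        raise

print(run_transform(lambda d: d * 2, 21))  # prints "Dataframe transformed" then 42
```

Re-raising after logging is the key design choice: swallowing the exception would let the workflow's later steps run against a stale or missing dataframe.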
8 changes: 8 additions & 0 deletions main.tf
@@ -13,6 +13,14 @@ resource "google_storage_bucket" "data_bucket" {
   }
 }
 
+# remote state management
+terraform {
+  backend "gcs" {
+    bucket = "liquor-store-terraform-state"
+    prefix = "terraform/state"
+  }
+}
+
 # Create a Dataproc workflow template
 resource "google_dataproc_workflow_template" "template" {
   name = "liquor-store-etl-workflow"
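With the `gcs` backend, state is stored in the bucket under `<prefix>/<workspace>.tfstate`, so this configuration writes the default workspace's state to `terraform/state/default.tfstate`. A small sketch of that path convention, assuming the backend's documented layout:

```python
def state_object_path(prefix: str, workspace: str = "default") -> str:
    """Object key the Terraform gcs backend uses for a workspace's state:
    <prefix>/<workspace>.tfstate (assumed from the backend's documented
    layout; this helper is illustrative, not from the repo)."""
    return f"{prefix}/{workspace}.tfstate"

print(state_object_path("terraform/state"))  # prints "terraform/state/default.tfstate"
```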
