Feat: Incorporate SDM in CDK and add publish workflow #58

Merged: 29 commits, Nov 19, 2024

Conversation

ChristoGrab
Collaborator

@ChristoGrab ChristoGrab commented Nov 14, 2024

Release Notes

  • New Features

    • Yoinked source-declarative-manifest from the monorepo
    • Introduced a new GitHub Actions workflow for building and publishing source-declarative-manifest Docker images.
    • Added new connectors to the connector testing workflow.
  • Enhancements

    • Expanded dependency management with the explicit addition of rapidfuzz (required to build SDM image).

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced new connectors for enhanced testing: source-the-guardian-api and source-pokeapi.
    • Added a new Dockerfile for building a Python-based Airbyte connector.
    • Implemented a new GitHub Actions workflow for testing and publishing Docker images for the Source Declarative Manifest.
  • Improvements

    • Enhanced connector testing workflow with better control flow and detailed feedback.
    • Added a new JSON schema for a low-code source specification.
    • Introduced a new run function for improved accessibility within the source declarative manifest.
  • Updates

    • Updated pyproject.toml to include a new dependency, rapidfuzz, and a new script for the source declarative manifest.
    • Added a copyright notice to relevant files.

@ChristoGrab ChristoGrab marked this pull request as draft November 14, 2024 22:34
@github-actions github-actions bot added the enhancement New feature or request label Nov 14, 2024
Contributor

coderabbitai bot commented Nov 14, 2024

Walkthrough

The pull request introduces several updates across multiple files, primarily enhancing the connector testing workflow and introducing a new Dockerfile for building a Python-based Airbyte connector. New connectors have been added to the testing matrix, and the workflow now includes conditional job aborts and improved logging. Additionally, a new JSON schema for a low-code source is defined, and several functions and classes related to handling local YAML manifests are introduced. A new GitHub Actions workflow is also added to automate the testing and publishing of Docker images.

Changes

File Path | Change Summary
.github/workflows/connector-tests.yml | Updated to add source-the-guardian-api and source-pokeapi connectors. Introduced job abort conditions based on cdk_extra changes, with logging. Modified Test Connector step to include --use-local-cdk flag and improved job output evaluation.
Dockerfile | New Dockerfile created for building a Python-based Airbyte connector, specifying dependencies and installation steps using Poetry. The entry point is set to execute poetry run source-declarative-manifest.
airbyte_cdk/cli/__init__.py | Added copyright notice for 2024.
airbyte_cdk/cli/source_declarative_manifest/__init__.py | Introduced import for run function and updated __all__ to include it, making run an exported entity.
airbyte_cdk/cli/source_declarative_manifest/_run.py | Defined SourceLocalYaml class for handling local YAML files, added several command handling functions, and introduced the run function as the entry point for command logic.
airbyte_cdk/cli/source_declarative_manifest/spec.json | Added a new JSON schema defining a connection specification for a low-code source, including required properties.
pyproject.toml | Updated to add rapidfuzz dependency and introduced a new script source-declarative-manifest pointing to the run function.
.github/workflows/cdk-publish.yml | New workflow added to automate testing and publishing of Docker images for the Source Declarative Manifest, including test and publish jobs with specific triggers and conditions.

Possibly related PRs

Suggested labels

ci, bug


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (7)
airbyte_cdk/cli/source_declarative_manifest/__init__.py (1)

1-6: Clean and well-structured module initialization! 🎉

The code follows Python best practices by explicitly declaring the public API through __all__ and providing a clean interface to the implementation details.

Hey, what do you think about adding a module docstring to describe the purpose of this package and maybe a quick example of how to use the run function? This could help other developers understand its role in the declarative source handling. wdyt? 😊

If you're interested, here's a suggestion:

"""
Provides functionality for handling declarative source manifests in Airbyte CDK.

Example:
    from airbyte_cdk.cli.source_declarative_manifest import run
    
    # Example usage of the run function
    run(args)
"""
airbyte_cdk/cli/source_declarative_manifest/spec.json (1)

4-8: Schema structure looks good! Quick thought about naming

The schema structure is well-defined with appropriate version and type constraints. I noticed we're using __injected_declarative_manifest as a required property - while the double underscore prefix is a common convention for special properties, would it make sense to use a more user-friendly name without the prefix since this is part of the public API? wdyt? 🤔

.github/workflows/docker-build.yml (2)

3-16: Consider enhancing workflow triggers for better CI/CD coverage

The current triggers look good, but what do you think about these suggestions to make it even better? 🤔

  1. Add PR trigger to test builds before merging:
  pull_request:
    paths:
      - 'airbyte_cdk/**'
      - '.github/workflows/docker-build.yml'
      - 'Dockerfile'
  2. Add CODEOWNERS validation to protect workflow changes:
  push:
    paths:
      - '.github/workflows/docker-build.yml'
    branches:
      - main
    required_reviewers:
      - '@airbytehq/workflow-maintainers'

wdyt?


46-63: Improve publishing step maintainability and fix formatting

The publishing logic works well, but here are some suggestions to make it even better:

  1. Fix trailing spaces on lines 52 and 58 🧹

  2. How about adding version tag validation?

      - name: Validate version tag
        if: ${{ github.event.inputs.version-tag != '' }}
        run: |
          if ! [[ ${{ github.event.inputs.version-tag }} =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
            echo "Invalid version format. Must be semantic (e.g., 1.0.0)"
            exit 1
          fi
  3. Consider using a composite action to reduce script complexity:
      - name: Push to Docker Hub
        uses: ./.github/actions/docker-push
        with:
          image: airbyte/source-declarative-manifest
          build-tag: build-test
          version: ${{ github.event.inputs.version-tag }}
          sha: ${{ github.sha }}

What do you think about these improvements? 🤔
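
For reference, the version tag check suggested above can be sketched in Python. This is only an illustration of the same MAJOR.MINOR.PATCH rule; the function name `is_valid_version_tag` is hypothetical and not part of the workflow or the CDK.

```python
import re

# Same rule as the shell check above: three dot-separated groups of digits.
VERSION_TAG = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def is_valid_version_tag(tag: str) -> bool:
    """Return True only when the whole tag is MAJOR.MINOR.PATCH (hypothetical helper)."""
    return VERSION_TAG.fullmatch(tag) is not None


if __name__ == "__main__":
    for candidate in ("1.0.0", "v1.0.0", "1.0", "1.0.0-rc1"):
        print(candidate, is_valid_version_tag(candidate))
```

Using `fullmatch` rather than `search` means prefixes such as `v` and suffixes such as `-rc1` are rejected, which matches the anchored shell pattern.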

🧰 Tools
🪛 yamllint

[error] 52-52: trailing spaces

(trailing-spaces)


[error] 58-58: trailing spaces

(trailing-spaces)

.github/workflows/connector-tests.yml (1)

Line range hint 3-11: Consider implementing the workflow optimization.

There's a TODO about moving these tests to run after "PyTest (Fast)". Since we're making changes to the workflow, would this be a good time to implement this optimization? It could help reduce CI load by running expensive connector tests only after basic tests pass, wdyt? 🤔

airbyte_cdk/cli/source_declarative_manifest/_run.py (2)

34-59: Parameterize the 'manifest.yaml' path for greater flexibility

In the SourceLocalYaml class, we have hardcoded the path_to_yaml as "manifest.yaml". Would it be beneficial to allow this path to be configurable? This way, users could specify a different manifest file if needed, enhancing the flexibility of the class. WDYT?
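
One way the path could be made configurable is sketched below. It is an assumption-laden illustration, not the actual SourceLocalYaml code: the helper `resolve_manifest_path` and the `MANIFEST_PATH` environment variable are hypothetical names, and only the "manifest.yaml" default comes from the comment above.

```python
import os
from pathlib import Path
from typing import Optional

DEFAULT_MANIFEST_PATH = "manifest.yaml"  # the currently hardcoded value
MANIFEST_PATH_ENV_VAR = "MANIFEST_PATH"  # hypothetical override variable


def resolve_manifest_path(explicit_path: Optional[str] = None) -> Path:
    """Pick the manifest path: explicit argument, then env var, then default."""
    candidate = (
        explicit_path
        or os.environ.get(MANIFEST_PATH_ENV_VAR)
        or DEFAULT_MANIFEST_PATH
    )
    return Path(candidate)


if __name__ == "__main__":
    print(resolve_manifest_path())
    print(resolve_manifest_path("connectors/custom_manifest.yaml"))
```

Keeping "manifest.yaml" as the fallback would preserve the existing behavior for callers that pass nothing.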


78-96: Refactor duplicated error handling into a utility function

The error handling code in _get_local_yaml_source and create_declarative_source is quite similar. Would it make sense to extract this logic into a separate utility function to avoid duplication and simplify maintenance? What are your thoughts?

Also applies to: 146-164
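
A rough sketch of such a utility is shown below, written as a context manager. The name `report_startup_error` and the JSON payload shape are placeholders chosen for illustration; the real functions may emit a different message format, so treat this as a pattern rather than the existing behavior.

```python
import json
import sys
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def report_startup_error(action: str) -> Iterator[None]:
    """Print one structured error line for any failure, then re-raise it.

    The payload keys are a simplified placeholder, not a real protocol message.
    """
    try:
        yield
    except Exception as error:
        payload = {"level": "error", "action": action, "message": str(error)}
        print(json.dumps(payload), file=sys.stdout)
        raise


if __name__ == "__main__":
    try:
        with report_startup_error("loading local manifest"):
            raise FileNotFoundError("manifest.yaml not found")
    except FileNotFoundError:
        pass  # the error was reported once and still propagates to the caller
```

Both call sites could then wrap their bodies in `with report_startup_error("...")`, so the reporting logic lives in one place.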

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between e27cb81 and 87c06ea.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • .github/workflows/connector-tests.yml (1 hunks)
  • .github/workflows/docker-build.yml (1 hunks)
  • Dockerfile (1 hunks)
  • airbyte_cdk/cli/__init__.py (1 hunks)
  • airbyte_cdk/cli/source_declarative_manifest/__init__.py (1 hunks)
  • airbyte_cdk/cli/source_declarative_manifest/_run.py (1 hunks)
  • airbyte_cdk/cli/source_declarative_manifest/spec.json (1 hunks)
  • pyproject.toml (2 hunks)
✅ Files skipped from review due to trivial changes (2)
  • Dockerfile
  • airbyte_cdk/cli/__init__.py
🧰 Additional context used
🪛 yamllint
.github/workflows/docker-build.yml

[error] 52-52: trailing spaces

(trailing-spaces)


[error] 58-58: trailing spaces

(trailing-spaces)

🔇 Additional comments (4)
airbyte_cdk/cli/source_declarative_manifest/spec.json (1)

2-2: Verify documentation URL accessibility

Hey! Let's make sure this documentation URL is live and contains the relevant content for users. 🔍

✅ Verification successful

Documentation URL is accessible and working

The documentation URL https://docs.airbyte.com/integrations/sources/low-code returns a successful HTTP 200 response, confirming it's live and accessible to users.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the documentation URL is accessible and returns a valid response
curl -s -I "https://docs.airbyte.com/integrations/sources/low-code" | head -n 1

Length of output: 91

.github/workflows/docker-build.yml (1)

1-63: Verify workflow security configuration

The workflow handles sensitive operations. Let's verify the security settings:

✅ Verification successful

The workflow's security configuration looks solid! 🔒

The permissions are properly scoped with clear justifications:

  • id-token: write is correctly set for trusted publishing
  • contents: write is appropriately configured for artifact uploads
  • The workflow runs only on main branch pushes or manual triggers
  • Docker Hub credentials are only used when needed (on main or manual dispatch)

The security settings align well with similar workflows in the repo that handle artifacts and publishing. What do you think about keeping it as is? 😊

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for potential security issues in the workflow

# Check if GITHUB_TOKEN is used with minimal permissions
rg -l 'GITHUB_TOKEN' .github/workflows/

# Look for other workflow files that might trigger this one
fd -e yml -e yaml . .github/workflows/ -x grep -l "workflow_dispatch"

# Check for workflow_dispatch events in other files that might need protection
fd -e yml -e yaml . .github/workflows/ -x grep -l "workflow_dispatch" {} \; -exec grep -l "permissions:" {} \;

Length of output: 920


Script:

#!/bin/bash
# Let's check the workflow files that use workflow_dispatch for their permissions and security settings
echo "=== docker-build.yml ==="
cat .github/workflows/docker-build.yml | grep -A 5 "permissions:"

echo -e "\n=== test-command.yml ==="
cat .github/workflows/test-command.yml | grep -A 5 "permissions:"

echo -e "\n=== pypi_publish.yml ==="
cat .github/workflows/pypi_publish.yml | grep -A 5 "permissions:"

echo -e "\n=== autofix-command.yml ==="
cat .github/workflows/autofix-command.yml | grep -A 5 "permissions:"

echo -e "\n=== poetry-lock-command.yml ==="
cat .github/workflows/poetry-lock-command.yml | grep -A 5 "permissions:"

echo -e "\n=== pdoc_publish.yml ==="
cat .github/workflows/pdoc_publish.yml | grep -A 5 "permissions:"

echo -e "\n=== connector-tests.yml ==="
cat .github/workflows/connector-tests.yml | grep -A 5 "permissions:"

# Check for write permissions in docker-build.yml
echo -e "\n=== Write permissions in docker-build.yml ==="
cat .github/workflows/docker-build.yml | grep -B 5 -A 5 "write"

Length of output: 2178

.github/workflows/connector-tests.yml (1)

84-89: Consider enhancing the TODO comment with more context and tracking.

The TODO comment explains the blocker, but could we make it more actionable? Maybe add a link to an issue tracking the --use-local-cdk support for manifest connectors? Also, should we add a condition to skip these tests until the support is added to prevent false failures, wdyt? 🤔

Let's check if there's an existing issue tracking this:

pyproject.toml (1)

52-52: Could you clarify the intended use case for rapidfuzz?

I notice we're adding rapidfuzz, which is typically used for fuzzy string matching. Would love to understand how this fits into the SDM integration - are we planning to use it for manifest parsing or something else? Also, the version constraint ^3.10.1 looks good as it allows minor updates while preventing breaking changes, wdyt? 🤔
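
To make the caret constraint concrete: in Poetry, ^3.10.1 is equivalent to >=3.10.1,<4.0.0. The sketch below checks that range with plain tuples; the helper names are hypothetical, and it only handles simple MAJOR.MINOR.PATCH strings with a non-zero major version.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints (no pre-release support)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def satisfies_caret(version: str, base: str) -> bool:
    """Caret rule for a non-zero major: base <= version < next major release."""
    candidate, lower = parse_version(version), parse_version(base)
    if lower[0] == 0:
        raise ValueError("this sketch only covers bases with major >= 1")
    upper = (lower[0] + 1, 0, 0)
    return lower <= candidate < upper


if __name__ == "__main__":
    for candidate in ("3.10.0", "3.10.1", "3.11.0", "4.0.0"):
        print(candidate, satisfies_caret(candidate, "3.10.1"))
```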

@aaronsteers
Contributor

aaronsteers commented Nov 15, 2024

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formatting issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

Contributor

@alafanechere alafanechere left a comment

Hey @ChristoGrab, I'd like to push back on the introduction of the Dockerfile and on pushing with vanilla docker commands in the GHA workflow.

Here's why:

  • We originally moved away from Dockerfiles and bash scripts for publishing because they make it hard to achieve consistency and reproducibility in the way we build and publish our images. You can read more about that in this blog post (the "Legacy CI: A Maze of YAML and Shell Scripts" section).
  • We decided to use dagger instead so that we can build and publish images programmatically, which allows us to write tests for them and also build abstractions that foster consistency.
  • Taking your approach could indeed be the fastest immediate path, but it breaks build consistency, which is valuable in the long run.

That's why I'd suggest maintaining the build logic of SDM in our base_images package. The release could happen from the CDK repo, but I think the build logic should be kept in base_images.

Given that:

  • SDM is the base image for our manifest connectors
  • SDM is based on our airbyte/python-connector-base base image.
  • This Python base image is currently managed in our base_images package. We declare, build and publish it with a CLI declared in this package. (The process is currently local and manual... that would have to change to support your use case.)

My recommendation would be to tweak the base_images package so that the build logic of SDM is stored there. You can then reuse the existing logic to publish multi-arch images (which you don't get with docker build).

The benefits of doing this are:

  • We keep a central place in which our base images are declared. It makes it easier to refactor globally any docker build related thing.
  • You can write "sanity checks" that run against the image before publishing, which can boost confidence if we ever change the build logic.
  • You can automate the changelog management

The challenge in this move is not the declaration of SDM in this package per se but rather that:

  • Publishing base images is currently only done locally.
  • Publishing updates a version-controlled changelog which lives in the airbyte repo. This changelog acts as a version-controlled state that determines the latest published version. We should make the package repo-agnostic and provide the changelog to update via a CLI input.
  • The system / Python dependency versions we install in images are hardcoded. We could pass an env file to the CLI that the build logic could use to modify the CDK version that gets installed in the image.
  • CDK version and SDM version must be decoupled. If we change the airbyte/python-connector-base image version SDM uses, we'll want to cut a new SDM version but not a new CDK version.

So here's the SDM high level release flow I would suggest (in GHA, in airbyte-python-cdk-repo):

  1. Trigger it any time a new CDK version is published to PyPI or the airbyte/python-connector-base version changes.
  2. Install the base_images package with poetry, with a git URL targeting the airbyte repo. (base_images could eventually be in its own repo / be published to PyPI.)
  3. Checkout airbyte-python-cdk-repo
  4. Run a command like:
    poetry run generate-release --image=sdm --base-image=<airbyte-python-base-image-version-from-a-version-controlled-file> --bump-type=patch --changelog-path=repo-specific-changelog-path.json --changelog-entry="Update CDK to v.X.X.X" --env-vars=`
  5. Commit and push the updated changelog.
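
To illustrate how a changelog can act as the version state in this flow, here is a minimal sketch. Everything in it is hypothetical: the entry keys, the newest-first ordering and the helper names are assumptions for illustration, not the actual base_images changelog format or CLI behavior.

```python
from typing import Dict, List


def bump_version(version: str, bump_type: str) -> str:
    """Apply a major/minor/patch bump to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if bump_type == "major":
        return f"{major + 1}.0.0"
    if bump_type == "minor":
        return f"{major}.{minor + 1}.0"
    if bump_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump type: {bump_type}")


def add_changelog_entry(
    changelog: List[Dict[str, str]], bump_type: str, entry: str
) -> List[Dict[str, str]]:
    """Return a new changelog with the bumped version first (assumed ordering)."""
    latest = changelog[0]["version"] if changelog else "0.0.0"
    new_entry = {"version": bump_version(latest, bump_type), "changelog_entry": entry}
    return [new_entry, *changelog]


if __name__ == "__main__":
    existing = [{"version": "1.2.3", "changelog_entry": "Initial SDM image"}]
    for row in add_changelog_entry(existing, "patch", "Update CDK to a new version"):
        print(row)
```

Because the next SDM version is derived from the changelog rather than from the CDK version, the two version numbers stay decoupled, which is the property the comment asks for.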

I know that's a lot to process 😄. My recommendation would be that you get familiar with our base_images package and understand how we currently build and publish airbyte/python-connector-base. And we can of course talk about this live.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)
.github/workflows/cdk-publish.yml (4)

2-7: Consider enhancing the workflow triggers?

A few suggestions to make the workflow more robust:

  1. Maybe add pull_request trigger to catch issues early? This would help validate changes before they hit main.
  2. The path filter for airbyte_cdk/pyproject.toml could be more specific, wdyt?
 on:
   push:
     paths:
-      - 'airbyte_cdk/pyproject.toml' # To only publish on CDK version change
+      - 'airbyte_cdk/pyproject.toml'  # Version changes
+      - '.github/workflows/cdk-publish.yml'  # Workflow changes
       - 'Dockerfile'
+  pull_request:
+    paths:
+      - 'airbyte_cdk/pyproject.toml'
+      - 'Dockerfile'

32-34: How about adding a timeout to the test step?

To prevent hanging builds, maybe we could add a timeout? Also, we could save the test output as an artifact for debugging, wdyt?

       - name: Test image
+        timeout-minutes: 10
         run: |
           docker run airbyte/source-declarative-manifest:build-test spec
+          mkdir -p test-results
+          docker run airbyte/source-declarative-manifest:build-test spec > test-results/spec-output.txt
+      - name: Upload test results
+        if: always()
+        uses: actions/upload-artifact@v3
+        with:
+          name: test-results
+          path: test-results/

75-81: How about enhancing the tag check?

The tag check is great! Maybe we could make it even better with a timeout and more informative error message? Something like:

       - name: Check if tag already exists
+        timeout-minutes: 5
         run: |
           tag="airbyte/source-declarative-manifest:${{ env.CDK_VERSION}}-${{ github.run_number }}"
           if docker manifest inspect $tag > /dev/null 2>&1; then
-            echo "The tag $tag already exists on Dockerhub. Skipping publish to prevent overwrite."
+            echo "::error::Tag $tag already exists on Dockerhub. This usually means the version in pyproject.toml needs to be bumped."
             exit 1
           fi
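
The tag check above can also be sketched in Python. This assumes the docker CLI is available where it runs; `docker manifest inspect` is the same command the workflow step uses, while the function names and the injectable `exists` parameter are hypothetical and only there to keep the sketch testable without Docker.

```python
import subprocess
from typing import Callable

IMAGE = "airbyte/source-declarative-manifest"


def build_tag(cdk_version: str, run_number: int, image: str = IMAGE) -> str:
    """Compose '<image>:<cdk_version>-<run_number>' as in the workflow step."""
    return f"{image}:{cdk_version}-{run_number}"


def docker_manifest_exists(tag: str) -> bool:
    """Return True when 'docker manifest inspect' succeeds for the tag."""
    result = subprocess.run(
        ["docker", "manifest", "inspect", tag],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


def ensure_tag_is_new(
    tag: str, exists: Callable[[str], bool] = docker_manifest_exists
) -> None:
    """Raise if the tag is already published, to prevent an overwrite."""
    if exists(tag):
        raise RuntimeError(f"Tag {tag} already exists; bump the version first.")


if __name__ == "__main__":
    # Example values only; the version and run number are made up.
    print(build_tag("1.0.0", 42))
```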

83-92: Consider optimizing the build configuration?

The multi-platform build looks good! Maybe we could optimize it further with build caching and compression? This could speed up builds and reduce storage costs. Wdyt about:

       - name: Build and push
         uses: docker/build-push-action@v5
         with:
           context: .
           platforms: linux/amd64,linux/arm64
           push: true
+          cache-from: type=gha
+          cache-to: type=gha,mode=max
+          compression: zstd
           tags: |
             airbyte/source-declarative-manifest:latest
             airbyte/source-declarative-manifest:${{ env.CDK_VERSION }}
             airbyte/source-declarative-manifest:${{ env.CDK_VERSION }}-${{ github.run_number }}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 8d0ffc9 and 3b86602.

📒 Files selected for processing (1)
  • .github/workflows/cdk-publish.yml (1 hunks)
🧰 Additional context used
🪛 actionlint
.github/workflows/cdk-publish.yml

71-71: shellcheck reported issue in this script: SC2086:info:2:36: Double quote to prevent globbing and word splitting

(shellcheck)

🔇 Additional comments (1)
.github/workflows/cdk-publish.yml (1)

36-44: Enhance vulnerability scanning configuration

The current Trivy configuration might miss important vulnerabilities. How about:

  1. Including HIGH severity vulnerabilities?
  2. Adding a timeout to prevent long-running scans?
  3. Generating a SARIF report for better visibility?
       - name: Scan for vulnerabilities
         uses: aquasecurity/trivy-action@master
         continue-on-error: true
         with:
           image-ref: airbyte/source-declarative-manifest:build-test
-          format: table
+          format: 'table,sarif'
+          output: 'trivy-results.sarif'
           exit-code: 1
-          severity: CRITICAL
+          severity: 'CRITICAL,HIGH'
+          timeout: '5m'

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)
.github/workflows/cdk-publish.yml (2)

2-7: Consider expanding path triggers for completeness?

The current path triggers look good, but we might want to rebuild when SDM-related files change too. What do you think about adding these paths? 🤔

  paths:
    - 'airbyte_cdk/pyproject.toml'
    - 'Dockerfile'
+   - 'airbyte_cdk/sources/**'
+   - 'airbyte_cdk/models/**'

32-35: Enhance test coverage with additional connector commands?

Currently, we're only testing the spec command. Should we add checks for other standard connector commands too? Here's a thought:

  - name: Test image
    run: |
      docker run airbyte/source-declarative-manifest:build-test spec
+     docker run airbyte/source-declarative-manifest:build-test check --config tests/test_config.json
+     docker run airbyte/source-declarative-manifest:build-test discover --config tests/test_config.json

wdyt? 🤔

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 3b86602 and 87acda0.

📒 Files selected for processing (1)
  • .github/workflows/cdk-publish.yml (1 hunks)
🧰 Additional context used
🪛 actionlint
.github/workflows/cdk-publish.yml

73-73: shellcheck reported issue in this script: SC2086:info:2:36: Double quote to prevent globbing and word splitting

(shellcheck)

🔇 Additional comments (1)
.github/workflows/cdk-publish.yml (1)

66-71: 🛠️ Refactor suggestion

Consider using OIDC for Docker Hub authentication?

Instead of using Docker Hub credentials directly, we could leverage OIDC for more secure authentication. The workflow already has the necessary permissions. How about:

  - name: Login to Docker Hub
    uses: docker/login-action@v3
    with:
-     username: ${{ secrets.DOCKER_HUB_USERNAME }}
-     password: ${{ secrets.DOCKER_HUB_PASSWORD }}
+     registry: docker.io
+     token-format: 'docker-hub'

This would eliminate the need to store credentials in GitHub secrets. What are your thoughts? 🔐

Likely invalid or redundant comment.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (3)
.github/workflows/cdk-publish.yml (3)

3-7: Consider expanding the path triggers?

Hey there! 👋 The current path triggers might miss some important changes. How about including:

   push:
     paths:
       - 'airbyte_cdk/pyproject.toml'
       - 'Dockerfile'
+      - '.github/workflows/cdk-publish.yml'
+      - 'airbyte_cdk/version.py'  # If you have a version file

This would ensure the workflow runs when the workflow file itself is modified or when version-related files change. WDYT? 🤔


24-31: Add Docker layer caching?

Hey! To speed up builds, we could leverage Docker layer caching. Something like:

   - name: Build test image
     uses: docker/build-push-action@v5
     with:
       context: .
       platforms: linux/amd64
       load: true
       tags: airbyte/source-declarative-manifest:build-test
+      cache-from: type=gha
+      cache-to: type=gha,mode=max

This could significantly reduce build times. Would you like to try this out? ⚡


36-46: Upload Trivy results as artifacts?

Since we're generating a SARIF report, shall we upload it as an artifact for better visibility? Maybe add:

   - name: Scan for vulnerabilities
     uses: aquasecurity/trivy-action@master
     continue-on-error: true
     with:
       image-ref: airbyte/source-declarative-manifest:build-test
       format: 'table,sarif'
       output: 'trivy-results.sarif'
       exit-code: 1
       severity: 'CRITICAL,HIGH'
       timeout: '5m'
+  - name: Upload Trivy scan results
+    uses: github/codeql-action/upload-sarif@v2
+    if: always()
+    with:
+      sarif_file: 'trivy-results.sarif'

This would make the security scan results available in the Security tab. Sound good? 🔒

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 87acda0 and 7df3955.

📒 Files selected for processing (1)
  • .github/workflows/cdk-publish.yml (1 hunks)
🧰 Additional context used
🪛 actionlint
.github/workflows/cdk-publish.yml

73-73: shellcheck reported issue in this script: SC2086:info:2:36: Double quote to prevent globbing and word splitting

(shellcheck)

🔇 Additional comments (1)
.github/workflows/cdk-publish.yml (1)

47-93: Publish job configuration looks solid! 👍

Great job incorporating the security improvements and best practices:

  • Proper version retrieval with whitespace handling
  • Secure tag checking with proper quoting
  • Multi-platform build support
  • Appropriate permissions configuration
@ChristoGrab ChristoGrab enabled auto-merge (squash) November 18, 2024 23:59
@ChristoGrab ChristoGrab merged commit 1294b43 into main Nov 19, 2024
22 of 24 checks passed
@ChristoGrab ChristoGrab deleted the christo/sdm-test branch November 19, 2024 00:02