feat(python): rule for Google Dataflow
elsapet committed Jun 3, 2024
1 parent 9da9ae2 commit 96fb2db
Showing 3 changed files with 83 additions and 0 deletions.
31 changes: 31 additions & 0 deletions rules/python/third_parties/google_dataflow.yml
@@ -0,0 +1,31 @@
imports:
  - python_shared_lang_datatype
patterns:
  - pattern: beam.Create($<...>$<DATA_TYPE>$<...>)
    filters:
      - variable: DATA_TYPE
        detection: python_shared_lang_datatype
        scope: result
languages:
  - python
severity: medium
skip_data_types:
  - Unique Identifier
metadata:
  description: Leakage of sensitive data to Google Dataflow
  remediation_message: |
    ## Description
    Leaking sensitive data to a third-party service is a common cause of data leaks and can lead to data breaches.
    ## Remediations
    - **Do** ensure all sensitive data is removed when sending data to third-party services like Google Dataflow.
    ## References
    - [Google Dataflow Docs](https://cloud.google.com/dataflow/docs/overview)
    - [Apache Beam Python SDK](https://beam.apache.org/documentation/sdks/python/)
  cwe_id:
    - 201
  id: python_third_parties_google_dataflow
  documentation_url: https://docs.bearer.com/reference/rules/python_third_parties_google_dataflow
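To make the intent of the pattern above more concrete, here is a rough sketch of the same idea using Python's standard `ast` module. It is not how Bearer evaluates rules: the `SENSITIVE_ATTRS` and `SKIPPED_ATTRS` sets are hypothetical stand-ins for the `python_shared_lang_datatype` detection and the `skip_data_types` list, and the function name is made up for illustration. It assumes `$<...>` means "any other arguments", so a sensitive value anywhere in the `beam.Create(...)` arguments is reported.

```python
import ast

# Hypothetical stand-in for the sensitive data type detection.
SENSITIVE_ATTRS = {"firstname", "lastname", "email"}
# Hypothetical stand-in for skip_data_types (unique identifiers are not reported).
SKIPPED_ATTRS = {"uuid"}


def find_sensitive_creates(source: str) -> list[tuple[int, list[str]]]:
    """Return (line, sensitive attribute names) for each beam.Create call."""
    reportable = SENSITIVE_ATTRS - SKIPPED_ATTRS
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_beam_create = (
            isinstance(func, ast.Attribute)
            and func.attr == "Create"
            and isinstance(func.value, ast.Name)
            and func.value.id == "beam"
        )
        if not is_beam_create:
            continue
        # Inspect positional and keyword arguments, at any nesting depth.
        arguments = list(node.args) + [kw.value for kw in node.keywords]
        names = sorted(
            {
                sub.attr
                for arg in arguments
                for sub in ast.walk(arg)
                if isinstance(sub, ast.Attribute) and sub.attr in reportable
            }
        )
        if names:
            findings.append((node.lineno, names))
    # ast.walk is breadth-first, so sort to report findings in line order.
    return sorted(findings)
```

Under these assumptions, `beam.Create([user.firstname, user.lastname])` is reported and `beam.Create([user.uuid])` is not, which matches the two cases the test data in this commit is meant to cover.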
20 changes: 20 additions & 0 deletions tests/python/third_parties/google_dataflow/test.js
@@ -0,0 +1,20 @@
const {
  createNewInvoker,
  getEnvironment,
} = require("../../../helper.js")
const { ruleId, ruleFile, testBase } = getEnvironment(__dirname)

describe(ruleId, () => {
  const invoke = createNewInvoker(ruleId, ruleFile, testBase)

  test("google_dataflow", () => {
    const testCase = "main.py"

    const results = invoke(testCase)

    expect(results).toEqual({
      Missing: [],
      Extra: []
    })
  })
})
32 changes: 32 additions & 0 deletions tests/python/third_parties/google_dataflow/testdata/main.py
@@ -0,0 +1,32 @@
# Use Apache Beam to create Dataflow pipeline into Google Cloud
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

class bad():
    beam_options =
    def run():
        beam_options = PipelineOptions(
            runner='DataflowRunner',
            project='my-project-id',
            job_name='unique-job-name',
            temp_location='gs://my-bucket/temp',
        )
        with beam.Pipeline(options=beam_options) as pipeline:
            # bearer:expected python_third_parties_google_dataflow
            pipeline | "Create elements" >> beam.Create([user.firstname, user.lastname])
                | "Print elements" >> beam.Map(print)
        # run() is called automatically

class ok():
    beam_options =
    def run():
        beam_options = PipelineOptions(
            runner='DataflowRunner',
            project='my-project-id',
            job_name='unique-job-name',
            temp_location='gs://my-bucket/temp',
        )
        with beam.Pipeline(options=beam_options) as pipeline:
            pipeline | "Create elements" >> beam.Create([user.uuid])
                | "Print elements" >> beam.Map(print)
        # run() is called automatically
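The difference between the `bad` and `ok` cases above comes down to which fields reach `beam.Create`. One possible way to apply the rule's remediation advice is to reduce each record to its non-sensitive identifier before building the pipeline. The sketch below uses only the standard library; the `User` dataclass and the `to_pipeline_elements` helper are hypothetical names introduced for illustration and are not part of this commit.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical record type; the test data only references a `user` object."""
    uuid: str
    firstname: str
    lastname: str


def to_pipeline_elements(users):
    """Keep only the non-sensitive identifier of each user.

    Names and other personal fields are dropped before the data
    is handed to a third-party service.
    """
    return [user.uuid for user in users]
```

In a real pipeline the result would be passed on as `beam.Create(to_pipeline_elements(users))`, so that only identifiers are sent to Dataflow.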
