
[#64] Queue and Storage triggers for AWS, GCP and Azure #201

Open

wants to merge 21 commits into base: master

Conversation

oanarosca
Collaborator

@oanarosca oanarosca commented May 6, 2024

Summary by CodeRabbit

  • New Features

    • Introduced QueueTrigger and StorageTrigger for AWS, allowing interaction with SQS and S3 for handling asynchronous and storage-related events.
    • Added functionality in GCP to manage trigger resources (queues, storage) ensuring they exist before function deployment.
    • Enhanced Azure function handler to support multiple triggers, including HTTP requests, queue messages, and blob storage events.
  • Enhancements

    • New methods for creating, configuring, and updating function triggers in GCP.
    • Improved event handling logic in AWS Lambda to process messages from SQS and S3.
    • Expanded Azure and GCP classes to dynamically generate function configurations based on trigger types.
    • Updated configuration files to support new dependencies and deployment capabilities related to queue management.

Collaborator

@mcopik mcopik left a comment


Very nice work! I left some comments for refactoring - effectively, the two missing pieces are integration into our existing storage APIs and abstractions, and using the SeBS cache to make sure that we allocate resources only once and off the critical path of invocation :)

I recommend using a single-bucket approach for AWS and Azure. If you find it impossible to do that on Google Cloud, then please add an option there to create more than one bucket; AFAIK there's no hard limit on the number of buckets on this platform. Just make sure that each trigger remembers its bucket name for future invocations and has it cached; that also makes it easier to clean up buckets that are no longer needed.
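
For illustration only, a minimal sketch of what "each trigger remembers its bucket name and has it cached" could look like. It assumes the trigger classes expose serialize/deserialize hooks, as the platform function classes discussed below suggest; the class shape and field names here are illustrative, not the PR's actual code.

```python
from typing import Optional


class StorageTrigger:  # illustrative stand-in for the PR's sebs/*/triggers.py classes
    def __init__(self, fname: str, bucket_name: Optional[str] = None):
        self.name = fname
        # Remember the bucket so later invocations (and cache restores)
        # reuse it instead of allocating a new one on the critical path.
        self._bucket_name = bucket_name

    def serialize(self) -> dict:
        # Persisting the bucket name lets the SeBS cache recreate the trigger
        # without touching the cloud API again.
        return {"type": "Storage", "name": self.name, "bucket_name": self._bucket_name}

    @staticmethod
    def deserialize(obj: dict) -> "StorageTrigger":
        return StorageTrigger(obj["name"], obj["bucket_name"])
```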

benchmarks/wrappers/aws/python/handler.py (outdated; resolved)
benchmarks/wrappers/gcp/python/handler.py (outdated; resolved)
benchmarks/wrappers/gcp/python/handler.py (outdated; resolved)
config/systems.json (outdated; resolved)
docs/platforms.md (resolved)
sebs/azure/azure.py (outdated; resolved)
sebs/azure/azure.py (outdated; resolved)
sebs/azure/triggers.py (outdated; resolved)
sebs/benchmark.py (outdated; resolved)
sebs/gcp/triggers.py (resolved)

coderabbitai bot commented Jul 5, 2024

Walkthrough

The recent updates introduce new trigger types (QueueTrigger and StorageTrigger) to the AWS, Azure, and GCP modules, enhancing the ability to handle asynchronous and storage-related events. Additionally, methods for managing trigger resources have been added, ensuring their creation and configuration prior to function deployment. These changes collectively expand the event-driven capabilities across multiple cloud platforms.

Changes

  • sebs/aws/triggers.py: Introduced QueueTrigger and StorageTrigger classes with methods for handling SQS and S3 events.
  • sebs/gcp/gcp.py: Added methods to manage trigger resources, ensuring their existence before function deployment.
  • benchmarks/wrappers/aws/python/handler.py: Enhanced event handling to support SQS and S3 events in AWS Lambda functions (see the handler sketch after this list).
  • benchmarks/wrappers/azure/python/handler.py: Renamed main function and added support for handling Azure Queue and Blob Storage events.
  • benchmarks/wrappers/gcp/python/handler.py: Restructured handler to support GCP Queue and Storage events.
  • benchmarks/wrappers/aws/python/queue.py: Implemented a wrapper class for interacting with AWS SQS.
  • benchmarks/wrappers/azure/python/queue.py: Introduced a class for interacting with Azure Storage Queues.
  • benchmarks/wrappers/gcp/python/queue.py: Added a class for publishing messages to Google Cloud Pub/Sub.
  • sebs/aws/aws.py: Updated functions to handle new trigger types for AWS Lambda.
  • sebs/azure/azure.py: Enhanced function configuration and trigger management for Azure functions.
  • sebs/azure/triggers.py: Added QueueTrigger and StorageTrigger classes for Azure Queue and Blob operations.
  • sebs/gcp/triggers.py: Introduced QueueTrigger and StorageTrigger classes for GCP Pub/Sub and Cloud Storage interactions.
  • config/systems.json: Expanded configuration for Python base images and added new dependencies for Azure functionality.
  • docs/modularity.md: Updated documentation to reflect new trigger types and their configurations.
  • requirements.gcp.txt: Added dependency for Google Cloud Pub/Sub.
  • tests/aws/create_function.py: Included queue.py in the package files for AWS function creation tests.
  • scripts/run_experiments.py: Updated initialization to include queue.py.
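
As a rough illustration of the handler-side dispatch described for benchmarks/wrappers/aws/python/handler.py, here is a minimal sketch of a Lambda entry point that distinguishes HTTP, SQS, and S3 events. The event field names follow AWS's documented Lambda event formats; the actual SeBS wrapper's payload handling and measurement logic are not reproduced here and may differ.

```python
import json


def handler(event, context):
    # SQS and S3 both deliver events under a "Records" key; the eventSource
    # field tells the two trigger types apart.
    if "Records" in event:
        record = event["Records"][0]
        if record.get("eventSource") == "aws:sqs":
            # Queue trigger: the benchmark payload travels in the message body.
            payload = json.loads(record["body"])
        elif record.get("eventSource") == "aws:s3":
            # Storage trigger: the payload is the uploaded object; a real
            # handler would fetch it through the storage wrapper.
            payload = {
                "bucket": record["s3"]["bucket"]["name"],
                "key": record["s3"]["object"]["key"],
            }
        else:
            raise RuntimeError("Unsupported event source")
    else:
        # HTTP trigger: the event itself carries the benchmark input.
        payload = event
    return {"statusCode": 200, "body": json.dumps(payload)}
```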

Poem

In the cloud where functions play,
New triggers dance in bright array.
Queues and storage, side by side,
In eventful realms, they now abide.
With every message, data flows,
In harmony, our system grows! 🐇✨


Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between dac2840 and 7de164e.

Files selected for processing (8)
  • config/systems.json (3 hunks)
  • docs/platforms.md (1 hunks)
  • sebs.py (3 hunks)
  • sebs/azure/azure.py (7 hunks)
  • sebs/cache.py (1 hunks)
  • sebs/faas/system.py (1 hunks)
  • sebs/gcp/gcp.py (9 hunks)
  • sebs/local/local.py (1 hunks)
Files skipped from review due to trivial changes (2)
  • docs/platforms.md
  • sebs/cache.py
Additional context used
Ruff
sebs/azure/azure.py

259-259: Use context handler for opening files

(SIM115)

Additional comments not posted (13)
config/systems.json (1)

74-75: LGTM!

The changes look good:

  • Adding queue.py to the deployment files across multiple sections indicates that this file is now part of the deployment process.
  • Adding azure-storage-queue and azure-identity packages suggests an expansion of functionality related to Azure services.
  • The formatting changes (new lines before the added packages) may improve readability.

Also applies to: 120-122
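
The azure-storage-queue and azure-identity dependencies called out above imply a thin queue client (benchmarks/wrappers/azure/python/queue.py in the change list). A minimal sketch of such a wrapper is shown below; the class name, constructor parameters, and method names are illustrative assumptions, not the PR's actual code.

```python
from azure.identity import DefaultAzureCredential
from azure.storage.queue import QueueClient


class AzureQueue:
    """Minimal wrapper for sending messages to an Azure Storage Queue."""

    def __init__(self, account_url: str, queue_name: str):
        # azure-identity resolves credentials from the environment
        # (service principal, managed identity, CLI login, ...).
        self.client = QueueClient(
            account_url=account_url,
            queue_name=queue_name,
            credential=DefaultAzureCredential(),
        )

    def send(self, message: str) -> None:
        self.client.send_message(message)
```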

sebs.py (3)

181-181: LGTM!

The code changes to add "queue" and "storage" trigger choices are approved.


Line range hint 232-257: LGTM!

The code changes to handle the new trigger types in the invoke function are approved.


Line range hint 223-223: LGTM!

The code change to add the optional trigger parameter to the package_code function is approved.

sebs/azure/azure.py (3)

257-261: LGTM!

The code changes to utilize the create_function_json method in the package_code function are approved.

Tools
Ruff

259-259: Use context handler for opening files

(SIM115)


356-361: LGTM!

The code changes to conditionally create an HTTPTrigger in the update_function method are approved.


585-647: Approved: Method create_trigger supports multiple trigger types.

The changes to the create_trigger method significantly enhance the functionality by supporting both queue and storage triggers. The method effectively sets up the necessary Azure resources and permissions.

The method also raises a RuntimeError for unsupported trigger types, which is a good practice for error handling.
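
For context, the dispatch-plus-error-handling pattern described here boils down to the following self-contained sketch; the enum and return values are placeholders, and the real method additionally provisions Azure resources and role assignments.

```python
from enum import Enum


class TriggerType(Enum):
    HTTP = "http"
    QUEUE = "queue"
    STORAGE = "storage"


def create_trigger(trigger_type: TriggerType) -> str:
    # Dispatch on the trigger type; unsupported values fail immediately,
    # mirroring the error handling the review comment highlights.
    if trigger_type == TriggerType.HTTP:
        return "configured HTTP trigger"
    if trigger_type == TriggerType.QUEUE:
        return "configured queue trigger"
    if trigger_type == TriggerType.STORAGE:
        return "configured storage trigger"
    raise RuntimeError("Not supported!")
```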

Verify that the RuntimeError is raised for unsupported trigger types:

Verification successful

Verified: RuntimeError is raised for unsupported trigger types in create_trigger.

The create_trigger method in sebs/azure/azure.py correctly raises a RuntimeError for unsupported trigger types, confirming the implementation of error handling as expected.

  • Location: sebs/azure/azure.py
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that `create_trigger` raises a `RuntimeError` for unsupported trigger types.

# Test: Search for the `RuntimeError` being raised. Expect: The `RuntimeError` is raised for unsupported trigger types.
rg --type python -A 5 $'raise RuntimeError\("Not supported!"\)'

Length of output: 95


Script:

#!/bin/bash
# Description: Verify that `create_trigger` raises a `RuntimeError` for unsupported trigger types.

# Test: Search for the `RuntimeError` being raised. Expect: The `RuntimeError` is raised for unsupported trigger types.
rg --type py -A 5 $'raise RuntimeError\("Not supported!"\)'

Length of output: 1744

sebs/gcp/gcp.py (6)

143-192: LGTM!

The create_trigger_resource method is well-implemented and enhances the functionality related to trigger resources in GCP. It effectively checks the trigger type, creates the necessary resources, logs the creation process, and handles potential conflicts if the resources already exist.
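
A rough sketch of the idempotent creation pattern described above, written against the google-cloud client libraries. The PR's own implementation may use a different client (e.g. the discovery-based googleapiclient), and the project, topic, and bucket names here are placeholders.

```python
from google.api_core.exceptions import AlreadyExists, Conflict
from google.cloud import pubsub_v1, storage


def ensure_trigger_resources(project_id: str, topic_name: str, bucket_name: str) -> None:
    # Queue trigger: make sure the Pub/Sub topic exists, tolerating reruns.
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_name)
    try:
        publisher.create_topic(name=topic_path)
        print(f"Created topic {topic_path}")
    except AlreadyExists:
        print(f"Topic {topic_path} already exists, reusing it")

    # Storage trigger: same idea for the bucket backing the trigger.
    client = storage.Client(project=project_id)
    try:
        client.create_bucket(bucket_name)
        print(f"Created bucket {bucket_name}")
    except Conflict:
        print(f"Bucket {bucket_name} already exists, reusing it")
```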


251-252: LGTM!

The code changes to write additional dependencies to the requirements.txt file in the package_code function are approved.


311-314: LGTM!

The code changes to call create_trigger_resource before creating the function in the create_function method are approved. This ensures that the required trigger resources are created before the function deployment process.


380-416: LGTM!

The code changes to the create_trigger method to handle different trigger types, including HTTP, Queue, and Storage triggers, are approved. The method now imports QueueTrigger and StorageTrigger classes and includes appropriate logging for each trigger type, providing a more structured approach to managing triggers.


428-441: LGTM!

The code changes to the cached_function method to handle different trigger types, including LibraryTrigger, QueueTrigger, and StorageTrigger, are approved. These changes enhance the functionality of the method by allowing it to handle different trigger types.


457-460: LGTM!

The code changes to call create_trigger_resource before updating the function in the update_function method are approved. This ensures that the required trigger resources are verified before the function deployment process.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 19

Outside diff range and nitpick comments (1)
docs/platforms.md (1)

88-90: Specify language for fenced code blocks.

The added environment variable export commands should specify the language for fenced code blocks to follow Markdownlint guidelines.

-```
+```bash
 export AZURE_SECRET_APPLICATION_ID=XXXXXXXXXXXXXXXX
 export AZURE_SECRET_TENANT=XXXXXXXXXXXX
 export AZURE_SECRET_PASSWORD=XXXXXXXXXXXXX
 ```

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between a2eb936737231f70a237832c91fb41e240dd32a5 and ba67b4a8d4e739364eed7dc2e86cbdaf432e633b.


Files selected for processing (28)
  • benchmarks/wrappers/aws/python/handler.py (1 hunks)
  • benchmarks/wrappers/aws/python/storage.py (1 hunks)
  • benchmarks/wrappers/azure/python/handler.py (2 hunks)
  • benchmarks/wrappers/gcp/python/handler.py (2 hunks)
  • config/example.json (1 hunks)
  • config/systems.json (1 hunks)
  • docs/modularity.md (1 hunks)
  • docs/platforms.md (1 hunks)
  • requirements.azure.txt (1 hunks)
  • sebs.py (3 hunks)
  • sebs/aws/aws.py (14 hunks)
  • sebs/aws/config.py (6 hunks)
  • sebs/aws/function.py (2 hunks)
  • sebs/aws/s3.py (7 hunks)
  • sebs/aws/triggers.py (5 hunks)
  • sebs/azure/azure.py (7 hunks)
  • sebs/azure/function.py (4 hunks)
  • sebs/azure/triggers.py (2 hunks)
  • sebs/benchmark.py (1 hunks)
  • sebs/cache.py (1 hunks)
  • sebs/experiments/config.py (5 hunks)
  • sebs/faas/function.py (1 hunks)
  • sebs/faas/system.py (1 hunks)
  • sebs/gcp/function.py (2 hunks)
  • sebs/gcp/gcp.py (8 hunks)
  • sebs/gcp/triggers.py (2 hunks)
  • sebs/local/local.py (1 hunks)
  • sebs/openwhisk/openwhisk.py (1 hunks)

Files skipped from review due to trivial changes (5)
  • config/example.json
  • requirements.azure.txt
  • sebs/aws/config.py
  • sebs/cache.py
  • sebs/faas/function.py



Additional context used
Ruff
sebs/experiments/config.py

63-63: Use config.get("flags", {}) instead of an if block

Replace with config.get("flags", {})

(SIM401)


64-64: Use config.get("trigger", {}) instead of an if block

Replace with config.get("trigger", {})

(SIM401)

benchmarks/wrappers/azure/python/handler.py

2-2: base64 imported but unused

Remove unused import: base64

(F401)


80-80: Local variable ret is assigned to but never used

Remove assignment to unused variable ret

(F841)


89-89: Local variable ret is assigned to but never used

Remove assignment to unused variable ret

(F841)

benchmarks/wrappers/gcp/python/handler.py

3-3: google.cloud.storage imported but unused

Remove unused import: google.cloud.storage

(F401)


72-72: Local variable ret is assigned to but never used

Remove assignment to unused variable ret

(F841)


91-91: Local variable ret is assigned to but never used

Remove assignment to unused variable ret

(F841)

sebs/gcp/triggers.py

5-5: os imported but unused

Remove unused import: os

(F401)


8-8: googleapiclient.errors.HttpError imported but unused

Remove unused import: googleapiclient.errors.HttpError

(F401)


212-212: Statement ends with an unnecessary semicolon

Remove unnecessary semicolon

(E703)

sebs/azure/azure.py

257-257: Use context handler for opening files

(SIM115)

Markdownlint
docs/platforms.md

87-87: Fenced code blocks should have a language specified

(MD040, fenced-code-language)
Additional comments not posted (35)
sebs/azure/function.py (2)

1-2: Imports look good!

The import statements are necessary for the changes in the deserialize method.


Line range hint 31-49: Changes to deserialize method look good!

The updates ensure that the method can handle the new trigger types and improve type safety by using the cast function.

benchmarks/wrappers/aws/python/storage.py (1)

50-51: New get_object method looks good!

The method is correctly implemented and follows the pattern used in other methods in the class.

sebs/gcp/function.py (2)

Line range hint 1-2: Imports look good!

The import statements are necessary for the changes in the deserialize method.


Line range hint 33-51: Changes to deserialize method look good!

The updates ensure that the method can handle the new trigger types and improve type safety by using the cast function.

sebs/aws/function.py (2)

Line range hint 1-2: Imports look good!

The import statements are necessary for the changes in the deserialize method.


Line range hint 42-65: Changes to deserialize method look good!

The updates ensure that the method can handle the new trigger types and improve type safety by using the cast function.

sebs/experiments/config.py (2)

35-38: LGTM!

The trigger property method is correctly implemented.


50-50: LGTM!

The trigger attribute is correctly added to the dictionary returned by the serialize method.

benchmarks/wrappers/aws/python/handler.py (3)

10-13: LGTM!

The queue trigger handling block is correctly implemented.


14-23: LGTM!

The storage trigger handling block is correctly implemented.


28-28: LGTM!

The request-id and income-timestamp are correctly added to the event dictionary.

config/systems.json (1)

22-22: Approved: Addition of Python 3.9 base image.

The addition of Python 3.9 to the list of base images for local experiments is consistent with the existing configuration.

sebs/aws/s3.py (6)

57-58: Approved: Improved logging message in _create_bucket.

The log message now includes the bucket name and region, improving clarity.


74-75: Approved: Region-specific bucket creation logic.

The logic ensures that buckets are created correctly in regions other than us-east-1.


91-92: Approved: Improved error message in _create_bucket.

The error message now includes the bucket name and region, improving clarity.


116-117: Approved: Improved logging message in uploader_func.

The log message now includes the file path and bucket name, improving clarity.


127-130: Approved: Improved logging message in download.

The log message now includes the bucket name, key, and file path, improving clarity.


159-162: Approved: Improved logging message in clean_bucket.

The log message now includes the bucket name and region, improving clarity.

sebs/gcp/triggers.py (1)

5-8: Remove unused imports and unnecessary semicolon.

The imports os and googleapiclient.errors.HttpError are unused, and there is an unnecessary semicolon.

- import os
- from googleapiclient.errors import HttpError
+ from google.cloud import storage as gcp_storage

Likely invalid or redundant comment.

Tools
Ruff

5-5: os imported but unused

Remove unused import: os

(F401)


8-8: googleapiclient.errors.HttpError imported but unused

Remove unused import: googleapiclient.errors.HttpError

(F401)

sebs/aws/triggers.py (2)

138-152: Ensure correct initialization of QueueTrigger attributes.

The QueueTrigger constructor initializes several attributes. Ensure that the attributes _deployment_client, _queue_arn, and _queue_url are correctly assigned and used.

Verification successful

Initialization and usage of QueueTrigger attributes are correct.

The attributes _deployment_client, _queue_arn, and _queue_url are correctly initialized and used within the QueueTrigger class. The properties include assertions to ensure these attributes are set before being accessed.

  • sebs/aws/triggers.py:
    • Lines 138-152: Initialization in the constructor.
    • Usage in properties and methods within the class.
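
The assertion-guarded property pattern mentioned above amounts to roughly the following generic sketch (not the PR's exact code):

```python
from typing import Optional


class QueueTriggerSketch:
    def __init__(self, queue_arn: Optional[str] = None, queue_url: Optional[str] = None):
        self._queue_arn = queue_arn
        self._queue_url = queue_url

    @property
    def queue_arn(self) -> str:
        # Fail fast if the trigger is used before the queue was created or cached.
        assert self._queue_arn
        return self._queue_arn

    @property
    def queue_url(self) -> str:
        assert self._queue_url
        return self._queue_url
```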
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correct initialization of `QueueTrigger` attributes.

# Test: Search for the usage of `_deployment_client`, `_queue_arn`, and `_queue_url` within the `QueueTrigger` class. Expect: At least one occurrence for each attribute.
rg --type python --pcre2 --context 5 $'class QueueTrigger.*\\n.*def .*\\n.*self\\._deployment_client.*\\n.*self\\._queue_arn.*\\n.*self\\._queue_url' sebs/aws/triggers.py

Length of output: 195


Script:

#!/bin/bash
# Description: Verify the correct initialization of `QueueTrigger` attributes.

# Test: Search for the usage of `_deployment_client`, `_queue_arn`, and `_queue_url` within the `QueueTrigger` class. Expect: At least one occurrence for each attribute.
rg --pcre2 --context 5 'self\._deployment_client|self\._queue_arn|self\._queue_url' sebs/aws/triggers.py

Length of output: 3693


244-255: Ensure correct initialization of StorageTrigger attributes.

The StorageTrigger constructor initializes several attributes. Ensure that the attributes _deployment_client and _bucket_name are correctly assigned and used.

docs/modularity.md (1)

Line range hint 1-1:
Verify the usage and documentation of the new parameter in some_function.

The function signature has been updated to include an additional parameter trigger. Ensure that the new parameter is used correctly within the function and that its usage is documented appropriately.

sebs.py (3)

227-230: Ensure the correct insertion of the trigger parameter into the experiment config.

The trigger parameter is inserted into the experiment config. Verify that this is done correctly and consistently.

Verification successful

Verified the insertion of the trigger parameter into the experiment config.

The trigger parameter is correctly and consistently inserted into the experiment config using update_nested_dict.

  • Location: sebs.py, lines 227-230
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the insertion of the `trigger` parameter into the experiment config.

# Test: Search for the insertion of the `trigger` parameter. Expect: Correct insertion of the parameter.
rg --type python $'update_nested_dict(config, ["experiments", "trigger"], trigger)'

Length of output: 115


Script:

#!/bin/bash
# Description: Verify the insertion of the `trigger` parameter into the experiment config.

# Find all Python files and search for the pattern within them.
fd --type f --extension py --exec rg 'update_nested_dict(config, \["experiments", "trigger"\], trigger)'

Length of output: 106


Script:

#!/bin/bash
# Description: Verify the insertion of the `trigger` parameter into the experiment config.

# Find all Python files and search for any usage of `update_nested_dict` involving "trigger" in the arguments.
fd --type f --extension py --exec rg 'update_nested_dict\(config, \[.*"trigger".*\], trigger\)'

Length of output: 165


246-251: Ensure the correct handling of the trigger parameter for GCP and Azure.

The trigger parameter is used to augment the function name for GCP and Azure. Verify that this is done correctly.

Verification successful

The trigger parameter is correctly handled for GCP and Azure.

The trigger parameter is used to augment the function name, ensuring the correct handling for both GCP and Azure as described in the code snippet.

  • sebs.py, lines 246-251: The trigger parameter is appended to the function name for GCP and Azure.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the handling of the `trigger` parameter for GCP and Azure.

# Test: Search for the handling of the `trigger` parameter for GCP and Azure. Expect: Correct handling of the parameter.
rg --type python $'if deployment_client.name() == "gcp" or deployment_client.name() == "azure":'

Length of output: 128


Script:

#!/bin/bash
# Description: Verify the handling of the `trigger` parameter for GCP and Azure.

# Test: Search for the handling of the `trigger` parameter for GCP and Azure.
rg $'if deployment_client.name() == "gcp" or deployment_client.name() == "azure":'

Length of output: 83


Script:

#!/bin/bash
# Description: Verify the handling of the `trigger` parameter for GCP and Azure.

# Test: Search for variations of the condition and the usage of the `trigger` parameter.
rg 'deployment_client.name\(\) == "gcp"'
rg 'deployment_client.name\(\) == "azure"'
rg '\btrigger\b'

Length of output: 13037


176-176: Verify the usage of the new parameter trigger in invoke.

The function signature has been updated to include an additional parameter trigger. Ensure that the new parameter is used correctly within the function.

Verification successful

Verify the usage of the new parameter trigger in invoke.

The trigger parameter is correctly used within the invoke function in sebs.py. The parameter is checked, inserted into the configuration, and used to modify the function name and create triggers appropriately.

  • Code locations:
    • trigger parameter initialization and default assignment: trigger = trigger if trigger is not None else "http"
    • Configuration update: update_nested_dict(config, ["experiments", "trigger"], trigger)
    • Function name augmentation: function_name = "{}-{}".format(function_name, trigger)
    • Trigger type retrieval and usage:
      • trigger_type = Trigger.TriggerType.get(trigger)
      • triggers = func.triggers(trigger_type)
      • trigger = deployment_client.create_trigger(func, trigger_type)
      • trigger = triggers[0]
      • ret = trigger.sync_invoke(input_config)
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of the new parameter `trigger` in `invoke`.

# Test: Search for the function definition and its usage. Expect: Correct usage of the new parameter.
rg --type python $'def invoke(benchmark, benchmark_input_size, repetitions, trigger, memory, timeout, function_name, image_tag_prefix, **kwargs)'

Length of output: 177


Script:

#!/bin/bash
# Description: Verify the usage of the new parameter `trigger` in `invoke`.

# Test: Search for the function definition and its usage. Expect: Correct usage of the new parameter.
rg --type py 'def invoke(benchmark, benchmark_input_size, repetitions, trigger, memory, timeout, function_name, image_tag_prefix, **kwargs)'

Length of output: 142


Script:

#!/bin/bash
# Description: Verify the usage of the new parameter `trigger` in `invoke`.

# Step 1: Search for the function definition `invoke`.
rg --type py 'def invoke'

# Step 2: Search for the usage of the parameter `trigger` within the `invoke` function.
rg --type py 'invoke' | xargs rg 'trigger'

Length of output: 836


Script:

#!/bin/bash
# Description: Verify the usage of the new parameter `trigger` in `invoke`.

# Step 1: Search for the function definition `invoke` in sebs.py.
rg --type py 'def invoke' sebs.py

# Step 2: Search for the usage of the parameter `trigger` within the `invoke` function in sebs.py.
rg --type py 'trigger' sebs.py

Length of output: 900

sebs/aws/aws.py (2)

269-281: Ensure correct handling of the new trigger types in cached_function.

The function cached_function has been updated to handle new trigger types (QueueTrigger and StorageTrigger). Verify that this is done correctly.

Verification successful

The new trigger types (QueueTrigger and StorageTrigger) are correctly handled in the cached_function. The function sets the logging_handlers and assigns the deployment_client for these trigger types, similar to the existing trigger types.

  • sebs/aws/aws.py:
    • cached_function handles Trigger.TriggerType.QUEUE and Trigger.TriggerType.STORAGE correctly.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the handling of new trigger types in `cached_function`.

# Test: Search for the handling of new trigger types. Expect: Correct handling of new trigger types.
rg --type python $'for trigger in function.triggers(Trigger.TriggerType.QUEUE):'

Length of output: 112


Script:

#!/bin/bash
# Description: Verify the handling of new trigger types in `cached_function`.

# Test: Search for the handling of new trigger types. Expect: Correct handling of new trigger types.
rg 'for trigger in function.triggers(Trigger.TriggerType.QUEUE):' --glob '*.py'
rg 'for trigger in function.triggers(Trigger.TriggerType.STORAGE):' --glob '*.py'

Length of output: 165


Script:

#!/bin/bash
# Description: Verify the handling of new trigger types in `cached_function`.

# Locate the `cached_function` definition.
rg 'def cached_function' --glob '*.py'

# Verify if `cached_function` handles `Trigger.TriggerType.QUEUE`.
rg 'Trigger.TriggerType.QUEUE' --glob '*.py'

# Verify if `cached_function` handles `Trigger.TriggerType.STORAGE`.
rg 'Trigger.TriggerType.STORAGE' --glob '*.py'

Length of output: 1830


Script:

#!/bin/bash
# Description: Extract the content of the `cached_function` from `sebs/aws/aws.py` to verify the handling of new trigger types.

# Extract the `cached_function` definition and its content.
rg -A 20 'def cached_function' sebs/aws/aws.py

Length of output: 1047


135-135: Verify the usage of the new parameter trigger in package_code.

The function signature has been updated to include an additional parameter trigger. Ensure that the new parameter is used correctly within the function.

sebs/benchmark.py (1)

508-508: Verify the integration of the trigger argument.

Ensure that the deployment_build_step function correctly handles the new trigger argument. This change might affect the behavior of the function, so thorough testing is recommended.

sebs/azure/azure.py (3)

160-204: Verify the correctness of the create_function_json function.

Ensure that the function correctly generates the JSON configuration for different trigger types. Pay attention to the entryPoint and bindings fields to ensure they match the expected structure for Azure functions.

Verification successful

Verify the correctness of the create_function_json function.

The create_function_json function generates JSON configurations for different trigger types (queue, storage, HTTP). The usage context shows it writes the configuration to a function.json file. Ensure the generated JSON structure matches Azure's expected format for each trigger type.

  • Queue Trigger: Verify that the bindings field contains a queueTrigger with the correct attributes.
  • Storage Trigger: Verify that the bindings field contains a blobTrigger with the correct attributes.
  • HTTP Trigger: Verify that the bindings field contains an httpTrigger and an http output with the correct attributes.
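
For reference, the bindings being checked follow Azure's documented function.json schema. The sketch below shows what a create_function_json-style helper might emit for each trigger type; the connection names, paths, and entry point are placeholder assumptions, and the exact structure SeBS generates may differ.

```python
import json


def function_json_for(trigger: str) -> dict:
    # Bindings follow the public function.json schema; connection strings and
    # queue/container names are placeholders.
    if trigger == "queue":
        bindings = [{
            "name": "msg",
            "type": "queueTrigger",
            "direction": "in",
            "queueName": "benchmark-queue",
            "connection": "AzureWebJobsStorage",
        }]
    elif trigger == "storage":
        bindings = [{
            "name": "blob",
            "type": "blobTrigger",
            "direction": "in",
            "path": "benchmark-container/{name}",
            "connection": "AzureWebJobsStorage",
        }]
    else:  # http
        bindings = [
            {"authLevel": "anonymous", "type": "httpTrigger",
             "direction": "in", "name": "req", "methods": ["post"]},
            {"type": "http", "direction": "out", "name": "$return"},
        ]
    return {"scriptFile": "handler.py", "entryPoint": "handler", "bindings": bindings}


print(json.dumps(function_json_for("queue"), indent=2))
```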
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correctness of the `create_function_json` function.

# Test: Search for the function usage and ensure it generates the correct JSON configuration.
rg --type python $'create_function_json'

Length of output: 70


Script:

#!/bin/bash
# Description: Verify the correctness of the `create_function_json` function.

# Test: Search for the function usage and ensure it generates the correct JSON configuration.
rg --glob '*.py' 'create_function_json'

Length of output: 223


Script:

#!/bin/bash
# Description: Extract the context in which `create_function_json` is used.

# Extract 10 lines before and after the usage of `create_function_json`.
rg --glob '*.py' -A 10 -B 10 'create_function_json' sebs/azure/azure.py

Length of output: 1495


Line range hint 221-258:
Verify the integration of the trigger argument in package_code.

Ensure that the trigger argument is correctly handled in the package_code function. This change might affect the behavior of the function, so thorough testing is recommended.

Verification successful

The trigger argument is correctly integrated in the package_code function.

The trigger argument is used appropriately in the func_name string formatting, and there are no other usages of the trigger argument in the function.

  • sebs/azure/azure.py, lines 238-261: The trigger argument is correctly handled in the package_code function.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of the `trigger` argument in `package_code`.

# Test: Search for the function definition and its usage. Expect: Proper handling of the `trigger` argument.
rg --type python $'def package_code'

Length of output: 68


Script:

#!/bin/bash
# Description: Verify the usage of the `trigger` argument in `package_code`.

# Test: Search for the function definition and its usage. Expect: Proper handling of the `trigger` argument.
rg 'def package_code'

Length of output: 397


Script:

#!/bin/bash
# Description: Extract the `package_code` function implementation from `sebs/azure/azure.py`.

# Extract the function implementation for review.
rg -A 30 'def package_code' sebs/azure/azure.py

Length of output: 1192


Script:

#!/bin/bash
# Description: Extract the complete implementation of the `package_code` function from `sebs/azure/azure.py`.

# Extract the function implementation for review.
rg -A 50 'def package_code' sebs/azure/azure.py

Length of output: 1831


Script:

#!/bin/bash
# Description: Extract the remaining implementation of the `package_code` function from `sebs/azure/azure.py`.

# Extract the function implementation for review.
rg -A 70 'def package_code' sebs/azure/azure.py

Length of output: 2502

Tools
Ruff

257-257: Use context handler for opening files

(SIM115)


577-613: Verify the correctness of the create_trigger function.

Ensure that the function correctly creates the specified trigger type and integrates it with the Azure function. Pay attention to the role assignment and trigger creation logic to ensure they are implemented correctly.

Verification successful

The create_trigger function appears to be correctly implemented and used.

The function correctly creates the specified trigger type and integrates it with the Azure function. The role assignment and trigger creation logic are appropriately handled.

  • sebs.py: The function is invoked to create a trigger when no existing triggers are found for the specified type.
  • sebs/regression.py: Similar usage pattern, ensuring triggers are created if not already present.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correctness of the `create_trigger` function.

# Test: Search for the function usage and ensure it creates the correct trigger type.
rg --type python $'create_trigger'

Length of output: 64


Script:

#!/bin/bash
# Description: Verify the correctness of the `create_trigger` function.

# Search for the function definition of `create_trigger` in all Python files.
rg 'def create_trigger' --glob '*.py'

# Search for the usage of the `create_trigger` function in all Python files.
rg 'create_trigger' --glob '*.py'

Length of output: 2532


Script:

#!/bin/bash
# Description: Verify the correctness of the `create_trigger` function in `sebs/azure/azure.py`.

# Extract the specific usages of the `create_trigger` function in `sebs/azure/azure.py`.
rg 'create_trigger' --glob 'sebs/azure/azure.py'

Length of output: 162


Script:

#!/bin/bash
# Description: Verify the correctness of the `create_trigger` function by analyzing its usage in other files.

# Extract the specific lines of code where `create_trigger` is invoked in other files.
rg 'create_trigger' --glob '*.py' -n

Length of output: 1772


Script:

#!/bin/bash
# Description: Verify the correctness of the `create_trigger` function by analyzing its usage in `sebs.py` and `sebs/regression.py`.

# Extract the specific lines of code around the invocation of `create_trigger` in `sebs.py`.
rg 'create_trigger' sebs.py -A 5 -B 5

# Extract the specific lines of code around the invocation of `create_trigger` in `sebs/regression.py`.
rg 'create_trigger' sebs/regression.py -A 5 -B 5

Length of output: 1117

sebs/gcp/gcp.py (3)

110-123: Verify the correctness of the get_trigger_resource_name function.

Ensure that the function correctly generates the fully qualified name of the trigger resource for GCP functions. Pay attention to the string formatting to ensure it matches the expected structure.

Verification successful

The get_trigger_resource_name function is correctly implemented.

The function is used appropriately within the codebase to generate the fully qualified names of GCP trigger resources, and the string formatting matches the expected structure.

  • sebs/gcp/gcp.py: Lines 110-123
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correctness of the `get_trigger_resource_name` function.

# Test: Search for the function usage and ensure it generates the correct resource name.
rg --type python $'get_trigger_resource_name'

Length of output: 75


Script:

#!/bin/bash
# Description: Verify the correctness of the `get_trigger_resource_name` function.

# Test: Search for the function usage and ensure it generates the correct resource name.
rg --glob '*.py' 'get_trigger_resource_name'

Length of output: 505


Script:

#!/bin/bash
# Description: Extract lines around the occurrences of `get_trigger_resource_name` to understand its usage.

# Extract 5 lines before and after each occurrence of `get_trigger_resource_name` to understand its context.
rg --glob '*.py' -C 5 'get_trigger_resource_name'

Length of output: 2747


143-193: Verify the correctness of the create_trigger_resource function.

Ensure that the function correctly creates the required resources for GCP functions. Pay attention to the error handling and resource creation logic to ensure they are implemented correctly.

Verification successful

The create_trigger_resource function is correctly implemented and used.

The function is appropriately used to ensure the existence of necessary resources before creating or updating GCP functions. The error handling and resource creation logic are correctly implemented and utilized.

  • sebs/gcp/gcp.py: Lines around 143 and 193
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correctness of the `create_trigger_resource` function.

# Test: Search for the function usage and ensure it creates the correct resources.
rg --type python $'create_trigger_resource'

Length of output: 73


Script:

#!/bin/bash
# Description: Verify the correctness of the `create_trigger_resource` function.

# Test: Search for the function usage and ensure it creates the correct resources.
rg 'create_trigger_resource'

Length of output: 300


Script:

#!/bin/bash
# Description: Extract surrounding lines of code where `create_trigger_resource` is called.

# Extract 10 lines before and after the function calls to understand the context.
rg 'create_trigger_resource' -A 10 -B 10

Length of output: 3607


310-313: Verify the integration of the trigger argument in create_function.

Ensure that the trigger argument is correctly handled in the create_function function. This change might affect the behavior of the function, so thorough testing is recommended.

sebs/gcp/triggers.py (outdated; resolved)
sebs/gcp/triggers.py (outdated; resolved)
@@ -55,6 +61,7 @@ def deserialize(config: dict) -> "Config":
cfg._download_results = config["download_results"]
cfg._runtime = Runtime.deserialize(config["runtime"])
cfg._flags = config["flags"] if "flags" in config else {}
cfg._trigger = config["trigger"] if "trigger" in config else {}

Simplify the conditional statement.

The conditional statement can be simplified using the get method.

  cfg._flags = config["flags"] if "flags" in config else {}
- cfg._trigger = config["trigger"] if "trigger" in config else {}
+ cfg._trigger = config.get("trigger", {})
Committable suggestion


Suggested change
cfg._trigger = config["trigger"] if "trigger" in config else {}
cfg._trigger = config.get("trigger", {})
Tools
Ruff

64-64: Use config.get("trigger", {}) instead of an if block

Replace with config.get("trigger", {})

(SIM401)

@@ -11,6 +11,7 @@ def __init__(self):
self._flags: Dict[str, bool] = {}
self._experiment_configs: Dict[str, dict] = {}
self._runtime = Runtime(None, None)
self._trigger: Trigger.TriggerType

Initialize the _trigger attribute in the constructor.

The _trigger attribute is added but not initialized in the constructor. This could lead to an AttributeError if accessed before being set.

  self._runtime = Runtime(None, None)
  self._trigger: Trigger.TriggerType
+ self._trigger = None
Committable suggestion


Suggested change
self._trigger: Trigger.TriggerType
self._runtime = Runtime(None, None)
self._trigger: Trigger.TriggerType
self._trigger = None

sebs/azure/triggers.py (outdated; resolved)
Comment on lines 209 to 224
def sync_invoke(self, payload: dict) -> ExecutionResult:

self.logging.debug(f"Invoke function {self.name}")

sqs_client = boto3.client(
'sqs', region_name=self.deployment_client.config.region)

# Publish payload to queue
serialized_payload = json.dumps(payload)
sqs_client.send_message(
QueueUrl=self.queue_url, MessageBody=serialized_payload)
self.logging.info(f"Sent message to queue {self.name}")

# TODO(oana): gather metrics

def async_invoke(self, payload: dict) -> concurrent.futures.Future:

Gather metrics in QueueTrigger.sync_invoke.

The sync_invoke method has a TODO comment to gather metrics. Ensure that metrics are gathered to monitor the performance and usage of the queue.

Do you want me to implement the metrics gathering logic or open a GitHub issue to track this task?
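
If helpful, a minimal sketch of client-side metric gathering around the publish call is shown below. It only captures the enqueue latency; collecting the function's own measurements would still require fetching results from storage or a result queue. The function name and return shape are illustrative, not the PR's API.

```python
import datetime
import json

import boto3


def queue_sync_invoke(queue_url: str, region: str, payload: dict) -> dict:
    sqs_client = boto3.client("sqs", region_name=region)
    begin = datetime.datetime.now()
    sqs_client.send_message(QueueUrl=queue_url, MessageBody=json.dumps(payload))
    end = datetime.datetime.now()
    # Client-side enqueue time only; end-to-end metrics need the benchmark's
    # own measurements, e.g. read back from a result queue or bucket.
    return {"client_time_us": int((end - begin).total_seconds() * 1_000_000)}
```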

sebs/aws/triggers.py (outdated; resolved)
Comment on lines 153 to 181
# When creating the trigger for the first time, also create and store
# queue information.
if (not self.queue_arn and not self.queue_url):
# Init clients
lambda_client = self.deployment_client.get_lambda_client()
sqs_client = boto3.client(
'sqs', region_name=self.deployment_client.config.region)

# Create queue
self.logging.debug(f"Creating queue {self.name}")

self._queue_url = sqs_client.create_queue(QueueName=self.name)["QueueUrl"]
self._queue_arn = sqs_client.get_queue_attributes(
QueueUrl=self.queue_url,
AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

self.logging.debug("Created queue")

# Add queue trigger
if (not len(lambda_client.list_event_source_mappings(EventSourceArn=self.queue_arn,
FunctionName=self.name)
["EventSourceMappings"])):
lambda_client.create_event_source_mapping(
EventSourceArn=self.queue_arn,
FunctionName=self.name,
MaximumBatchingWindowInSeconds=1
)


Encapsulate queue creation logic in a separate method.

The queue creation logic within the QueueTrigger constructor can be encapsulated in a separate method to improve readability and maintainability.

+  def create_queue(self):
+      lambda_client = self.deployment_client.get_lambda_client()
+      sqs_client = boto3.client('sqs', region_name=self.deployment_client.config.region)
+      self.logging.debug(f"Creating queue {self.name}")
+      self._queue_url = sqs_client.create_queue(QueueName=self.name)["QueueUrl"]
+      self._queue_arn = sqs_client.get_queue_attributes(
+          QueueUrl=self.queue_url,
+          AttributeNames=["QueueArn"]
+      )["Attributes"]["QueueArn"]
+      self.logging.debug("Created queue")
+      if not len(lambda_client.list_event_source_mappings(EventSourceArn=self.queue_arn, FunctionName=self.name)["EventSourceMappings"]):
+          lambda_client.create_event_source_mapping(
+              EventSourceArn=self.queue_arn,
+              FunctionName=self.name,
+              MaximumBatchingWindowInSeconds=1
+          )

  def __init__(self, fname: str, deployment_client: Optional[AWS] = None, queue_arn: Optional[str] = None, queue_url: Optional[str] = None):
      super().__init__()
      self.name = fname
      self._deployment_client = None
      self._queue_arn = None
      self._queue_url = None
      if deployment_client:
          self._deployment_client = deployment_client
      if queue_arn:
          self._queue_arn = queue_arn
      if queue_url:
          self._queue_url = queue_url
      if not self.queue_arn and not self.queue_url:
          self.create_queue()
Committable suggestion


Suggested change
# When creating the trigger for the first time, also create and store
# queue information.
if (not self.queue_arn and not self.queue_url):
# Init clients
lambda_client = self.deployment_client.get_lambda_client()
sqs_client = boto3.client(
'sqs', region_name=self.deployment_client.config.region)
# Create queue
self.logging.debug(f"Creating queue {self.name}")
self._queue_url = sqs_client.create_queue(QueueName=self.name)["QueueUrl"]
self._queue_arn = sqs_client.get_queue_attributes(
QueueUrl=self.queue_url,
AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]
self.logging.debug("Created queue")
# Add queue trigger
if (not len(lambda_client.list_event_source_mappings(EventSourceArn=self.queue_arn,
FunctionName=self.name)
["EventSourceMappings"])):
lambda_client.create_event_source_mapping(
EventSourceArn=self.queue_arn,
FunctionName=self.name,
MaximumBatchingWindowInSeconds=1
)
# When creating the trigger for the first time, also create and store
# queue information.
if (not self.queue_arn and not self.queue_url):
self.create_queue()
def create_queue(self):
# Init clients
lambda_client = self.deployment_client.get_lambda_client()
sqs_client = boto3.client(
'sqs', region_name=self.deployment_client.config.region)
# Create queue
self.logging.debug(f"Creating queue {self.name}")
self._queue_url = sqs_client.create_queue(QueueName=self.name)["QueueUrl"]
self._queue_arn = sqs_client.get_queue_attributes(
QueueUrl=self.queue_url,
AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]
self.logging.debug("Created queue")
# Add queue trigger
if (not len(lambda_client.list_event_source_mappings(EventSourceArn=self.queue_arn,
FunctionName=self.name)
["EventSourceMappings"])):
lambda_client.create_event_source_mapping(
EventSourceArn=self.queue_arn,
FunctionName=self.name,
MaximumBatchingWindowInSeconds=1
)

Comment on lines 256 to 304
# When creating the trigger for the first time, also create and store
# storage bucket information.
if (not self.bucket_name):
# Init clients
s3 = boto3.resource('s3')
lambda_client = self.deployment_client.get_lambda_client()

# AWS disallows underscores in bucket names
self._bucket_name = self.name.replace('_', '-')
function_arn = lambda_client.get_function(FunctionName=self.name)[
"Configuration"]["FunctionArn"]

# Create bucket
self.logging.info(f"Creating bucket {self.bucket_name}")

region = self.deployment_client.config.region
if (region == "us-east-1"):
s3.create_bucket(Bucket=self.bucket_name)
else:
s3.create_bucket(
Bucket=self.bucket_name,
CreateBucketConfiguration={
"LocationConstraint": region
}
)

self.logging.info("Created bucket")

lambda_client.add_permission(
FunctionName=self.name,
StatementId=str(uuid.uuid1()),
Action="lambda:InvokeFunction",
Principal="s3.amazonaws.com",
SourceArn=f"arn:aws:s3:::{self.bucket_name}",
)

# Add bucket trigger
bucket_notification = s3.BucketNotification(self.bucket_name)
bucket_notification.put(
NotificationConfiguration={'LambdaFunctionConfigurations': [
{
'LambdaFunctionArn': function_arn,
'Events': [
's3:ObjectCreated:*'
],

},
]})


Encapsulate bucket creation logic in a separate method.

The bucket creation logic within the StorageTrigger constructor can be encapsulated in a separate method to improve readability and maintainability.

+  def create_bucket(self):
+      s3 = boto3.resource('s3')
+      lambda_client = self.deployment_client.get_lambda_client()
+      self._bucket_name = self.name.replace('_', '-')
+      function_arn = lambda_client.get_function(FunctionName=self.name)["Configuration"]["FunctionArn"]
+      self.logging.info(f"Creating bucket {self._bucket_name}")
+      region = self.deployment_client.config.region
+      if region == "us-east-1":
+          s3.create_bucket(Bucket=self._bucket_name)
+      else:
+          s3.create_bucket(
+              Bucket=self._bucket_name,
+              CreateBucketConfiguration={"LocationConstraint": region}
+          )
+      self.logging.info("Created bucket")
+      lambda_client.add_permission(
+          FunctionName=self.name,
+          StatementId=str(uuid.uuid1()),
+          Action="lambda:InvokeFunction",
+          Principal="s3.amazonaws.com",
+          SourceArn=f"arn:aws:s3:::{self._bucket_name}",
+      )
+      bucket_notification = s3.BucketNotification(self._bucket_name)
+      bucket_notification.put(
+          NotificationConfiguration={'LambdaFunctionConfigurations': [
+              {
+                  'LambdaFunctionArn': function_arn,
+                  'Events': ['s3:ObjectCreated:*']
+              },
+          ]}
+      )

  def __init__(self, fname: str, deployment_client: Optional[AWS] = None, bucket_name: Optional[str] = None):
      super().__init__()
      self.name = fname
      self._deployment_client = None
      self._bucket_name = None
      if deployment_client:
          self._deployment_client = deployment_client
      if bucket_name:
          self._bucket_name = bucket_name
      if not self.bucket_name:
          self.create_bucket()
Committable suggestion


Suggested change
# When creating the trigger for the first time, also create and store
# storage bucket information.
if (not self.bucket_name):
# Init clients
s3 = boto3.resource('s3')
lambda_client = self.deployment_client.get_lambda_client()
# AWS disallows underscores in bucket names
self._bucket_name = self.name.replace('_', '-')
function_arn = lambda_client.get_function(FunctionName=self.name)[
"Configuration"]["FunctionArn"]
# Create bucket
self.logging.info(f"Creating bucket {self.bucket_name}")
region = self.deployment_client.config.region
if (region == "us-east-1"):
s3.create_bucket(Bucket=self.bucket_name)
else:
s3.create_bucket(
Bucket=self.bucket_name,
CreateBucketConfiguration={
"LocationConstraint": region
}
)
self.logging.info("Created bucket")
lambda_client.add_permission(
FunctionName=self.name,
StatementId=str(uuid.uuid1()),
Action="lambda:InvokeFunction",
Principal="s3.amazonaws.com",
SourceArn=f"arn:aws:s3:::{self.bucket_name}",
)
# Add bucket trigger
bucket_notification = s3.BucketNotification(self.bucket_name)
bucket_notification.put(
NotificationConfiguration={'LambdaFunctionConfigurations': [
{
'LambdaFunctionArn': function_arn,
'Events': [
's3:ObjectCreated:*'
],
},
]})
def create_bucket(self):
s3 = boto3.resource('s3')
lambda_client = self.deployment_client.get_lambda_client()
self._bucket_name = self.name.replace('_', '-')
function_arn = lambda_client.get_function(FunctionName=self.name)["Configuration"]["FunctionArn"]
self.logging.info(f"Creating bucket {self._bucket_name}")
region = self.deployment_client.config.region
if region == "us-east-1":
s3.create_bucket(Bucket=self._bucket_name)
else:
s3.create_bucket(
Bucket=self._bucket_name,
CreateBucketConfiguration={"LocationConstraint": region}
)
self.logging.info("Created bucket")
lambda_client.add_permission(
FunctionName=self.name,
StatementId=str(uuid.uuid1()),
Action="lambda:InvokeFunction",
Principal="s3.amazonaws.com",
SourceArn=f"arn:aws:s3:::{self._bucket_name}",
)
bucket_notification = s3.BucketNotification(self._bucket_name)
bucket_notification.put(
NotificationConfiguration={'LambdaFunctionConfigurations': [
{
'LambdaFunctionArn': function_arn,
'Events': ['s3:ObjectCreated:*']
},
]}
)
# When creating the trigger for the first time, also create and store
# storage bucket information.
if (not self.bucket_name):
self.create_bucket()

sebs/openwhisk/openwhisk.py Show resolved Hide resolved
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 10

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between ba67b4a and be4e4f9.

Files selected for processing (9)
  • sebs/aws/aws.py (4 hunks)
  • sebs/aws/function.py (2 hunks)
  • sebs/aws/triggers.py (2 hunks)
  • sebs/azure/azure.py (7 hunks)
  • sebs/azure/function.py (4 hunks)
  • sebs/azure/triggers.py (2 hunks)
  • sebs/benchmark.py (3 hunks)
  • sebs/gcp/function.py (2 hunks)
  • sebs/gcp/triggers.py (2 hunks)
Files skipped from review as they are similar to previous changes (3)
  • sebs/aws/aws.py
  • sebs/azure/triggers.py
  • sebs/benchmark.py
Additional context used
Ruff
sebs/gcp/triggers.py

5-5: os imported but unused

Remove unused import: os

(F401)


8-8: googleapiclient.errors.HttpError imported but unused

Remove unused import: googleapiclient.errors.HttpError

(F401)

sebs/azure/azure.py

258-258: Use context handler for opening files

(SIM115)

Additional comments not posted (16)
sebs/azure/function.py (2)

19-21: LGTM!

The typename method is straightforward and correctly implemented.


Line range hint 31-48: Ensure the imports are necessary and correct.

The deserialize method correctly imports and casts new trigger types. The logic for handling unknown trigger types is also appropriate.

However, verify that all imports are necessary and used.

Verification successful

The imports are necessary and correctly used.

The imports from sebs.faas.function import Trigger and from sebs.azure.triggers import HTTPTrigger, QueueTrigger, StorageTrigger are utilized in multiple files, including sebs/azure/function.py. Therefore, they are essential for the functionality implemented in the deserialize method.

  • sebs/azure/function.py
  • tests/aws/invoke_function_http.py
  • sebs/regression.py
  • sebs/benchmark.py
  • sebs/gcp/function.py
  • sebs/gcp/gcp.py
  • sebs/openwhisk/function.py
  • sebs/experiments/perf_cost.py
  • sebs/experiments/network_ping_pong.py
  • sebs/experiments/invocation_overhead.py
  • sebs/aws/function.py
  • sebs/azure/azure.py
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of imports in the `deserialize` method.

# Test: Search for unused import `from sebs.faas.function import Trigger`.
rg --type python $'from sebs.faas.function import Trigger'

# Test: Search for unused import `from sebs.azure.triggers import HTTPTrigger, QueueTrigger, StorageTrigger`.
rg --type python $'from sebs.azure.triggers import HTTPTrigger, QueueTrigger, StorageTrigger'

Length of output: 216


Script:

#!/bin/bash
# Description: Verify the usage of imports in the `deserialize` method.

# Test: Search for usage of import `from sebs.faas.function import Trigger`.
rg 'from sebs.faas.function import Trigger' -g '*.py' -A 5

# Test: Search for usage of import `from sebs.azure.triggers import HTTPTrigger, QueueTrigger, StorageTrigger`.
rg 'from sebs.azure.triggers import HTTPTrigger, QueueTrigger, StorageTrigger' -g '*.py' -A 5

Length of output: 5976

sebs/gcp/function.py (1)

Line range hint 33-51: Ensure the imports are necessary and correct.

The deserialize method correctly imports and casts new trigger types. The logic for handling unknown trigger types is also appropriate.

However, verify that all imports are necessary and used.

Verification successful

Verification successful: The imports from sebs.faas.function import Trigger and from sebs.gcp.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger are necessary and correctly used in the deserialize method.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of imports in the `deserialize` method.

# Test: Search for unused import `from sebs.faas.function import Trigger`.
rg --type python $'from sebs.faas.function import Trigger'

# Test: Search for unused import `from sebs.gcp.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger`.
rg --type python $'from sebs.gcp.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger'

Length of output: 230


Script:

#!/bin/bash
# Description: Verify the usage of imports in the `deserialize` method.

# Test: Search for the usage of `from sebs.faas.function import Trigger`.
rg 'from sebs.faas.function import Trigger' -A 5

# Test: Search for the usage of `from sebs.gcp.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger`.
rg 'from sebs.gcp.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger' -A 5

Length of output: 5730

sebs/aws/function.py (1)

Line range hint 42-63: Ensure the imports are necessary and correct.

The deserialize method correctly imports and casts new trigger types. The logic for handling unknown trigger types is also appropriate.

However, verify that all imports are necessary and used.

Verification successful

Imports are necessary and correctly used.

The imports from sebs.faas.function import Trigger and from sebs.aws.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger are necessary and correctly used in the deserialize method.

  • sebs/aws/function.py: Usage of from sebs.aws.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger confirmed.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of imports in the `deserialize` method.

# Test: Search for unused import `from sebs.faas.function import Trigger`.
rg --type python $'from sebs.faas.function import Trigger'

# Test: Search for unused import `from sebs.aws.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger`.
rg --type python $'from sebs.aws.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger'

Length of output: 230


Script:

#!/bin/bash
# Description: Verify the usage of imports in the `deserialize` method.

# Test: Search for usage of the import `from sebs.faas.function import Trigger`.
rg -t py 'from sebs.faas.function import Trigger'

# Test: Search for usage of the import `from sebs.aws.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger`.
rg -t py 'from sebs.aws.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger'

Length of output: 1170

sebs/aws/triggers.py (11)

183-184: LGTM!

The method typename is straightforward and correct.


186-189: LGTM!

The property queue_arn is straightforward and correct.


191-194: LGTM!

The property queue_url is straightforward and correct.


197-199: LGTM!

The property deployment_client is straightforward and correct.


201-203: LGTM!

The setter for deployment_client is straightforward and correct.


205-207: LGTM!

The method trigger_type is straightforward and correct.


305-306: LGTM!

The method typename is straightforward and correct.


309-311: LGTM!

The property bucket_name is straightforward and correct.


314-316: LGTM!

The property deployment_client is straightforward and correct.


318-320: LGTM!

The setter for deployment_client is straightforward and correct.


322-324: LGTM!

The method trigger_type is straightforward and correct.

sebs/azure/azure.py (1)

39-41: LGTM!

The method typename is straightforward and correct.

sebs/aws/triggers.py Outdated Show resolved Hide resolved
sebs/aws/triggers.py Outdated Show resolved Hide resolved
sebs/aws/triggers.py Outdated Show resolved Hide resolved
sebs/aws/triggers.py Outdated Show resolved Hide resolved
Comment on lines 241 to 302
class StorageTrigger(Trigger):
    def __init__(
        self, fname: str, deployment_client: Optional[AWS] = None, bucket_name: Optional[str] = None
    ):
        super().__init__()
        self.name = fname

        self._deployment_client = None
        self._bucket_name = None

        if deployment_client:
            self._deployment_client = deployment_client
        if bucket_name:
            self._bucket_name = bucket_name

        # When creating the trigger for the first time, also create and store
        # storage bucket information.
        if not self.bucket_name:
            # Init clients
            s3 = boto3.resource("s3")
            lambda_client = self.deployment_client.get_lambda_client()

            # AWS disallows underscores in bucket names
            self._bucket_name = self.name.replace("_", "-")
            function_arn = lambda_client.get_function(FunctionName=self.name)["Configuration"][
                "FunctionArn"
            ]

            # Create bucket
            self.logging.info(f"Creating bucket {self.bucket_name}")

            region = self.deployment_client.config.region
            if region == "us-east-1":
                s3.create_bucket(Bucket=self.bucket_name)
            else:
                s3.create_bucket(
                    Bucket=self.bucket_name,
                    CreateBucketConfiguration={"LocationConstraint": region},
                )

            self.logging.info("Created bucket")

            lambda_client.add_permission(
                FunctionName=self.name,
                StatementId=str(uuid.uuid1()),
                Action="lambda:InvokeFunction",
                Principal="s3.amazonaws.com",
                SourceArn=f"arn:aws:s3:::{self.bucket_name}",
            )

            # Add bucket trigger
            bucket_notification = s3.BucketNotification(self.bucket_name)
            bucket_notification.put(
                NotificationConfiguration={
                    "LambdaFunctionConfigurations": [
                        {
                            "LambdaFunctionArn": function_arn,
                            "Events": ["s3:ObjectCreated:*"],
                        },
                    ]
                }
            )

Encapsulate bucket creation logic in a separate method.

The bucket creation logic within the StorageTrigger constructor can be encapsulated in a separate method to improve readability and maintainability.

+  def create_bucket(self):
+      s3 = boto3.resource('s3')
+      lambda_client = self.deployment_client.get_lambda_client()
+      self._bucket_name = self.name.replace('_', '-')
+      function_arn = lambda_client.get_function(FunctionName=self.name)["Configuration"]["FunctionArn"]
+      self.logging.info(f"Creating bucket {self._bucket_name}")
+      region = self.deployment_client.config.region
+      if region == "us-east-1":
+          s3.create_bucket(Bucket=self._bucket_name)
+      else:
+          s3.create_bucket(
+              Bucket=self._bucket_name,
+              CreateBucketConfiguration={"LocationConstraint": region}
+          )
+      self.logging.info("Created bucket")
+      lambda_client.add_permission(
+          FunctionName=self.name,
+          StatementId=str(uuid.uuid1()),
+          Action="lambda:InvokeFunction",
+          Principal="s3.amazonaws.com",
+          SourceArn=f"arn:aws:s3:::{self._bucket_name}",
+      )
+      bucket_notification = s3.BucketNotification(self._bucket_name)
+      bucket_notification.put(
+          NotificationConfiguration={'LambdaFunctionConfigurations': [
+              {
+                  'LambdaFunctionArn': function_arn,
+                  'Events': ['s3:ObjectCreated:*']
+              },
+          ]}
+      )

  def __init__(self, fname: str, deployment_client: Optional[AWS] = None, bucket_name: Optional[str] = None):
      super().__init__()
      self.name = fname
      self._deployment_client = None
      self._bucket_name = None
      if deployment_client:
          self._deployment_client = deployment_client
      if bucket_name:
          self._bucket_name = bucket_name
      if not self.bucket_name:
          self.create_bucket()

Comment on lines 131 to 181
class QueueTrigger(Trigger):
    def __init__(
        self,
        fname: str,
        deployment_client: Optional[AWS] = None,
        queue_arn: Optional[str] = None,
        queue_url: Optional[str] = None,
    ):
        super().__init__()
        self.name = fname

        self._deployment_client = None
        self._queue_arn = None
        self._queue_url = None

        if deployment_client:
            self._deployment_client = deployment_client
        if queue_arn:
            self._queue_arn = queue_arn
        if queue_url:
            self._queue_url = queue_url

        # When creating the trigger for the first time, also create and store
        # queue information.
        if not self.queue_arn and not self.queue_url:
            # Init clients
            lambda_client = self.deployment_client.get_lambda_client()
            sqs_client = boto3.client("sqs", region_name=self.deployment_client.config.region)

            # Create queue
            self.logging.debug(f"Creating queue {self.name}")

            self._queue_url = sqs_client.create_queue(QueueName=self.name)["QueueUrl"]
            self._queue_arn = sqs_client.get_queue_attributes(
                QueueUrl=self.queue_url, AttributeNames=["QueueArn"]
            )["Attributes"]["QueueArn"]

            self.logging.debug("Created queue")

            # Add queue trigger
            if not len(
                lambda_client.list_event_source_mappings(
                    EventSourceArn=self.queue_arn, FunctionName=self.name
                )["EventSourceMappings"]
            ):
                lambda_client.create_event_source_mapping(
                    EventSourceArn=self.queue_arn,
                    FunctionName=self.name,
                    MaximumBatchingWindowInSeconds=1,
                )


Encapsulate queue creation logic in a separate method.

The queue creation logic within the QueueTrigger constructor can be encapsulated in a separate method to improve readability and maintainability.

+  def create_queue(self):
+      lambda_client = self.deployment_client.get_lambda_client()
+      sqs_client = boto3.client('sqs', region_name=self.deployment_client.config.region)
+      self.logging.debug(f"Creating queue {self.name}")
+      self._queue_url = sqs_client.create_queue(QueueName=self.name)["QueueUrl"]
+      self._queue_arn = sqs_client.get_queue_attributes(
+          QueueUrl=self.queue_url,
+          AttributeNames=["QueueArn"]
+      )["Attributes"]["QueueArn"]
+      self.logging.debug("Created queue")
+      if not len(lambda_client.list_event_source_mappings(EventSourceArn=self.queue_arn, FunctionName=self.name)["EventSourceMappings"]):
+          lambda_client.create_event_source_mapping(
+              EventSourceArn=self.queue_arn,
+              FunctionName=self.name,
+              MaximumBatchingWindowInSeconds=1
+          )

  def __init__(self, fname: str, deployment_client: Optional[AWS] = None, queue_arn: Optional[str] = None, queue_url: Optional[str] = None):
      super().__init__()
      self.name = fname
      self._deployment_client = None
      self._queue_arn = None
      self._queue_url = None
      if deployment_client:
          self._deployment_client = deployment_client
      if queue_arn:
          self._queue_arn = queue_arn
      if queue_url:
          self._queue_url = queue_url
      if not self.queue_arn and not self.queue_url:
          self.create_queue()

def handler(event, context):

    income_timestamp = datetime.datetime.now().timestamp()

    # Queue trigger
    if ("Records" in event and event["Records"][0]["eventSource"] == 'aws:sqs'):
Collaborator

One question here: are we certain we always receive a single event? Do we need to add a loop here?
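For reference, a minimal sketch of such a loop, assuming the rest of the handler stays as is; process_record is a placeholder for the existing invocation-and-measurement path, not the current wrapper code:

import json

def process_record(payload):
    # Placeholder for the wrapper's existing invocation + measurement logic.
    return payload

def handler(event, context):
    # SQS may deliver a batch of records, so iterate instead of assuming one.
    if "Records" in event and event["Records"][0].get("eventSource") == "aws:sqs":
        return [process_record(json.loads(record["body"])) for record in event["Records"]]
    # ... existing HTTP / S3 handling continues here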

from . import function
ret = function.handler(payload)

# TODO(oana)
Collaborator

are we missing something here?

Collaborator

It looks like this is missing the core part of making measurements and returning values. Maybe we can just do a bit of restructuring to have a single invocation + measurement code, called from three trigger interfaces?
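One possible shape for that restructuring, sketched here with illustrative names (run_benchmark stands in for calling the benchmark's own handler; this is not the current wrapper code):

import datetime
import json

def run_benchmark(payload):
    # Stand-in for `from . import function; function.handler(payload)`.
    return payload

def measure(payload):
    # Single invocation + measurement path shared by all trigger entry points.
    begin = datetime.datetime.now()
    result = run_benchmark(payload)
    end = datetime.datetime.now()
    return {"begin": begin.timestamp(), "end": end.timestamp(), "result": result}

def handler_http(req_json):
    return measure(req_json)

def handler_queue(msg_body: str):
    return measure(json.loads(msg_body))

def handler_storage(blob_bytes: bytes):
    return measure(json.loads(blob_bytes))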

Collaborator Author

Just pushed the measurement infrastructure - all invocations are end-to-end complete.

benchmarks/wrappers/azure/python/handler.py Outdated Show resolved Hide resolved
sebs/aws/triggers.py Outdated Show resolved Hide resolved
sebs/aws/triggers.py Outdated Show resolved Hide resolved
sebs/azure/azure.py Outdated Show resolved Hide resolved
sebs/gcp/triggers.py Outdated Show resolved Hide resolved
benchmarks/wrappers/gcp/python/handler.py Outdated Show resolved Hide resolved
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between be4e4f9 and 3e52f3a.

Files selected for processing (2)
  • sebs/aws/triggers.py (2 hunks)
  • sebs/gcp/gcp.py (8 hunks)
Files skipped from review as they are similar to previous changes (2)
  • sebs/aws/triggers.py
  • sebs/gcp/gcp.py

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 8

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 3e52f3a and a466aab.

Files selected for processing (4)
  • sebs/azure/azure.py (7 hunks)
  • sebs/azure/triggers.py (2 hunks)
  • sebs/gcp/gcp.py (8 hunks)
  • sebs/gcp/triggers.py (2 hunks)
Files skipped from review as they are similar to previous changes (2)
  • sebs/azure/triggers.py
  • sebs/gcp/gcp.py
Additional context used
Ruff
sebs/gcp/triggers.py

5-5: os imported but unused

Remove unused import: os

(F401)


8-8: googleapiclient.errors.HttpError imported but unused

Remove unused import: googleapiclient.errors.HttpError

(F401)

sebs/azure/azure.py

258-258: Use context handler for opening files

(SIM115)

sebs/gcp/triggers.py Outdated Show resolved Hide resolved
sebs/gcp/triggers.py Outdated Show resolved Hide resolved
self.logging.info(f"Invoke function {self.name}")

# Init clients
client = gcp_storage.Client()

Initialize client once.

The client initialization should be done once and reused to avoid unnecessary overhead.

-        client = gcp_storage.Client()
+        if not hasattr(self, '_gcp_storage_client'):
+            self._gcp_storage_client = gcp_storage.Client()
+        client = self._gcp_storage_client

sebs/gcp/triggers.py Show resolved Hide resolved
sebs/azure/azure.py Outdated Show resolved Hide resolved
Comment on lines 581 to 620
from sebs.azure.triggers import QueueTrigger, StorageTrigger

azure_function = cast(AzureFunction, function)
resource_group = self.config.resources.resource_group(self.cli_instance)
storage_account = azure_function.function_storage.account_name

user_principal_name = self.cli_instance.execute("az ad user list")

storage_account_scope = self.cli_instance.execute(
    ("az storage account show --resource-group {} --name {} --query id").format(
        resource_group, storage_account
    )
)

self.cli_instance.execute(
    (
        'az role assignment create --assignee "{}" \
      --role "Storage {} Data Contributor" \
      --scope {}'
    ).format(
        json.loads(user_principal_name.decode("utf-8"))[0]["userPrincipalName"],
        "Queue" if trigger_type == Trigger.TriggerType.QUEUE else "Blob",
        storage_account_scope.decode("utf-8"),
    )
)

trigger: Trigger
if trigger_type == Trigger.TriggerType.QUEUE:
    trigger = QueueTrigger(function.name, storage_account)
    self.logging.info(f"Created Queue trigger for {function.name} function")
elif trigger_type == Trigger.TriggerType.STORAGE:
    trigger = StorageTrigger(function.name, storage_account)
    self.logging.info(f"Created Storage trigger for {function.name} function")
else:
    raise RuntimeError("Not supported!")

trigger.logging_handlers = self.logging_handlers
function.add_trigger(trigger)
self.cache_client.update_function(function)
return trigger

Add logging handlers to each trigger.

Each trigger should have logging handlers assigned for better traceability.

def create_trigger(self, function: Function, trigger_type: Trigger.TriggerType) -> Trigger:
    from sebs.azure.triggers import QueueTrigger, StorageTrigger

    azure_function = cast(AzureFunction, function)
    resource_group = self.config.resources.resource_group(self.cli_instance)
    storage_account = azure_function.function_storage.account_name

    user_principal_name = self.cli_instance.execute("az ad user list")

    storage_account_scope = self.cli_instance.execute(
        ("az storage account show --resource-group {} --name {} --query id").format(
            resource_group, storage_account
        )
    )

    self.cli_instance.execute(
        (
            'az role assignment create --assignee "{}" \
          --role "Storage {} Data Contributor" \
          --scope {}'
        ).format(
            json.loads(user_principal_name.decode("utf-8"))[0]["userPrincipalName"],
            "Queue" if trigger_type == Trigger.TriggerType.QUEUE else "Blob",
            storage_account_scope.decode("utf-8"),
        )
    )

    trigger: Trigger
    if trigger_type == Trigger.TriggerType.QUEUE:
        trigger = QueueTrigger(function.name, storage_account)
        self.logging.info(f"Created Queue trigger for {function.name} function")
    elif trigger_type == Trigger.TriggerType.STORAGE:
        trigger = StorageTrigger(function.name, storage_account)
        self.logging.info(f"Created Storage trigger for {function.name} function")
    else:
        raise RuntimeError("Not supported!")

    trigger.logging_handlers = self.logging_handlers
    function.add_trigger(trigger)
    self.cache_client.update_function(function)
    return trigger

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 16

Outside diff range, codebase verification and nitpick comments (6)
benchmarks/wrappers/aws/python/handler.py (1)

Line range hint 10-105: Review the updated event handling logic.

The modifications to handle different AWS event sources (SQS and S3) are well-implemented. The logic to check the event source and process it accordingly is clear and aligns with the PR objectives. However, there are a few areas that could be improved for clarity and robustness.

  1. Clarification on event handling: The previous comment by mcopik on line 15 raises a valid point about handling multiple events. This should be addressed to ensure the function can handle multiple records if they exist.
  2. Error handling: There is no error handling around the JSON parsing and other operations which might fail. Adding try-except blocks would improve the robustness.
  3. Code duplication: The logic to handle the HTTP trigger seems to be repeated and could be refactored into a separate function for better maintainability.

Consider these improvements:

     if ("Records" in event and event["Records"][0]["eventSource"] == 'aws:sqs'):
+        try:
             event = json.loads(event["Records"][0]["body"])
+        except json.JSONDecodeError:
+            logging.error("Invalid JSON format in SQS message")
+            return {"statusCode": 400, "body": "Invalid JSON format in SQS message"}

     if ("Records" in event and "s3" in event["Records"][0]):
+        try:
             bucket_name = event["Records"][0]["s3"]["bucket"]["name"]
             file_name = event["Records"][0]["s3"]["object"]["key"]
             obj = storage_inst.get_object(bucket_name, file_name)
             event = json.loads(obj['Body'].read())
+        except Exception as e:
+            logging.error(f"Error processing S3 object: {e}")
+            return {"statusCode": 500, "body": "Error processing S3 object"}
benchmarks/wrappers/azure/python/handler.py (1)

Line range hint 11-122: Review the updated event handling logic for Azure functions.

The modifications to handle different Azure event sources (HTTP, Queue, and Blob Storage) are well-implemented. The logic to check the event source and process it accordingly is clear and aligns with the PR objectives. However, there are a few areas that could be improved for clarity and robustness.

  1. Clarification on event handling: The previous comments suggest that there might be missing core parts of making measurements and returning values. This should be addressed to ensure the function can handle multiple records if they exist.
  2. Error handling: There is no error handling around the JSON parsing and other operations which might fail. Adding try-except blocks would improve the robustness.
  3. Code duplication: The logic to handle the HTTP trigger seems to be repeated and could be refactored into a separate function for better maintainability.

Consider these improvements:

     if 'connection_string' in req_json:
+        try:
             os.environ['STORAGE_CONNECTION_STRING'] = req_json['connection_string']
+        except Exception as e:
+            logging.error(f"Failed to set connection string: {e}")
+            return func.HttpResponse(status_code=500, body="Failed to set connection string")

     payload = msg.get_json()
+    try:
         payload['request-id'] = context.invocation_id
         payload['income-timestamp'] = income_timestamp
         stats = measure(payload)
         queue_client.send_message(stats)
+    except Exception as e:
+        logging.error(f"Error processing queue message: {e}")
+        return func.HttpResponse(status_code=500, body="Error processing queue message")
Tools
Ruff

2-2: base64 imported but unused

Remove unused import: base64

(F401)


5-5: azure.identity.ManagedIdentityCredential imported but unused

Remove unused import: azure.identity.ManagedIdentityCredential

(F401)


6-6: azure.storage.queue.QueueClient imported but unused

Remove unused import: azure.storage.queue.QueueClient

(F401)

benchmarks/wrappers/gcp/python/handler.py (1)

Line range hint 68-123: Refactor suggested for measure function.

The measure function is crucial as it is used by all handlers to gather measurements. However, its implementation is complex and could benefit from refactoring to improve clarity and maintainability.

Consider breaking down the function into smaller, more manageable functions that handle specific parts of the measurement process. This will make the code easier to maintain and test.
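A rough illustration of that decomposition; the helper names and the /tmp sentinel for cold-start detection are assumptions, not the wrapper's current implementation:

import datetime
import os

def run_with_timing(payload):
    begin = datetime.datetime.now()
    result = payload  # stand-in for invoking the benchmark function
    end = datetime.datetime.now()
    return result, begin, end

def detect_cold_start():
    # One common pattern: a sentinel file in /tmp survives warm invocations.
    sentinel = "/tmp/cold_run"
    is_cold = not os.path.exists(sentinel)
    if is_cold:
        open(sentinel, "a").close()
    return is_cold

def build_stats(result, begin, end, is_cold):
    return {
        "begin": begin.timestamp(),
        "end": end.timestamp(),
        "is_cold": is_cold,
        "result": result,
    }

def measure(payload):
    result, begin, end = run_with_timing(payload)
    return build_stats(result, begin, end, detect_cold_start())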

Tools
Ruff

3-3: google.cloud.storage imported but unused

Remove unused import: google.cloud.storage

(F401)

sebs/aws/aws.py (1)

Line range hint 495-530: Review the updates to the create_trigger method for supporting QueueTrigger and StorageTrigger.

The updates to the create_trigger method to support QueueTrigger and StorageTrigger are significant. The method now includes conditions to instantiate these triggers, assign logging handlers, and log the creation of these triggers. This enhances the feedback mechanism for users regarding trigger management.

Ensure that the creation of these new triggers is handled efficiently and that the logs provide clear and useful information about the trigger creation process. It's also important to check that these triggers are correctly registered and functional within the AWS environment.

sebs/benchmark.py (1)

Line range hint 475-514: Review the commented-out changes in the build method.

The commented-out section in the build method suggests a planned change to include an optional Trigger.TriggerType parameter. This change could significantly expand the method's functionality to accommodate more complex scenarios involving triggers.

Clarify the status of this planned change. If it is to be implemented, ensure that all necessary updates are made consistently across the codebase. If not, consider removing the commented-out code to avoid confusion.

sebs/azure/azure.py (1)

Line range hint 222-260: Approved: Method package_code correctly packages code for deployment.

This method is well-implemented for handling different language configurations and trigger types. However, consider using a context manager for file operations to ensure files are properly closed after operations.

Use a context manager for file operations:

with open(json_out, "w") as file:
    json.dump(self.create_function_json(func_name, EXEC_FILES[language_name]), file, indent=2)
Tools
Ruff

258-258: Use context handler for opening files

(SIM115)

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between a466aab and dac2840.

Files selected for processing (18)
  • benchmarks/wrappers/aws/python/handler.py (2 hunks)
  • benchmarks/wrappers/aws/python/queue.py (1 hunks)
  • benchmarks/wrappers/azure/python/handler.py (4 hunks)
  • benchmarks/wrappers/azure/python/queue.py (1 hunks)
  • benchmarks/wrappers/gcp/python/handler.py (2 hunks)
  • benchmarks/wrappers/gcp/python/queue.py (1 hunks)
  • config/systems.json (4 hunks)
  • docs/modularity.md (2 hunks)
  • requirements.gcp.txt (1 hunks)
  • scripts/run_experiments.py (1 hunks)
  • sebs/aws/aws.py (5 hunks)
  • sebs/aws/triggers.py (2 hunks)
  • sebs/azure/azure.py (7 hunks)
  • sebs/azure/triggers.py (2 hunks)
  • sebs/benchmark.py (3 hunks)
  • sebs/gcp/gcp.py (9 hunks)
  • sebs/gcp/triggers.py (2 hunks)
  • tests/aws/create_function.py (1 hunks)
Additional context used
Ruff
benchmarks/wrappers/azure/python/handler.py

2-2: base64 imported but unused

Remove unused import: base64

(F401)


5-5: azure.identity.ManagedIdentityCredential imported but unused

Remove unused import: azure.identity.ManagedIdentityCredential

(F401)


6-6: azure.storage.queue.QueueClient imported but unused

Remove unused import: azure.storage.queue.QueueClient

(F401)

benchmarks/wrappers/gcp/python/handler.py

3-3: google.cloud.storage imported but unused

Remove unused import: google.cloud.storage

(F401)

sebs/gcp/triggers.py

5-5: os imported but unused

Remove unused import: os

(F401)


8-8: googleapiclient.errors.HttpError imported but unused

Remove unused import: googleapiclient.errors.HttpError

(F401)

sebs/azure/triggers.py

11-11: azure.storage.queue.QueueClient imported but unused

Remove unused import: azure.storage.queue.QueueClient

(F401)

sebs/benchmark.py

8-8: typing.Optional imported but unused

Remove unused import: typing.Optional

(F401)

sebs/azure/azure.py

258-258: Use context handler for opening files

(SIM115)

Additional comments not posted (20)
benchmarks/wrappers/gcp/python/handler.py (3)

7-15: Function handler_http is well-implemented.

The renaming and modifications to handle HTTP requests with additional metadata are correctly implemented.


3-3: Remove unused import.

The import google.cloud.storage is flagged as unused and should be removed to clean up the code.

Apply this diff to remove the unused import:

- from google.cloud import storage as gcp_storage

Likely invalid or redundant comment.

Tools
Ruff

3-3: google.cloud.storage imported but unused

Remove unused import: google.cloud.storage

(F401)


17-35: Function handler_queue is well-implemented.

The function correctly handles messages from a queue, including decoding and processing base64-encoded data. Ensure that the encoding and decoding are consistent across all components that interact with this queue.
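A small round-trip check makes the expectation explicit: whichever side base64-encodes the payload, the other side must mirror it exactly (illustrative, not the exact SeBS call sites):

import base64
import json

payload = {"size": 1000}
encoded = base64.b64encode(json.dumps(payload).encode("utf-8"))
decoded = json.loads(base64.b64decode(encoded).decode("utf-8"))
assert decoded == payload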

Run the following script to verify the consistency of data encoding:

Verification successful

Base64 encoding and decoding are consistently used across components.

The handler_queue function in benchmarks/wrappers/gcp/python/handler.py decodes base64-encoded data, and other components like sebs/gcp/triggers.py and sebs/azure/triggers.py encode data using base64. This indicates a consistent approach to data encoding and decoding across the codebase.

  • benchmarks/wrappers/gcp/python/handler.py: Decodes base64 data.
  • sebs/gcp/triggers.py: Encodes data using base64.
  • sebs/azure/triggers.py: Encodes data using base64.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the consistency of data encoding across components.

# Test: Search for base64 encoding and decoding usage. Expect: Consistent usage across components.
rg --type python -A 5 $'base64'

Length of output: 61


Script:

#!/bin/bash
# Description: Verify the consistency of data encoding across components.

# Test: Search for base64 encoding and decoding usage. Expect: Consistent usage across components.
rg --type py -A 5 $'base64'

Length of output: 3990

sebs/azure/triggers.py (2)

61-177: Review: QueueTrigger class implementation.

The QueueTrigger class is well-implemented with methods to handle queue operations effectively. It includes error handling and dynamic queue creation which are crucial for robustness.

  • Constructor: Properly initializes and creates queues if they are not provided.
  • Sync and Async Invocations: These methods are implemented to handle message sending and receiving correctly.
  • Serialization/Deserialization: Ensures that the trigger's state can be saved and restored, which is useful for distributed systems.

Overall, the implementation aligns with the PR's objectives to enhance Azure's trigger capabilities.


Line range hint 180-398: Review: StorageTrigger class implementation.

The StorageTrigger class effectively manages interactions with Azure Blob Storage. It includes comprehensive handling of blob operations and dynamic resource management.

  • Constructor: Handles the creation of blob containers if they do not exist and sets up necessary configurations.
  • Sync and Async Invocations: Implements methods to upload blobs and wait for results, which are essential for responsive cloud applications.
  • Serialization/Deserialization: Facilitates the saving and restoring of the trigger's state, enhancing the trigger's usability in distributed environments.

This class supports the PR's goals by extending the storage capabilities on Azure.

sebs/azure/azure.py (3)

39-41: Approved: Method typename correctly implemented.

This method is straightforward and correctly returns the string "Azure".


354-359: Approved: Method update_function correctly handles HTTP triggers.

This method is correctly implemented to add an HTTP trigger based on the function name suffix. It is specific to its use case and does not require changes at this time.


575-642: Approved: Method create_trigger supports multiple trigger types effectively.

This method is well-implemented to support the creation of HTTP, queue, and storage triggers. The conditional logic and logging statements are appropriately used to manage different trigger types.

config/systems.json (3)

21-22: Approved: Addition of Python 3.9 base images.

The addition of Python 3.9 base images across various platforms is correctly implemented and enhances the flexibility of the environment setup.

Also applies to: 74-75, 118-119


75-75: Approved: Addition of 'queue.py' to deployment files.

The inclusion of 'queue.py' in the deployment files list for AWS, Azure, and GCP is correctly implemented and indicates an expansion of the deployment capabilities to include queue management.

Also applies to: 119-119, 170-170


123-124: Approved: Addition of Azure-specific packages.

The inclusion of 'azure-storage-queue' and 'azure-identity' in the packages list for Azure is correctly implemented and crucial for integrating Azure queue services and identity management into the deployment process.

tests/aws/create_function.py (1)

38-39: Approved: Inclusion of 'function/queue.py' in package files.

The addition of 'function/queue.py' to the package files list for both Python and Node.js is correctly implemented. This ensures that the queue.py file is included in the deployment or testing process, enhancing the functionality related to the handling of queues.

sebs/aws/triggers.py (2)

133-176: Review of QueueTrigger class:

The QueueTrigger class is well-structured and handles the interaction with AWS SQS effectively. However, there are a few areas that could be improved:

  1. Error Handling: Ensure that all AWS SDK calls are wrapped in try-except blocks to handle potential exceptions, especially in the create_queue method.
  2. Resource Management: Consider checking if the queue exists before attempting to create it to avoid unnecessary API calls.
  3. Logging: Add more detailed logging at each step of the queue interaction to aid in debugging and monitoring.
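A sketch of how the first two points could look for SQS, using standard boto3 calls; the surrounding trigger class is assumed:

import boto3
from botocore.exceptions import ClientError

def get_or_create_queue(name: str, region: str, logger):
    sqs = boto3.client("sqs", region_name=region)
    try:
        # Reuse an existing queue instead of re-creating it on every run.
        url = sqs.get_queue_url(QueueName=name)["QueueUrl"]
        logger.info(f"Queue {name} already exists")
    except sqs.exceptions.QueueDoesNotExist:
        logger.info(f"Creating queue {name}")
        try:
            url = sqs.create_queue(QueueName=name)["QueueUrl"]
        except ClientError as e:
            logger.error(f"Failed to create queue {name}: {e}")
            raise
    attributes = sqs.get_queue_attributes(QueueUrl=url, AttributeNames=["QueueArn"])
    return url, attributes["Attributes"]["QueueArn"]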

248-326: Review of StorageTrigger class:

The StorageTrigger class handles the interaction with AWS S3 effectively. However, there are a few areas that could be improved:

  1. Error Handling: Ensure that all AWS SDK calls are wrapped in try-except blocks to handle potential exceptions, especially in the create_bucket method.
  2. Resource Management: Consider checking if the bucket exists before attempting to create it to avoid unnecessary API calls.
  3. Logging: Add more detailed logging at each step of the bucket interaction to aid in debugging and monitoring.
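The same pattern applies on the S3 side; a minimal existence check before create_bucket might look like this (sketch, standard boto3 call):

import boto3
from botocore.exceptions import ClientError

def bucket_exists(name: str) -> bool:
    # head_bucket is a cheap existence/permission probe before create_bucket.
    try:
        boto3.client("s3").head_bucket(Bucket=name)
        return True
    except ClientError as e:
        if e.response["Error"]["Code"] in ("404", "NoSuchBucket"):
            return False
        raise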
scripts/run_experiments.py (1)

448-448: Clarification needed on the integration of queue.py.

The AI-generated summary mentions the addition of queue.py to the initialization process, but it's not explicitly shown in the provided code. Could you clarify how queue.py is integrated and its impact on the functionality of the script?

sebs/gcp/gcp.py (5)

143-190: Method create_trigger_resource is robust and well-implemented.

The method handles the creation of queue and storage triggers effectively, with appropriate checks for existing resources and detailed logging. The return structure with the trigger configuration is well-designed to integrate with the function creation process.
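For orientation, the trigger fragment that the Cloud Functions v1 API expects in the function body looks roughly like this; the resource-naming convention is an assumption and not necessarily what create_trigger_resource returns:

def trigger_config(trigger_type: str, project: str, name: str) -> dict:
    # Fragment merged into the create/patch request body of the function.
    if trigger_type == "queue":
        return {
            "eventTrigger": {
                "eventType": "google.pubsub.topic.publish",
                "resource": f"projects/{project}/topics/{name}",
            }
        }
    if trigger_type == "storage":
        return {
            "eventTrigger": {
                "eventType": "google.storage.object.finalize",
                "resource": f"projects/_/buckets/{name}",
            }
        }
    return {"httpsTrigger": {}}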


Line range hint 311-331: Modifications to create_function effectively integrate trigger resource management.

The inclusion of trigger resource creation before function deployment ensures that all necessary resources are available, which is crucial for the successful deployment of functions. The method is well-implemented and integrates the trigger information seamlessly into the function creation process.


380-416: Modifications to create_trigger expand support and enhance feedback.

The method now supports the creation of queue and storage triggers, which aligns with the PR's objectives to introduce these trigger types. The enhanced logging provides valuable feedback on the creation process, which is beneficial for monitoring and debugging.


Line range hint 457-474: Modifications to update_function ensure trigger resource availability and integration.

The method effectively ensures that trigger resources are available before updating the function, which is important for consistency and reliability. The integration of trigger information into the function update request is well-handled and crucial for the successful update of functions.


428-441: Modifications to cached_function effectively manage different trigger types.

The method now handles LibraryTrigger, QueueTrigger, and StorageTrigger, assigning appropriate logging handlers and deployment clients. This is crucial for managing the lifecycle of functions and their triggers, ensuring that they can log appropriately and interact effectively with the deployment environment.

@@ -4,3 +4,4 @@ google-api-python-client==1.12.5
google-cloud-monitoring==2.0.0
google-api-python-client-stubs
google-cloud-logging==2.0.0
google-cloud-pubsub=2.23.0

Correct the version specifier typo.

The dependency google-cloud-pubsub should use == for specifying the version, not =. This typo could prevent the correct installation of the package.

Apply this diff to correct the typo:

-google-cloud-pubsub=2.23.0
+google-cloud-pubsub==2.23.0

Comment on lines +3 to +14
class queue:
    client = None

    def __init__(self, queue_name: str, account_id: str, region: str):
        self.client = boto3.client('sqs', region_name=region)
        self.queue_url = f"https://sqs.{region}.amazonaws.com/{account_id}/{queue_name}"

    def send_message(self, message: str):
        self.client.send_message(
            QueueUrl=self.queue_url,
            MessageBody=message,
        )

Rename the class and add error handling.

Consider renaming the class from queue to avoid conflicts with Python's built-in queue module. Additionally, add error handling for AWS client interactions to improve robustness.

Suggested class name change and error handling:

-import boto3
+import boto3
+from botocore.exceptions import ClientError

-class queue:
+class AWSQueue:
     client = None

     def __init__(self, queue_name: str, account_id: str, region: str):
         self.client = boto3.client('sqs', region_name=region)
         self.queue_url = f"https://sqs.{region}.amazonaws.com/{account_id}/{queue_name}"

     def send_message(self, message: str):
         try:
             self.client.send_message(
                 QueueUrl=self.queue_url,
                 MessageBody=message,
             )
         except ClientError as e:
             print(f"An error occurred: {e}")

Comment on lines +3 to +14
class queue:
    client = None

    def __init__(self, topic_name: str, project_id: str):
        self.client = pubsub_v1.PublisherClient()
        self.topic_name = 'projects/{project_id}/topics/{topic}'.format(
            project_id=project_id,
            topic=topic_name,
        )

    def send_message(self, message: str):
        self.client.publish(self.topic_name, message.encode("utf-8"))

Rename the class and add error handling.

Consider renaming the class from queue to avoid conflicts with Python's built-in queue module. Additionally, add error handling for GCP client interactions to improve robustness.

Suggested class name change and error handling:

-from google.cloud import pubsub_v1
+from google.cloud import pubsub_v1
+from google.api_core.exceptions import GoogleAPICallError, RetryError

-class queue:
+class GCPQueue:
     client = None

     def __init__(self, topic_name: str, project_id: str):
         self.client = pubsub_v1.PublisherClient()
         self.topic_name = 'projects/{project_id}/topics/{topic}'.format(
             project_id=project_id,
             topic=topic_name,
         )

     def send_message(self, message: str):
         try:
             self.client.publish(self.topic_name, message.encode("utf-8"))
         except (GoogleAPICallError, RetryError) as e:
             print(f"An error occurred: {e}")

Comment on lines +14 to +15
def send_message(self, message: str):
    self.client.send_message(message)

Add error handling in send_message.

The send_message method currently does not handle any potential exceptions that might occur during the message sending process. It's recommended to add error handling to improve the robustness of this method.

Consider wrapping the call in a try-except block:

     def send_message(self, message: str):
+        try:
             self.client.send_message(message)
+        except Exception as e:
+            logging.error(f"Failed to send message: {e}")
+            raise

Comment on lines +4 to +15
class queue:
client = None

def __init__(self, queue_name: str, storage_account: str):
account_url = f"https://{storage_account}.queue.core.windows.net"
managed_credential = ManagedIdentityCredential()
self.client = QueueClient(account_url,
queue_name=queue_name,
credential=managed_credential)

def send_message(self, message: str):
self.client.send_message(message)

Refactor class name and improve thread safety.

The class name queue should be in CamelCase to adhere to Python's PEP 8 style guide. Also, using a class-level variable for client is not thread-safe. Consider initializing the client in the __init__ method to ensure each instance has its own client.

Here's a suggested refactor:

-class queue:
-    client = None
+class Queue:
 
     def __init__(self, queue_name: str, storage_account: str):
+        self.client = None
         account_url = f"https://{storage_account}.queue.core.windows.net"
         managed_credential = ManagedIdentityCredential()
         self.client = QueueClient(account_url,
                             queue_name=queue_name,
                             credential=managed_credential)

@@ -119,6 +119,7 @@ def get_storage(self, replace_existing: bool = False) -> PersistentStorage:
 function
 - function.py
 - storage.py
+- queue.py

Review the updates to the package_code function with the new trigger parameter.

The inclusion of the trigger parameter in the package_code function is a significant update. It allows the function to handle different types of triggers when packaging code, which can affect how the function is deployed and triggered on AWS.

Ensure that the implementation of this parameter is consistent across different parts of the codebase where package_code is called. It's crucial that all calls to this function pass the correct trigger type to avoid deployment issues.

Also applies to: 136-136
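
One way to keep call sites consistent is to validate the CLI choice once and pass the resulting enum member everywhere. The sketch below is hypothetical: the helper name is made up, and a stand-in enum mirrors Trigger.TriggerType instead of importing it; only the trigger names ("http", "queue", "storage") are taken from this PR.

from enum import Enum
from typing import Optional

class TriggerType(Enum):
    # Stand-in for Trigger.TriggerType; member names follow the PR.
    HTTP = "http"
    QUEUE = "queue"
    STORAGE = "storage"

def trigger_from_cli(choice: Optional[str]) -> Optional[TriggerType]:
    # Validate the CLI value once, so every package_code call site receives
    # the same, already-checked enum member (or None for the default).
    if choice is None:
        return None
    try:
        return TriggerType(choice)
    except ValueError:
        raise ValueError(f"Unknown trigger type: {choice}") from None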

Comment on lines +263 to +275
from sebs.aws.triggers import LibraryTrigger, QueueTrigger, StorageTrigger

for trigger in function.triggers(Trigger.TriggerType.LIBRARY):
trigger.logging_handlers = self.logging_handlers
cast(LibraryTrigger, trigger).deployment_client = self
for trigger in function.triggers(Trigger.TriggerType.HTTP):
trigger.logging_handlers = self.logging_handlers
for trigger in function.triggers(Trigger.TriggerType.QUEUE):
trigger.logging_handlers = self.logging_handlers
cast(QueueTrigger, trigger).deployment_client = self
for trigger in function.triggers(Trigger.TriggerType.STORAGE):
trigger.logging_handlers = self.logging_handlers
cast(StorageTrigger, trigger).deployment_client = self

Review the handling of new trigger types in the cached_function method.

The modifications to the cached_function method to handle QueueTrigger and StorageTrigger are crucial for supporting these new types of triggers. The method now iterates over triggers of these types, setting their logging handlers and deployment clients accordingly.

Verify that the integration of these new triggers does not interfere with the existing functionalities of other trigger types. Additionally, ensure that the logging and deployment client settings are correctly applied to these triggers for consistent behavior across all trigger types.

@@ -5,14 +5,15 @@
 import shutil
 import subprocess
 from abc import abstractmethod
-from typing import Any, Callable, Dict, List, Tuple
+from typing import Any, Callable, Dict, List, Optional, Tuple

Remove unused import: Optional.

The import of Optional is unused in this file, which can lead to confusion and unnecessary clutter in the code.

Remove the unused import to clean up the code and improve its readability.

-from typing import Any, Callable, Dict, List, Optional, Tuple
+from typing import Any, Callable, Dict, List, Tuple
Tools
Ruff

8-8: typing.Optional imported but unused

Remove unused import: typing.Optional

(F401)

Comment on lines +124 to +224
def __init__(
self,
fname: str,
queue_name: str,
region: str,
result_queue: Optional[GCPQueue] = None
):
super().__init__()
self.name = fname
self._queue_name = queue_name
self._region = region
self._result_queue = result_queue

# Create result queue for communicating benchmark results back to the
# client.
if (not self._result_queue):
self._result_queue = GCPQueue(
fname,
QueueType.RESULT,
self.region
)
self._result_queue.create_queue()

@staticmethod
def typename() -> str:
return "GCP.QueueTrigger"

@property
def queue_name(self) -> str:
assert self._queue_name
return self._queue_name

@property
def region(self) -> str:
assert self._region
return self._region

@property
def result_queue(self) -> GCPQueue:
assert self._result_queue
return self._result_queue

@staticmethod
def trigger_type() -> Trigger.TriggerType:
return Trigger.TriggerType.QUEUE

def sync_invoke(self, payload: dict) -> ExecutionResult:

self.logging.info(f"Invoke function {self.name}")

# Init client
pub_sub = build("pubsub", "v1", cache_discovery=False)

# Prepare payload
# GCP is very particular with data encoding...
serialized_payload = base64.b64encode(json.dumps(payload).encode("utf-8"))

# Publish payload to queue
begin = datetime.datetime.now()
pub_sub.projects().topics().publish(
topic=self.queue_name,
body={
"messages": [{"data": serialized_payload.decode("utf-8")}],
},
).execute()

response = ""
while (response == ""):
response = self.result_queue.receive_message()

end = datetime.datetime.now()

result = ExecutionResult.from_times(begin, end)
result.parse_benchmark_output(json.loads(response))
return result

def async_invoke(self, payload: dict) -> concurrent.futures.Future:

pool = concurrent.futures.ThreadPoolExecutor()
fut = pool.submit(self.sync_invoke, payload)
return fut

def serialize(self) -> dict:
return {
"type": "Queue",
"name": self.name,
"queue_name": self.queue_name,
"region": self.region,
"result_queue": self.result_queue.serialize()
}

@staticmethod
def deserialize(obj: dict) -> Trigger:
return QueueTrigger(
obj["name"],
obj["queue_name"],
obj["region"],
GCPQueue.deserialize(obj["result_queue"])
)


Approved with suggestions: Implementation of QueueTrigger and StorageTrigger.

The implementation of both QueueTrigger and StorageTrigger classes is robust and tailored for GCP interactions. The methods for synchronous and asynchronous invocation are well-defined.

Suggestions:

  • Consider adding more detailed logging for debugging and monitoring purposes.
  • Review the error handling to ensure that all potential issues are gracefully managed.
  • Optimize the resource management, especially in methods that involve network calls and data serialization (see the polling sketch after this list).

Also applies to: 226-326
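
A minimal sketch of that polling change, relying only on the receive_message() behaviour already visible in sync_invoke (an empty string while no result is available); the timeout and poll interval values are placeholders, not part of the PR:

import time

def wait_for_result(result_queue, timeout: float = 300.0, poll_interval: float = 0.5) -> str:
    # Poll the result queue with a short sleep between attempts instead of
    # spinning, and give up after `timeout` seconds so a lost message cannot
    # hang the benchmark run indefinitely.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        response = result_queue.receive_message()
        if response != "":
            return response
        time.sleep(poll_interval)
    raise TimeoutError(f"No benchmark result received within {timeout} seconds")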

import concurrent.futures
import datetime
import json
import os

Approved: Removal of unused imports.

The removal of unused imports os and googleapiclient.errors.HttpError is correctly identified by static analysis tools and should be implemented to clean up the code.

- import os
- from googleapiclient.errors import HttpError

Also applies to: 8-8

Tools
Ruff

5-5: os imported but unused

Remove unused import: os

(F401)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between dac2840 and 7de164e.

Files selected for processing (8)
  • config/systems.json (3 hunks)
  • docs/platforms.md (1 hunks)
  • sebs.py (3 hunks)
  • sebs/azure/azure.py (7 hunks)
  • sebs/cache.py (1 hunks)
  • sebs/faas/system.py (1 hunks)
  • sebs/gcp/gcp.py (9 hunks)
  • sebs/local/local.py (1 hunks)
Files skipped from review due to trivial changes (2)
  • docs/platforms.md
  • sebs/cache.py
Additional context used
Ruff
sebs/azure/azure.py

259-259: Use context handler for opening files

(SIM115)

Additional comments not posted (13)
config/systems.json (1)

74-75: LGTM!

The changes look good:

  • Adding queue.py to the deployment files across multiple sections indicates that this file is now part of the deployment process.
  • Adding azure-storage-queue and azure-identity packages suggests an expansion of functionality related to Azure services.
  • The formatting changes (new lines before the added packages) may improve readability.

Also applies to: 120-122

sebs.py (3)

181-181: LGTM!

The code changes to add "queue" and "storage" trigger choices are approved.


Line range hint 232-257: LGTM!

The code changes to handle the new trigger types in the invoke function are approved.


Line range hint 223-223: LGTM!

The code change to add the optional trigger parameter to the package_code function is approved.

sebs/azure/azure.py (3)

257-261: LGTM!

The code changes to utilize the create_function_json method in the package_code function are approved.

Tools
Ruff

259-259: Use context handler for opening files

(SIM115)


356-361: LGTM!

The code changes to conditionally create an HTTPTrigger in the update_function method are approved.


585-647: Approved: Method create_trigger supports multiple trigger types.

The changes to the create_trigger method significantly enhance the functionality by supporting both queue and storage triggers. The method effectively sets up the necessary Azure resources and permissions.

The method also raises a RuntimeError for unsupported trigger types, which is a good practice for error handling.

Verify that the RuntimeError is raised for unsupported trigger types:

Verification successful

Verified: RuntimeError is raised for unsupported trigger types in create_trigger.

The create_trigger method in sebs/azure/azure.py correctly raises a RuntimeError for unsupported trigger types, confirming the implementation of error handling as expected.

  • Location: sebs/azure/azure.py
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that `create_trigger` raises a `RuntimeError` for unsupported trigger types.

# Test: Search for the `RuntimeError` being raised. Expect: The `RuntimeError` is raised for unsupported trigger types.
rg --type python -A 5 $'raise RuntimeError\("Not supported!"\)'

Length of output: 95


Script:

#!/bin/bash
# Description: Verify that `create_trigger` raises a `RuntimeError` for unsupported trigger types.

# Test: Search for the `RuntimeError` being raised. Expect: The `RuntimeError` is raised for unsupported trigger types.
rg --type py -A 5 $'raise RuntimeError\("Not supported!"\)'

Length of output: 1744

sebs/gcp/gcp.py (6)

143-192: LGTM!

The create_trigger_resource method is well-implemented and enhances the functionality related to trigger resources in GCP. It effectively checks the trigger type, creates the necessary resources, logs the creation process, and handles potential conflicts if the resources already exist.
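
For illustration, conflict handling with the same discovery-based client used elsewhere in this PR could look like the sketch below; the function name and the decision to treat HTTP 409 (already exists) as success are assumptions for this example, not code from the PR.

from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

def ensure_pubsub_topic(topic_resource_name: str) -> None:
    # Create the topic if it does not exist yet; an HTTP 409 response means
    # it is already there, which is acceptable for repeated deployments.
    pubsub = build("pubsub", "v1", cache_discovery=False)
    try:
        pubsub.projects().topics().create(name=topic_resource_name, body={}).execute()
    except HttpError as e:
        if e.resp.status != 409:
            raise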


251-252: LGTM!

The code changes to write additional dependencies to the requirements.txt file in the package_code function are approved.


311-314: LGTM!

The code changes to call create_trigger_resource before creating the function in the create_function method are approved. This ensures that the required trigger resources are created before the function deployment process.


380-416: LGTM!

The code changes to the create_trigger method to handle different trigger types, including HTTP, Queue, and Storage triggers, are approved. The method now imports QueueTrigger and StorageTrigger classes and includes appropriate logging for each trigger type, providing a more structured approach to managing triggers.


428-441: LGTM!

The code changes to the cached_function method to handle different trigger types, including LibraryTrigger, QueueTrigger, and StorageTrigger, are approved. These changes enhance the functionality of the method by allowing it to handle different trigger types.


457-460: LGTM!

The code changes to call create_trigger_resource before updating the function in the update_function method are approved. This ensures that the required trigger resources are verified before the function deployment process.

@@ -134,6 +134,7 @@ def package_code(
 language_version: str,
 benchmark: str,
 is_cached: bool,
+trigger: Optional[Trigger.TriggerType],

Ensure proper handling of the new trigger parameter.

The trigger parameter has been added to the package_code method but is not utilized within the method body. Consider adding logic to handle this parameter or removing it if unnecessary.
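
If the parameter is kept, one option is to reject unsupported values explicitly rather than ignoring them; a hypothetical guard (names assumed, not from the PR) could be as small as:

def check_supported_trigger(trigger, supported=("HTTP",)) -> None:
    # Reject trigger types this backend cannot deploy. `trigger` may be None
    # (default HTTP deployment) or a Trigger.TriggerType member, whose .name
    # is compared against the supported set.
    if trigger is not None and trigger.name not in supported:
        raise NotImplementedError(
            f"Trigger type {trigger.name} is not supported by this deployment"
        )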

@@ -167,6 +167,7 @@ def package_code(
 language_version: str,
 benchmark: str,
 is_cached: bool,
+trigger: Optional[Trigger.TriggerType],

Ensure proper handling of the new trigger parameter in the actual implementation.

The trigger parameter has been added to the abstract package_code method. When implementing this method in the derived classes, ensure that the trigger parameter is being utilized correctly. If not needed, consider removing it from the abstract method definition.

Comment on lines +162 to +206
def create_function_json(self, benchmark, exec_files) -> Dict:
trigger = benchmark.split("-")[-1]

if trigger == "queue":
return {
"scriptFile": exec_files,
"entryPoint": "handler_queue",
"bindings": [
{
"name": "msg",
"type": "queueTrigger",
"direction": "in",
"queueName": benchmark,
"connection": "AzureWebJobsStorage",
}
],
}
elif trigger == "storage":
return {
"scriptFile": exec_files,
"entryPoint": "handler_storage",
"bindings": [
{
"name": "blob",
"type": "blobTrigger",
"direction": "in",
"path": benchmark,
"connection": "AzureWebJobsStorage",
}
],
}
return { # HTTP
"scriptFile": exec_files,
"entryPoint": "handler_http",
"bindings": [
{
"authLevel": "anonymous",
"type": "httpTrigger",
"direction": "in",
"name": "req",
"methods": ["get", "post"],
},
{"type": "http", "direction": "out", "name": "$return"},
],
}

Approved: Method create_function_json enhances flexibility.

This method correctly handles different trigger types and dynamically generates the function configuration. However, consider refactoring the repeated dictionary structures into a helper function for improved readability and maintainability.

Consider using a helper function to manage repeated dictionary structures:

def binding_config(name, type, direction, connection=None, path=None, methods=None):
    config = {
        "name": name,
        "type": type,
        "direction": direction
    }
    if connection:
        config["connection"] = connection
    if path:
        config["path"] = path
    if methods:
        config["methods"] = methods
    return config

# Example usage in create_function_json
"bindings": [binding_config("msg", "queueTrigger", "in", connection="AzureWebJobsStorage")]

Comment on lines +110 to +122
def get_trigger_resource_name(self, func_name: str) -> str:
trigger = func_name.split("-")[-1]

assert trigger == "queue" or trigger == "storage"

if trigger == "queue":
return "projects/{project_name}/topics/{topic}".format(
project_name=self.config.project_name, topic=func_name
)
else:
return "projects/{project_name}/buckets/{bucket}".format(
project_name=self.config.project_name, bucket=func_name
)

Method get_trigger_resource_name is well-implemented but consider improving error handling.

The method effectively constructs the resource name for queue and storage triggers. However, using assert for error handling is not recommended in production code: assertions are stripped when Python runs with the -O flag, and a bare AssertionError is not informative for the user. Consider raising a more descriptive exception if the trigger type is not supported.

Replace the assert statement with a more descriptive exception:

if trigger == "queue":
    # ...
elif trigger == "storage":
    # ...
else:
    raise ValueError(f"Unsupported trigger type: {trigger}")
