Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java benchmarks support #223

Open
wants to merge 14 commits into
base: master
Choose a base branch
from
Open

Java benchmarks support #223

wants to merge 14 commits into from

Conversation

mahlashrifi
Copy link
Collaborator

@mahlashrifi mahlashrifi commented Oct 17, 2024

This pull request contains code to enable SeBS for adding Java support on OpenWhisk. The completion is included in this pull request.

Summary by CodeRabbit

  • New Features

    • Introduced a new configuration file for Java benchmarks, detailing timeout, memory allocation, and supported languages.
    • Added a Maven project configuration for building Java applications.
    • New Java class App created, serving as the entry point with a sample output message.
    • New JSON configuration for OpenWhisk experiments and deployment.
    • Enhanced support for Java in the benchmarking framework and deployment processes.
  • Bug Fixes

    • Updated language options to include Java in various configurations.
  • Documentation

    • Added new configuration files and updated existing ones to support Java.
  • Chores

    • Introduced a Dockerfile for Java functions in OpenWhisk.

Copy link

coderabbitai bot commented Oct 17, 2024

Walkthrough

The changes in this pull request introduce several new files and modifications to support Java as a programming language within the benchmark framework. Key additions include a configuration file for benchmarks, a Maven project setup, a Java application entry point, and updates to various scripts and classes to accommodate Java-specific functionality. The updates enhance the overall framework to handle Java benchmarks and deployments effectively.

Changes

File Change Summary
benchmarks/600.java/601.hello-world/config.json New configuration file added with timeout, memory, and supported languages settings.
benchmarks/600.java/601.hello-world/java/pom.xml New Maven project configuration file created for Java application with specified project details.
benchmarks/600.java/601.hello-world/java/src/main/java/faas/App.java New Java class App added with a main method printing a greeting message.
benchmarks/wrappers/openwhisk/java/Main.java New Java class Main added with a main method that initializes and invokes the App handler.
config/example2.json New JSON configuration file added for OpenWhisk experiments and deployment settings.
config/systems.json New language configuration for Java added, specifying base images and deployment files.
dockerfiles/openwhisk/java/Dockerfile.function New Dockerfile created for building a Java function in OpenWhisk.
sebs.py Updated --language parameter to include "java" in the simplified_common_params decorator.
sebs/benchmark.py Enhanced Benchmark class to support Java with new methods and file handling.
sebs/faas/function.py Updated Language enum to include JAVA and modified serialization processes.
sebs/openwhisk/openwhisk.py Modified file handling logic in OpenWhisk class to exclude "Main.java" from deployment.

Poem

🐰 In the land of code where rabbits play,
A new Java friend hops in today.
With config and classes, oh what a sight,
Benchmarking in Java, everything feels right!
So let's cheer for the changes, both big and small,
In the world of code, we welcome them all! 🎉


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

🧹 Outside diff range and nitpick comments (16)
benchmarks/600.java/601.hello-world/java/src/main/java/faas/App.java (3)

1-1: Consider using a more specific package name.

The current package name faas is concise but might be too generic. To provide better context and avoid potential conflicts, consider using a more specific package name that reflects the project structure or purpose, such as com.spcl.serverlessbenchmarks.java.helloworld.


3-3: Consider using a more descriptive class name.

While App is a common name for main application classes, it might be too generic for a benchmark-specific application. Consider using a more descriptive name that reflects the purpose of this benchmark, such as HelloWorldBenchmark or JavaFunctionBenchmark.


1-7: Summary: Good start, but room for improvement

Overall, this file provides a basic starting point for Java benchmarks in the serverless framework. However, there are several areas where it could be improved to better serve its purpose:

  1. The package name could be more specific.
  2. The class name could be more descriptive.
  3. The main method could include more benchmark-specific functionality.

These changes would make the code more robust and better suited for benchmarking purposes.

Would you like assistance in implementing these suggestions or in developing more advanced benchmarking features for this Java function?

dockerfiles/openwhisk/java/Dockerfile.function (3)

1-3: LGTM with a suggestion for optimization

The base image setup looks good. Using a build argument for the base image provides flexibility for different environments or versions.

However, consider optimizing the COPY command to only include necessary files for the function, rather than copying the entire context. This can reduce the image size and build time.

You could update the COPY command to be more specific, for example:

COPY pom.xml /function/
COPY src /function/src

This assumes your Java source files are in a src directory. Adjust the paths as needed for your project structure.


7-8: LGTM with a suggestion for improved readability

The Maven build process is well-implemented. The conditional check for pom.xml is a good practice to avoid errors if the file is missing.

To improve readability, consider breaking the long RUN command into multiple lines:

RUN if [ -f ./pom.xml ]; then \
        mvn clean install; \
    else \
        echo "pom.xml not found, aborting build." && exit 1; \
    fi

This multi-line format is easier to read and maintain, especially if you need to add more complex logic in the future.


1-8: Overall, the Dockerfile is well-structured with room for minor improvements

The Dockerfile successfully sets up a build environment for Java functions in OpenWhisk. It uses a flexible base image, installs the necessary build tools (Maven), and attempts to build the project conditionally.

A few suggestions for improvement:

  1. Optimize the COPY command to only include necessary files.
  2. Clean up the apt cache after installing Maven to reduce image size.
  3. Improve readability of the conditional Maven build command.

Also, note that this Dockerfile doesn't specify an ENTRYPOINT or CMD. This might be intentional if it's meant to be extended, but ensure this aligns with your intended use in the OpenWhisk environment.

Consider documenting the expected usage of this Dockerfile, especially if it's intended to be extended or used as part of a multi-stage build process. This will help other developers understand its purpose and how to properly use or modify it.

benchmarks/600.java/601.hello-world/java/pom.xml (2)

8-12: Consider revising the groupId and artifactId.

While the version and packaging are appropriate, there are some concerns with the project coordinates:

  1. The groupId "faas" is unusually short and doesn't follow the reverse domain name convention (e.g., com.company.project).
  2. The artifactId "601.hello-world" includes a number prefix, which is unconventional and may cause issues with some tools.

Consider revising these to more standard formats:

  • groupId: e.g., "com.spcl.serverlessbenchmarks"
  • artifactId: e.g., "hello-world-java"

1-38: Summary: Basic Maven configuration in place, with room for improvements.

The pom.xml file provides a functional Maven configuration for a Java project. However, there are several areas where it could be improved to align better with best practices and provide a more robust setup:

  1. Revise the groupId and artifactId to follow standard conventions.
  2. Consider updating to a more recent Java version.
  3. Update the maven-jar-plugin to the latest version.
  4. Review the main class package name for better alignment with Java conventions.
  5. Consider adding necessary dependencies if the project requires any external libraries.

These changes will enhance the project structure, maintainability, and potentially performance.

benchmarks/wrappers/openwhisk/java/Main.java (3)

1-9: Remove unused import

The Gson class is imported but never used in the code. Consider removing this unused import to keep the code clean.

Apply this diff to remove the unused import:

-import com.google.gson.JsonObject;

10-17: Remove commented-out code and unused variable

  1. The commented-out logger setup is not being used. If logging is not required, remove these comments to improve code readability.

  2. The Gson instance is created but never used in the code. Remove it to avoid confusion and potential performance impact.

Apply this diff to clean up the code:

-        // Logger logger = Logger.getLogger(FunctionHandler.class.getName());
-        // logger.setLevel(Level.INFO);
-        
-        Gson gson = new Gson();
         App function = new App();

1-55: Overall assessment and remaining concerns

The Main class provides a good structure for handling serverless function invocations, including timing the execution and detecting cold starts. However, there are a few areas that could be improved:

  1. Error handling: Consider adding more robust error handling throughout the class, especially for the App.handler call.
  2. Cold start detection: The current method using file I/O might not be reliable in all serverless environments. Consider using a more robust method.
  3. Code cleanliness: Remove unused imports and commented-out code to improve readability.
  4. Performance: Ensure that the App instance is created efficiently, possibly as a static field if it's thread-safe, to avoid unnecessary object creation on each invocation.

To improve the overall design:

  1. Consider extracting the cold start detection logic into a separate method or class for better modularity.
  2. If possible, make the App class conform to a specific interface that defines the handler method, allowing for easier testing and potential dependency injection.
  3. Consider adding logging throughout the class to aid in debugging and monitoring in a production environment.

These changes will make the code more robust, maintainable, and performant in a serverless environment.

config/systems.json (1)

247-263: Java configuration for OpenWhisk looks good, with some suggestions for improvement.

The new Java configuration for OpenWhisk is well-structured and consistent with other language configurations. It aligns with the PR objective of adding Java support. Here are some observations and suggestions:

  1. The base image, deployment files, and Minio package inclusion are appropriate for a Java application in OpenWhisk.
  2. Consider adding support for newer Java versions (e.g., Java 11, 17) to provide more options for users.
  3. Verify if Minio version 8.5.9 is the latest stable version compatible with Java 8. If not, consider updating to the latest compatible version.
  4. Evaluate if any additional Java-specific packages or configurations are needed for optimal performance in OpenWhisk.

Would you like assistance in expanding the Java version support or verifying the Minio version compatibility?

sebs/faas/function.py (2)

303-303: LGTM: Java support added to Runtime.deserialize

The addition of Java to the languages dictionary in Runtime.deserialize is correct and consistent with the existing structure.

Consider refactoring this method to use a more dynamic approach for language mapping. This could make it easier to add new languages in the future without modifying this method. For example:

@staticmethod
def deserialize(config: dict) -> Runtime:
    try:
        language = Language[config["language"].upper()]
    except KeyError:
        raise ValueError(f"Unsupported language: {config['language']}")
    return Runtime(language=language, version=config["version"])

This approach would automatically work for any language added to the Language enum without requiring changes to this method.


Java support is not fully integrated across the codebase

Several parts of the project still have hardcoded language checks limited to Python and Node.js. To ensure consistent Java support, please update the following files to include Java where applicable:

  • sebs.py
  • install.py
  • scripts/run_experiments.py
  • tools/build_docker_images.py
  • sebs/benchmark.py
  • sebs/local/local.py
  • sebs/gcp/gcp.py
  • sebs/regression.py
  • sebs/azure/azure.py
  • sebs/aws/aws.py
  • dockerfiles/local/runner.py
  • tests/aws/invoke_function_http.py
  • tests/aws/invoke_function_sdk.py
  • tests/aws/create_function.py
🔗 Analysis chain

Line range hint 1-424: Overall assessment: Java support successfully added

The changes to this file are minimal, focused, and correctly implement Java support in the Language enum and Runtime.deserialize method. The modifications are consistent with the PR objective and maintain the existing code structure.

To ensure that Java support is consistently implemented across the project, let's verify other files that might need similar updates:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential inconsistencies in Java support across the project

# Test 1: Search for other occurrences of Language enum usage
echo "Searching for other Language enum usages:"
rg --type python "class Language\(Enum\):" -A 10

# Test 2: Search for other deserialize methods that might need Java support
echo "Searching for other deserialize methods:"
rg --type python "def deserialize\(.*\).*:.*language" -A 5

# Test 3: Search for potential hardcoded language checks that might need updating
echo "Searching for potential hardcoded language checks:"
rg --type python "(python|nodejs)(?!.*java)"

Length of output: 759


Script:

#!/bin/bash
# Description: Check for potential inconsistencies in Java support across the project

# Test 1: Search for other occurrences of Language enum usage
echo "Searching for other Language enum usages in Python files:"
rg "class Language\(Enum\):" --glob "*.py" -A 10

# Test 2: Search for other deserialize methods that might need Java support
echo "Searching for deserialize methods handling languages in Python files:"
rg "def deserialize\(.*\).*:.*language" --glob "*.py" -A 5

# Test 3a: Search for hardcoded 'python' language checks
echo "Searching for hardcoded 'python' language checks in Python files:"
rg "\\bpython\\b" --glob "*.py"

# Test 3b: Search for hardcoded 'nodejs' language checks
echo "Searching for hardcoded 'nodejs' language checks in Python files:"
rg "\\bnodejs\\b" --glob "*.py"

Length of output: 8218

sebs/openwhisk/openwhisk.py (1)

222-222: LGTM: Addition of Java configuration file

The inclusion of "Main.java" in the CONFIG_FILES dictionary for the Java language is correct and necessary for proper packaging of Java benchmarks. This change ensures that the main Java file is included in the zip archive created for deployment, similar to how Python and Node.js files are handled.

For consistency with the other entries, consider using a list even for a single file:

-            "java": ["Main.java"],
+            "java": ["Main.java"],

This change doesn't affect functionality but maintains consistency with the Python and Node.js entries, which use lists.

sebs.py (1)

66-66: Summary: Java support added to CLI

The change successfully adds Java as a language option in the CLI, which is a crucial first step in supporting Java benchmarks in the SeBS framework. This modification aligns well with the PR objectives.

However, it's important to note that this change only affects the CLI interface. To ensure full Java support, other parts of the system (such as benchmark configurations, deployment logic, and result processing) may need to be updated as well. These updates might be present in other files or could be part of future PRs.

Consider creating a checklist or documentation for adding new language support to ensure all necessary components are updated consistently across the system.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 655f644 and a14d0a0.

📒 Files selected for processing (11)
  • benchmarks/600.java/601.hello-world/config.json (1 hunks)
  • benchmarks/600.java/601.hello-world/java/pom.xml (1 hunks)
  • benchmarks/600.java/601.hello-world/java/src/main/java/faas/App.java (1 hunks)
  • benchmarks/wrappers/openwhisk/java/Main.java (1 hunks)
  • config/example2.json (1 hunks)
  • config/systems.json (1 hunks)
  • dockerfiles/openwhisk/java/Dockerfile.function (1 hunks)
  • sebs.py (1 hunks)
  • sebs/benchmark.py (3 hunks)
  • sebs/faas/function.py (2 hunks)
  • sebs/openwhisk/openwhisk.py (2 hunks)
🧰 Additional context used
🔇 Additional comments (11)
benchmarks/600.java/601.hello-world/config.json (1)

1-6: LGTM! Configuration structure looks good.

The configuration file is well-structured and contains appropriate settings for a Java "hello-world" benchmark. The timeout of 120 seconds and memory allocation of 512 MB should be sufficient for this purpose.

Let's verify if these settings are consistent with other language configurations:

This script will help us ensure that the timeout and memory settings are consistent across different language benchmarks.

✅ Verification successful

Configuration settings are consistent with other benchmarks.

The timeout of 120 seconds and memory allocation of 512 MB for the Java "hello-world" benchmark are in line with existing benchmark configurations.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check consistency of config.json files across different language benchmarks

# Find all config.json files in the benchmarks directory
config_files=$(fd --type f --glob '**/config.json' benchmarks)

echo "Comparing configurations:"
for file in $config_files; do
    echo "File: $file"
    jq '{timeout, memory}' "$file"
    echo "---"
done

echo "Unique timeout values:"
jq '.timeout' $config_files | sort -u

echo "Unique memory values:"
jq '.memory' $config_files | sort -u

Length of output: 7255

benchmarks/600.java/601.hello-world/java/pom.xml (1)

1-6: LGTM: XML declaration and project root are correctly defined.

The XML declaration, namespace declarations, and Maven model version are all correctly specified and up-to-date.

config/example2.json (3)

1-11: LGTM: Configuration aligns with PR objectives

The initial configuration settings are well-structured and correctly specify Java 8 as the runtime, which aligns with the pull request's objective of adding Java support to the Serverless Benchmarks (SeBS) framework.


12-24: Comprehensive test configurations

The performance cost and network ping-pong test configurations are well-structured and cover various aspects of benchmarking. The parameters seem reasonable, but it's important to ensure they align with your specific testing requirements.

Please verify that the following test parameters are appropriate for your Java benchmarks:

  1. Number of repetitions (50 for perf-cost, 1000 for network-ping-pong)
  2. Concurrent invocations (50)
  3. Memory sizes (128MB, 256MB)
  4. Number of threads for network-ping-pong (1)

25-41: Review eviction-model test parameters

The invocation-overhead test configuration is comprehensive, covering a wide range of payload and code sizes. However, the eviction-model test parameters might need adjustment:

  1. The number of invocations is set to 1, which may not provide statistically significant results.
  2. The function_copy_idx is set to 0, which might not be the intended value.

Please review and confirm if the eviction-model test parameters, especially the number of invocations and function_copy_idx, are set as intended for Java benchmarks.

sebs/faas/function.py (1)

266-266: LGTM: Java language support added

The addition of JAVA to the Language enum is consistent with the PR objective and follows the existing naming conventions.

sebs/openwhisk/openwhisk.py (2)

171-171: LGTM: Exclusion of "Main.java" for Java support

The addition of "Main.java" to the list of excluded files is appropriate for supporting Java benchmarks. This change ensures that the main Java entry point is handled correctly during the Docker image build process, similar to how "index.js" and "main.py" are treated for Node.js and Python respectively.


Line range hint 1-524: Summary: Java support successfully implemented

The changes in this file successfully implement support for Java benchmarks in the OpenWhisk system. The modifications to the build_base_image and package_code methods ensure that Java files are handled correctly during the build and packaging processes. These changes align well with the PR objective of adding Java support to the Serverless Benchmarks (SeBS) framework for the OpenWhisk platform.

The implementation is consistent with how Python and Node.js are handled, maintaining the overall structure and logic of the existing code. No major issues were identified, and the changes appear to be well-integrated into the existing codebase.

sebs.py (1)

66-66: Verify complete Java support implementation

While the CLI now supports Java as a language option, ensure that the following areas of the system have been updated to fully support Java benchmarks:

  1. Benchmark configuration files
  2. Function deployment logic for Java functions
  3. Result processing for Java benchmarks
  4. Any language-specific optimizations or handlers

To check for Java-related implementations in other parts of the system, run:

sebs/benchmark.py (2)

253-253: LGTM: Correctly adds 'pom.xml' to FILES for Java support

The addition of "java": ["pom.xml"] to the FILES dictionary ensures that the pom.xml file is included when copying Java benchmark code.


345-346: Duplicate of previous issue regarding Java dependency management

The call to self.add_deployment_package_java(output_dir) relies on the add_deployment_package_java method, which requires correction as mentioned in the earlier comment.

Comment on lines +4 to +6
public static void main(String[] args) {
System.out.println("Hellooooooooooooooooooo, World!");
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refine the output message and consider adding benchmark-specific functionality.

  1. The current output "Hellooooooooooooooooooo, World!" seems unusual for a standard "Hello, World!" program. Consider using a more conventional "Hello, World!" message unless there's a specific reason for the extended greeting.

  2. For benchmarking purposes, consider adding more functionality to this method. Some suggestions:

    • Add a parameter to control the number of times the message is printed.
    • Include a simple computation to measure performance.
    • Add timing code to measure execution time.
    • Implement logging to capture benchmark results.

Here's an example of how you might modify the method:

public static void main(String[] args) {
    int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 1;
    long startTime = System.nanoTime();
    
    for (int i = 0; i < iterations; i++) {
        System.out.println("Hello, World!");
    }
    
    long endTime = System.nanoTime();
    long duration = (endTime - startTime) / 1_000_000;  // Convert to milliseconds
    System.out.printf("Executed %d iterations in %d ms%n", iterations, duration);
}

This modification allows for multiple iterations and includes basic timing, which could be useful for benchmarking purposes.

FROM $BASE_IMAGE
COPY . /function/

RUN apt-get update && apt-get install -y maven
Copy link

@coderabbitai coderabbitai bot Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Optimize Maven installation to reduce image size

The Maven installation step is correct, but it can be optimized to reduce the final image size.

Consider updating the Maven installation command to clean up the apt cache:

RUN apt-get update && \
    apt-get install -y maven && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

This will remove unnecessary package files after installation, resulting in a smaller image size.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mahlashrifi Can you please add that?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment on lines +14 to +18
<!-- Configure the Java version (set to Java 8) -->
<properties>
<maven.compiler.source>1.8</maven.compiler.source> <!-- Source compatibility (Java 8) -->
<maven.compiler.target>1.8</maven.compiler.target> <!-- Target compatibility (Java 8) -->
</properties>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider updating to a more recent Java version.

The project is currently configured to use Java 8 (1.8) for both source and target compatibility. While Java 8 is still widely used, it's becoming outdated. Consider updating to a more recent LTS version like Java 11 or 17 to benefit from performance improvements and newer language features.

To update, you would change these lines:

<maven.compiler.source>11</maven.compiler.source>
<maven.compiler.target>11</maven.compiler.target>

Or for Java 17:

<maven.compiler.source>17</maven.compiler.source>
<maven.compiler.target>17</maven.compiler.target>

Comment on lines +20 to +37
<!-- Define the main class to run the JAR -->
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>3.2.0</version>
<configuration>
<archive>
<manifest>
<mainClass>faas.App</mainClass> <!-- Replace with your main class -->
</manifest>
</archive>
</configuration>
</plugin>
</plugins>
</build>
</project>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Maven-jar-plugin version is outdated and dependencies are missing.

  • The maven-jar-plugin version is 3.2.0. Update to version 3.3.0 for potential bug fixes and improvements:

    <version>3.3.0</version>
  • The <dependencies> section is missing in the pom.xml. Ensure to include all necessary dependencies to avoid runtime issues.

🔗 Analysis chain

Update plugin version and review main class package name.

  1. The maven-jar-plugin version (3.2.0) is not the latest. Consider updating to the most recent version (currently 3.3.0) for potential bug fixes and improvements:
<version>3.3.0</version>
  1. The main class is set to "faas.App". The package name "faas" is unconventional and doesn't align with standard Java package naming conventions. Consider using a more descriptive and hierarchical package name, e.g., "com.spcl.serverlessbenchmarks.helloworld.App".

  2. There are no dependencies defined in this pom.xml. If your project requires any external libraries, make sure to add them in a <dependencies> section.

To verify the main class exists and follows the package structure, run:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the main class file exists
fd --type f --full-path '.*src/main/java/faas/App.java$'

# Check the package declaration in the main class file
rg --type java '^package faas;' src/main/java/faas/App.java

Length of output: 257


Script:

#!/bin/bash
# Verify the package declaration in the main class file
rg --type java '^package faas;' benchmarks/600.java/601.hello-world/java/src/main/java/faas/App.java

Length of output: 116


Script:

#!/bin/bash
# Extract the version of maven-jar-plugin
rg '<artifactId>maven-jar-plugin<\/artifactId>' -A1 benchmarks/600.java/601.hello-world/java/pom.xml | rg '<version>([^<]+)<\/version>' -o

# Check for the presence of a <dependencies> section
rg '<dependencies>' benchmarks/600.java/601.hello-world/java/pom.xml

Length of output: 460


Script:

#!/bin/bash
# Extract the version of maven-jar-plugin
rg '<artifactId>maven-jar-plugin</artifactId>' -A1 benchmarks/600.java/601.hello-world/java/pom.xml | rg '<version>([^<]+)</version>' -o

# Check for the presence of a <dependencies> section
rg '<dependencies>' benchmarks/600.java/601.hello-world/java/pom.xml

Length of output: 234

Comment on lines +43 to +68
"deployment": {
"openwhisk": {
"shutdownStorage": false,
"removeCluster": false,
"wskBypassSecurity": "true",
"wskExec": "wsk",
"experimentalManifest": false,
"docker_registry": {
"registry": "",
"username": "",
"password": ""
},
"storage": {
"address": "",
"mapped_port": 9011,
"access_key": "",
"secret_key": "",
"instance_id": "",
"output_buckets": [],
"input_buckets": [],
"type": "minio"
}

}
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Review security settings and sensitive information handling

The deployment configuration for OpenWhisk is comprehensive. However, there are some security considerations to address:

  1. The wskBypassSecurity is set to "true". This could pose a security risk if used in a production environment.
  2. Sensitive information fields (Docker registry credentials, storage access keys) are left empty. Ensure you have a secure method to populate these fields in different environments.

Consider the following improvements:

  1. Set wskBypassSecurity to false unless absolutely necessary for testing purposes.
  2. Implement a secure method to inject sensitive information (e.g., environment variables, secrets management system) rather than hardcoding them in the configuration file.
  3. Add a comment in the file to indicate how sensitive information should be handled.

Example:

"wskBypassSecurity": false,
// ... other fields ...
"docker_registry": {
  "registry": "", // Set via DOCKER_REGISTRY env var
  "username": "", // Set via DOCKER_USERNAME env var
  "password": ""  // Set via DOCKER_PASSWORD env var
},

Comment on lines +24 to +35
boolean isCold = false;
String fileName = "/cold_run";

File file = new File(fileName);
if (!file.exists()) {
isCold = true;
try {
file.createNewFile();
} catch (IOException e) {
e.printStackTrace();
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve cold start detection mechanism

The current implementation of cold start detection has a few potential issues:

  1. The file path "/cold_run" might not be writable in all environments.
  2. The error handling for file creation only prints the stack trace, which might not be visible in some serverless environments.
  3. This approach may not accurately detect cold starts in all scenarios, especially if the container is reused but the file is deleted.

Consider the following improvements:

  1. Use a more reliable method for cold start detection, such as caching a static variable or using environment variables specific to your serverless platform.
  2. If you must use a file-based approach, use a path that's guaranteed to be writable, such as /tmp/cold_run.
  3. Improve error handling to log errors properly and potentially set isCold to true if file creation fails.

Here's an example of how you might refactor this:

private static boolean isFirstExecution = true;

// In the main method:
boolean isCold = isFirstExecution;
if (isFirstExecution) {
    isFirstExecution = false;
}

This approach uses a static variable to detect the first execution, which is typically equivalent to a cold start in serverless environments.

Comment on lines +37 to +51
// Convert to Unix timestamp in seconds.microseconds
String formattedBegin = String.format("%d.%06d", begin.getEpochSecond(), begin.getNano() / 1000); // Convert nanoseconds to microseconds
String formattedEnd = String.format("%d.%06d", end.getEpochSecond(), end.getNano() / 1000);

String requestId = System.getenv("__OW_ACTIVATION_ID");

JsonObject jsonResult = new JsonObject();
jsonObject.put("begin", formattedBegin);
jsonObject.put("end", formattedEnd);
jsonObject.put("request_id", "requestId");
jsonObject.put("compute_time", computeTime);
jsonObject.put("is_cold", isCold);
jsonObject.put("result", result);
return jsonResult;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix typos and correct JsonObject usage

There are a few issues in the result formatting section:

  1. There's a typo in the variable name when adding the request ID to the result.
  2. The method for adding properties to the JsonObject is incorrect. Java's JsonObject uses addProperty instead of put.

Apply the following changes to fix these issues:

-        JsonObject jsonResult = new JsonObject();
-        jsonObject.put("begin", formattedBegin); 
-        jsonObject.put("end", formattedEnd);
-        jsonObject.put("request_id", "requestId");  
-        jsonObject.put("compute_time", computeTime);
-        jsonObject.put("is_cold", isCold);
-        jsonObject.put("result", result);
+        JsonObject jsonResult = new JsonObject();
+        jsonResult.addProperty("begin", formattedBegin);
+        jsonResult.addProperty("end", formattedEnd);
+        jsonResult.addProperty("request_id", requestId);
+        jsonResult.addProperty("compute_time", computeTime);
+        jsonResult.addProperty("is_cold", isCold);
+        jsonResult.add("result", result);
         return jsonResult;

Note that for the "result" field, we use add instead of addProperty because result is itself a JsonObject.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Convert to Unix timestamp in seconds.microseconds
String formattedBegin = String.format("%d.%06d", begin.getEpochSecond(), begin.getNano() / 1000); // Convert nanoseconds to microseconds
String formattedEnd = String.format("%d.%06d", end.getEpochSecond(), end.getNano() / 1000);
String requestId = System.getenv("__OW_ACTIVATION_ID");
JsonObject jsonResult = new JsonObject();
jsonObject.put("begin", formattedBegin);
jsonObject.put("end", formattedEnd);
jsonObject.put("request_id", "requestId");
jsonObject.put("compute_time", computeTime);
jsonObject.put("is_cold", isCold);
jsonObject.put("result", result);
return jsonResult;
}
// Convert to Unix timestamp in seconds.microseconds
String formattedBegin = String.format("%d.%06d", begin.getEpochSecond(), begin.getNano() / 1000); // Convert nanoseconds to microseconds
String formattedEnd = String.format("%d.%06d", end.getEpochSecond(), end.getNano() / 1000);
String requestId = System.getenv("__OW_ACTIVATION_ID");
JsonObject jsonResult = new JsonObject();
jsonResult.addProperty("begin", formattedBegin);
jsonResult.addProperty("end", formattedEnd);
jsonResult.addProperty("request_id", requestId);
jsonResult.addProperty("compute_time", computeTime);
jsonResult.addProperty("is_cold", isCold);
jsonResult.add("result", result);
return jsonResult;
}

@@ -63,7 +63,7 @@ def simplified_common_params(func):
@click.option(
"--language",
default=None,
type=click.Choice(["python", "nodejs"]),
type=click.Choice(["python", "nodejs", "java"]),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Potential Inconsistencies: Missing "java" in Other Language Options

Several files still list only "python" and "nodejs" as language options. Please update the following files to include "java":

  • tools/build_docker_images.py
  • scripts/run_experiments.py
  • tests/aws/create_function.py
  • dockerfiles/local/runner.py
🔗 Analysis chain

LGTM: Java language support added

The change adds "java" as an option for the --language parameter, which is consistent with the PR objective of adding Java support to the SeBS framework.

To ensure this change is properly integrated, let's verify if there are any other parts of the code that might need updates to support Java:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other occurrences of language-related code
rg --type python 'language.*python.*nodejs' sebs.py
rg --type python 'language.*python.*nodejs' .

Length of output: 163


Script:

#!/bin/bash
# Search for other occurrences of language-related code with correct file type
rg --type py 'language.*python.*nodejs' sebs.py
rg --type py 'language.*python.*nodejs' .

Length of output: 1219

Comment on lines +304 to +313
def add_deployment_package_java(self, output_dir):
# append to the end of requirements file
packages = self._system_config.deployment_packages(
self._deployment_name, self.language_name
)
if len(packages):
with open(os.path.join(output_dir, "requirements.txt"), "a") as out:
for package in packages:
out.write(package)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Incorrect dependency handling: Appending to 'requirements.txt' in Java project

The add_deployment_package_java method appends Java packages to requirements.txt, which is used for Python dependencies. In Java Maven projects, dependencies should be managed through the pom.xml file, not requirements.txt.

Consider modifying the method to add the deployment packages to the pom.xml file by updating the ``` section accordingly.

Comment on lines +261 to +270
# copy src folder of java (java benchmarks are maven project and need directories)
if self.language_name == "java":
output_src_dir = os.path.join(output_dir, "src")

if os.path.exists(output_src_dir):
# If src dir in output exist, remove the directory and all its contents
shutil.rmtree(output_src_dir)
#To have contents of src directory in the direcory named src located in output
shutil.copytree(os.path.join(path, "src"), output_src_dir)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Improve indentation, comments, and error handling in the Java source copying logic

The indentation of comments is inconsistent, which can affect readability. Additionally, it's advisable to check if the source directory exists before attempting to copy it, to avoid potential exceptions if the directory doesn't exist.

Apply this diff to correct the issues:

    # copy src folder of java (Java benchmarks are Maven projects and need directories)
    if self.language_name == "java":
        output_src_dir = os.path.join(output_dir, "src")
        # If src dir in output exists, remove the directory and all its contents
        if os.path.exists(output_src_dir):
            shutil.rmtree(output_src_dir)
        src_dir = os.path.join(path, "src")
        # Check if src directory exists in the benchmark path
+       if os.path.exists(src_dir):
            shutil.copytree(src_dir, output_src_dir)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# copy src folder of java (java benchmarks are maven project and need directories)
if self.language_name == "java":
output_src_dir = os.path.join(output_dir, "src")
if os.path.exists(output_src_dir):
# If src dir in output exist, remove the directory and all its contents
shutil.rmtree(output_src_dir)
#To have contents of src directory in the direcory named src located in output
shutil.copytree(os.path.join(path, "src"), output_src_dir)
# copy src folder of java (Java benchmarks are Maven projects and need directories)
if self.language_name == "java":
output_src_dir = os.path.join(output_dir, "src")
# If src dir in output exists, remove the directory and all its contents
if os.path.exists(output_src_dir):
shutil.rmtree(output_src_dir)
src_dir = os.path.join(path, "src")
# Check if src directory exists in the benchmark path
if os.path.exists(src_dir):
shutil.copytree(src_dir, output_src_dir)

},
"java": {
"base_images": {
"8": "openwhisk/java8action"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we support newer Java versions, e.g., 11 or 17?

self._deployment_name, self.language_name
)
if len(packages):
with open(os.path.join(output_dir, "requirements.txt"), "a") as out:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mahlashrifi This does not look correct - requirements.txt should only be used for Python projects. In Python, we add packages needed for specific functions to this file such that pip installs them.

Shouldn't we use something else for Java?

for file_type in FILES[self.language_name]:
for f in glob.glob(os.path.join(path, file_type)):
shutil.copy2(os.path.join(path, f), output_dir)

# copy src folder of java (java benchmarks are maven project and need directories)
if self.language_name == "java":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use here the enum you created below :)

// Logger logger = Logger.getLogger(FunctionHandler.class.getName());
// logger.setLevel(Level.INFO);

Gson gson = new Gson();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we use that anywhere?

Comment on lines +18 to +22
Instant begin = Instant.now();
JsonObject result = function.handler(args);
Instant end = Instant.now();

long computeTime = Duration.between(begin, end).toNanos() / 1000; // Convert nanoseconds to microseconds
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mahlashrifi As this comment suggest, we should return an error on exception.

long computeTime = Duration.between(begin, end).toNanos() / 1000; // Convert nanoseconds to microseconds

boolean isCold = false;
String fileName = "/cold_run";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use /tmp/cold_run - /tmp is writable, other partitions are not.

FROM $BASE_IMAGE
COPY . /function/

RUN apt-get update && apt-get install -y maven
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mahlashrifi Can you please add that?

@@ -0,0 +1,8 @@
ARG BASE_IMAGE
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dockerfile is very good :-) But we should also be able to deploy the code as zip package; particularly for cloud platforms that do not support containers (Azure, GCP). Can you add that or should I take a look into that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants