
fix(robot): fix Test Longhorn dynamic provisioned RWX volume recovery #2168

Merged (1 commit) on Nov 26, 2024

chriscchien (Contributor) commented Nov 25, 2024

Which issue(s) this PR fixes:

longhorn/longhorn#9822

What this PR does / why we need it:

Update the test case "Test Longhorn dynamic provisioned RWX volume recovery" to delete the sharemanager pod instead of deleting the sharemanager CRD during testing.

Special notes for your reviewer:

Test report: https://ci.longhorn.io/job/private/job/longhorn-e2e-test/1971/

Additional documentation or context

N/A

Summary by CodeRabbit

  • New Features

    • Introduced new methods for managing sharemanager pods, enhancing clarity in operations.
    • Added retry mechanisms for pod deletion and status checks.
  • Bug Fixes

    • Improved error handling for pod recreation and running status checks.
  • Documentation

    • Updated keywords and test cases for better clarity and precision in pod management.

chriscchien self-assigned this Nov 25, 2024
chriscchien requested a review from a team as a code owner November 25, 2024 08:58

coderabbitai bot commented Nov 25, 2024

Walkthrough

The changes in this pull request involve updates to the sharemanager keywords and associated methods in the sharemanager.resource and sharemanager_keywords.py files. The modifications clarify the terminology used for managing sharemanager pods, ensuring that keywords accurately reflect their operations. Additionally, new methods for deleting sharemanager pods and waiting for their status have been introduced, replacing older methods. Test cases in component_resilience.robot have also been updated to align with these changes, enhancing clarity without altering the underlying logic.

Changes

File path, followed by its change summary:

e2e/keywords/sharemanager.resource
  • Updated keyword: Delete sharemanager of deployment ${deployment_id} and wait for recreation → Delete sharemanager pod of deployment ${deployment_id} and wait for recreation
  • Updated keyword: Wait for sharemanager of deployment ${deployment_id} running → Wait for sharemanager pod of deployment ${deployment_id} running
  • Renamed function: delete_sharemanager_and_wait_for_recreation → delete_sharemanager_pod_and_wait_for_recreation
  • Renamed function: wait_for_share_manager_running → wait_for_share_manager_pod_running

e2e/libs/keywords/sharemanager_keywords.py
  • Removed methods: delete_sharemanager, delete_sharemanager_and_wait_for_recreation
  • Added method: delete_sharemanager_pod_and_wait_for_recreation
  • Added method: wait_for_share_manager_pod_running

e2e/tests/negative/component_resilience.robot
  • Updated keyword: Delete sharemanager of deployment 0 and wait for recreation → Delete sharemanager pod of deployment 0 and wait for recreation
  • Updated keyword: And Wait for sharemanager of deployment 0 running → And Wait for sharemanager pod of deployment 0 running
  • Refined test cases for Longhorn component recovery and volume management without altering logic.

Suggested reviewers

  • yangchiu

📥 Commits

Reviewing files that changed from the base of the PR and between 1e777ba and 581f998.


coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (2)
e2e/libs/keywords/sharemanager_keywords.py (2)

59-59: Rename unused loop variable i to _

The loop variable i is not used within the loop body. It's idiomatic in Python to rename unused variables to _ to indicate that they are intentionally unused.

Apply this diff to rename the variable:

-for i in range(retry_count):
+for _ in range(retry_count):
🧰 Tools
🪛 Ruff (0.7.0)

59-59: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


74-74: Rename unused loop variable i to _

The loop variable i is not used within the loop body. Renaming it to _ indicates that it is intentionally unused.

Apply this diff to rename the variable:

-for i in range(retry_count):
+for _ in range(retry_count):
🧰 Tools
🪛 Ruff (0.7.0)

74-74: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📒 Files selected for processing (3)
  • e2e/keywords/sharemanager.resource (1 hunks)
  • e2e/libs/keywords/sharemanager_keywords.py (2 hunks)
  • e2e/tests/negative/component_resilience.robot (1 hunks)
🧰 Additional context used
🪛 Ruff (0.7.0)
e2e/libs/keywords/sharemanager_keywords.py

59-59: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


62-62: Comparison to None should be cond is None

Replace with cond is None

(E711)


68-68: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)


74-74: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


79-79: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

🔇 Additional comments (4)
e2e/keywords/sharemanager.resource (2)

29-32: LGTM! Complementary change for pod-based approach.

The keyword rename from "Wait for sharemanager" to "Wait for sharemanager pod" maintains consistency with the pod-based testing approach.


24-27: LGTM! The keyword change aligns with PR objectives.

The rename from "Delete sharemanager" to "Delete sharemanager pod" makes the operation more explicit and matches the PR's goal of testing pod deletion instead of CRD deletion.

Let's verify the implementation of the new method:

✅ Verification successful

Implementation verified and aligns with test requirements

The implementation in e2e/libs/keywords/sharemanager_keywords.py correctly handles the pod deletion and recreation:

  • Deletes the sharemanager pod using delete_pod in the longhorn-system namespace
  • Tracks the pod recreation by comparing creation timestamps
  • Uses proper retry mechanism with configurable count and interval
  • Integrates with the test case "Test Longhorn dynamic provisioned RWX volume recovery" which verifies volume recovery after component deletion
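
The flow described in the bullets above can be sketched roughly as follows. This is a minimal sketch, not the code merged in this PR: get_pod and delete_pod are passed in as callables so the example runs without a cluster, and the function name, parameter names, and default retry values are assumptions made for illustration.

```python
import time


def delete_pod_and_wait_for_recreation(
    pod_name,
    get_pod,              # callable(name, namespace) -> pod object or None (assumed shape)
    delete_pod,           # callable(name, namespace) (assumed shape)
    namespace="longhorn-system",
    retry_count=5,        # hypothetical default
    retry_interval=0.01,  # hypothetical default, in seconds
):
    """Delete a pod, then poll until a pod with a newer creation timestamp appears."""
    pod = get_pod(pod_name, namespace)
    if pod is None:
        raise RuntimeError(f"pod {pod_name} not found in namespace {namespace}")
    last_creation_time = pod.metadata.creation_timestamp

    delete_pod(pod_name, namespace)

    for _ in range(retry_count):
        time.sleep(retry_interval)
        pod = get_pod(pod_name, namespace)
        if pod is None:
            # The replacement pod does not exist yet; keep polling.
            continue
        if pod.metadata.creation_timestamp > last_creation_time:
            # A newer timestamp means the pod was recreated.
            return pod

    raise AssertionError(f"sharemanager pod {pod_name} not recreated")
```

Comparing creation timestamps, rather than only checking that a pod with the expected name exists, is what distinguishes a recreated pod from the original one that has not been torn down yet.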
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation of delete_sharemanager_pod_and_wait_for_recreation

# Search for the implementation to ensure it handles pod deletion correctly
ast-grep --pattern 'def delete_sharemanager_pod_and_wait_for_recreation($$$)'

Length of output: 79


Script:

#!/bin/bash
# Let's try searching with ripgrep for the implementation
rg -A 10 "delete_sharemanager_pod_and_wait_for_recreation" 

# Also search for any Python/Robot Framework files that might contain this implementation
fd -e py -e robot -e resource | xargs rg "delete_sharemanager_pod_and_wait_for_recreation"

Length of output: 2008


Script:

#!/bin/bash
# Let's verify the related functions used in the implementation
rg -A 5 "def (get_pod|delete_pod|get_retry_count_and_interval)" 

# Also check if there are any tests using this functionality
rg "Test Longhorn dynamic provisioned RWX volume recovery" -A 10

Length of output: 6919

e2e/tests/negative/component_resilience.robot (1)

177-178: LGTM! The change improves the testing approach.

The modification to delete the sharemanager pod instead of the CRD is a better approach because:

  1. It tests a more realistic failure scenario
  2. It's less disruptive as it only affects a single sharemanager instance
  3. It maintains the CRD definition while verifying the pod recovery mechanism
e2e/libs/keywords/sharemanager_keywords.py (1)

10-10: Import statements approved

The import of get_pod and delete_pod is necessary for the new functionality and is correctly implemented.

    if creation_time > last_creation_time:
        return

assert False, f"sharemanager pod {sharemanager_pod_name} not recreated"

⚠️ Potential issue

Replace assert False with raising an exception

Using assert False is not reliable because assertions can be disabled with optimization flags (e.g., python -O). Instead, raise an AssertionError or a more specific exception to ensure the error is always reported.

Apply this diff to raise an exception:

-assert False, f"sharemanager pod {sharemanager_pod_name} not recreated"
+raise AssertionError(f"sharemanager pod {sharemanager_pod_name} not recreated")
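
The difference can be observed without restarting the interpreter, because the built-in compile() accepts an optimize argument that mirrors the -O flag. A small sketch follows; the function names check_with_assert and check_with_raise and the message text are made up for illustration.

```python
SOURCE = '''
def check_with_assert():
    assert False, "pod not recreated"
    return "assert was skipped"

def check_with_raise():
    raise AssertionError("pod not recreated")
'''


def load(optimize):
    """Compile SOURCE at the given optimization level and return its namespace."""
    namespace = {}
    exec(compile(SOURCE, "<demo>", "exec", optimize=optimize), namespace)
    return namespace


def outcome(func):
    """Call func and report either its return value or that it raised."""
    try:
        return func()
    except AssertionError:
        return "AssertionError"


normal = load(optimize=0)     # behaves like plain `python`
optimized = load(optimize=1)  # behaves like `python -O`: assert statements are dropped

results = {
    "assert, normal": outcome(normal["check_with_assert"]),
    "assert, -O": outcome(optimized["check_with_assert"]),
    "raise, normal": outcome(normal["check_with_raise"]),
    "raise, -O": outcome(optimized["check_with_raise"]),
}
print(results)
```

Only the assert-based check changes behavior between the two levels; the explicit raise fails in both, which is why it is the safer choice for a retry loop's final failure.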

Comment on lines +54 to +55
sharemanager_pod = get_pod(sharemanager_pod_name, "longhorn-system")
last_creation_time = sharemanager_pod.metadata.creation_timestamp

⚠️ Potential issue

Handle possible None return value from get_pod

get_pod may return None if the pod does not exist. Attempting to access sharemanager_pod.metadata.creation_timestamp when sharemanager_pod is None will raise an AttributeError. Ensure that sharemanager_pod is not None before accessing its attributes.

Apply this diff to add a check for None:

 sharemanager_pod = get_pod(sharemanager_pod_name, "longhorn-system")
+if sharemanager_pod is None:
+    raise Exception(f"Pod {sharemanager_pod_name} not found in namespace longhorn-system")
 last_creation_time = sharemanager_pod.metadata.creation_timestamp

for i in range(retry_count):
    time.sleep(retry_interval)
    sharemanager_pod = get_pod(sharemanager_pod_name, "longhorn-system")
    if sharemanager_pod == None:

🛠️ Refactor suggestion

Use is None instead of == None for None comparison

In Python, the preferred way to compare to None is to use is None for correctness and readability.

Apply this diff to fix the comparison:

-if sharemanager_pod == None:
+if sharemanager_pod is None:
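
One concrete reason `is None` is the safer check: `==` dispatches to `__eq__`, which a class may override, while `is` compares identity and cannot be overridden. The AlwaysEqual class below is a contrived, hypothetical example and not something from this PR.

```python
class AlwaysEqual:
    """Contrived object whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

equals_none = (obj == None)  # noqa: E711 - deliberately showing the pitfall
is_none = (obj is None)

print(equals_none, is_none)  # prints: True False
```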

def wait_for_share_manager_running(self, name):
    return self.sharemanager.wait_for_running(name)
assert False, f"sharemanager pod {sharemanager_pod_name} not running"

⚠️ Potential issue

Replace assert False with raising an exception

Using assert False is not recommended as assertions can be skipped. Raise an AssertionError or a more appropriate exception instead.

Apply this diff to raise an exception:

-assert False, f"sharemanager pod {sharemanager_pod_name} not running"
+raise AssertionError(f"sharemanager pod {sharemanager_pod_name} not running")

Comment on lines +75 to +76
sharemanager_pod = get_pod(sharemanager_pod_name, "longhorn-system")
if sharemanager_pod.status.phase == "Running":

⚠️ Potential issue

Handle possible None return value from get_pod

If get_pod returns None, accessing sharemanager_pod.status.phase will raise an AttributeError. Check if sharemanager_pod is not None before accessing its attributes.

Apply this diff to handle None values:

 sharemanager_pod = get_pod(sharemanager_pod_name, "longhorn-system")
+if sharemanager_pod is None:
+    continue
 if sharemanager_pod.status.phase == "Running":
     return
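
A self-contained sketch of a polling loop with that guard in place. Here get_pod is injected as a callable so the example runs without a cluster; the function name, parameter names, and default retry values are assumptions, not the PR's actual signature.

```python
import time


def wait_for_pod_running(
    pod_name,
    get_pod,              # callable(name, namespace) -> pod object or None (assumed shape)
    namespace="longhorn-system",
    retry_count=5,        # hypothetical default
    retry_interval=0.01,  # hypothetical default, in seconds
):
    """Poll until the pod exists and reports phase "Running"."""
    for _ in range(retry_count):
        pod = get_pod(pod_name, namespace)
        # A missing pod is treated like a not-yet-running pod: keep polling.
        if pod is not None and pod.status.phase == "Running":
            return
        time.sleep(retry_interval)

    raise AssertionError(f"sharemanager pod {pod_name} not running")
```

Treating None as "not ready yet" matters right after a pod deletion, when there can be a short window in which no pod with the expected name exists.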

Committable suggestion skipped: line range outside the PR's diff.

yangchiu (Member) left a comment

LGTM

yangchiu (Member) left a comment

LGTM
