Hard-code consensus.timeout_commit to 3.5s for mainnet. #2196

Merged
SpicyLemon merged 13 commits into main from dwedul/2121-commit-timeout
Oct 24, 2024

Conversation

SpicyLemon

@SpicyLemon SpicyLemon commented Oct 22, 2024

Description

closes: #2121

For non-testnet nodes, hard-code the consensus.timeout_commit value to 3.5s. I.e. for mainnet nodes, that field in the config is now ignored and will always be 3.5s. Testnet nodes will still use the config file's value; it's only mainnet nodes that are hard-coded to 3.5s.


Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (see CONTRIBUTING.md).
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Wrote unit and integration tests
  • Updated relevant documentation (docs/) or specification (x/<module>/spec/).
  • Added relevant godoc comments.
  • Added relevant changelog entries under .changelog/unreleased (see Adding Changes).
  • Re-reviewed Files changed in the Github PR explorer.
  • Review Codecov Report in the comment section below once CI passes.

Summary by CodeRabbit

Release Notes

  • New Features

    • Standardized the consensus timeout for commit operations to 3.5 seconds in the mainnet environment.
    • Introduced a new utility to manage the PIO_TESTNET environment variable for improved test isolation.
  • Bug Fixes

    • Enhanced error handling and logging for configuration settings related to consensus timeout.
    • Improved command-line interface usability by refining flag handling and preventing conflicts.
  • Tests

    • Added new test cases to validate the behavior of the pre-upgrade command under various configurations.
    • Expanded test cases for checking the isTestnetFlagSet function across different scenarios.

@SpicyLemon SpicyLemon requested a review from a team as a code owner October 22, 2024 22:42

coderabbitai bot commented Oct 22, 2024

Walkthrough

The changes in this pull request introduce a hard-coded configuration value for consensus.timeout_commit, set to 3.5 seconds, to standardize timeout durations in the consensus mechanism. Additionally, the test suite for the pre-upgrade command is enhanced with new test cases to validate behavior under various configurations. Modifications are made to ensure that environment variables do not interfere with tests, and a new utility for managing the PIO_TESTNET environment variable is added to improve test reliability.

Changes

File Path | Change Summary
.changelog/unreleased/improvements/2121-commit-timeout.md | Hard-coded consensus.timeout_commit value set to 3.5s.
cmd/provenanced/cmd/pre_upgrade_test.go | Updated TestPreUpgradeCmd with new test cases for timeout commit under mainnet and testnet configurations.
cmd/provenanced/cmd/root_test.go | Added import for testutil, updated TestIsTestnetFlagSet for better isolation of environment variables.
cmd/provenanced/config/interceptor.go | Modified InterceptConfigsPreRunHandler to set consensus.timeout_commit for non-testnet environments.
testutil/testnet.go | Introduced TestnetEnvVar constant and UnsetTestnetEnvVar function for managing the PIO_TESTNET variable.

Assessment against linked issues

Objective Addressed Explanation
Hard-code consensus.timeout_commit value (2121)

Possibly related PRs

Suggested reviewers

  • Taztingo
  • nullpointer0x00

Poem

In the meadow where bunnies play,
A timeout's set, come what may.
With tests that hop and bounce around,
Ensuring all is safe and sound.
So let's rejoice, both near and far,
For changes bright as a shining star! 🐇✨



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (4)
testutil/testnet.go (1)

5-5: Add godoc comment for exported constant.

As TestnetEnvVar is an exported constant, it should have a godoc comment explaining its purpose and usage.

+// TestnetEnvVar is the environment variable name used to indicate if a node is running in testnet mode.
 const TestnetEnvVar = "PIO_TESTNET"
cmd/provenanced/config/interceptor.go (1)

61-63: Consider implementing a more flexible configuration strategy.

The current implementation hard-codes a critical consensus parameter at the configuration interceptor level. Consider these architectural improvements:

  1. Implement a more flexible configuration strategy that allows for different network profiles (e.g., mainnet, testnet, devnet).
  2. Consider moving network-specific configurations to a dedicated configuration provider.
  3. Add telemetry or logging when overriding user configurations to aid in debugging.

This would make the system more maintainable and easier to adapt for different network requirements in the future.

cmd/provenanced/cmd/pre_upgrade_test.go (2)

270-272: Consider using a more realistic test value.

The test timeout commit value of 777 seconds (almost 13 minutes) seems unrealistically high for a consensus timeout. Consider using a value closer to real-world scenarios, such as 10s, to make the tests more representative of actual use cases.


439-488: LGTM: Comprehensive test coverage for timeout commit behavior.

The new test cases thoroughly validate the timeout commit behavior:

  1. Mainnet tests (unpacked/packed) verify that the timeout is hard-coded to 3.5s regardless of config
  2. Testnet tests (unpacked/packed) verify that the configured timeout (777s) is preserved

However, there's room for improvement in the test coverage.

Consider adding these test cases:

  1. Edge cases for timeout values (0s, negative values)
  2. Boundary testing around the 3.5s value
  3. Tests with decimal values for timeout

Example test case:

+		{
+			name: "unpacked mainnet zero timeout commit",
+			setup: func(t *testing.T) (string, func(), bool) {
+				zeroTimeoutCfg := config.DefaultCmtConfig()
+				zeroTimeoutCfg.Consensus.TimeoutCommit = 0
+				home, success := newHome(t, "unpacked_mainnet_zero_timeout", appCfgD, zeroTimeoutCfg, clientCfgD)
+				return home, nil, success
+			},
+			expExitCode:  0,
+			expInStdout:  []string{successMsg},
+			expAppCfg:    appCfgD,
+			expCmtCfg:    cmtCfgD,
+			expClientCfg: clientCfgD,
+		},
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 0bf059c and 50b40d2.

📒 Files selected for processing (5)
  • .changelog/unreleased/improvements/2121-commit-timeout.md (1 hunks)
  • cmd/provenanced/cmd/pre_upgrade_test.go (3 hunks)
  • cmd/provenanced/cmd/root_test.go (2 hunks)
  • cmd/provenanced/config/interceptor.go (1 hunks)
  • testutil/testnet.go (1 hunks)
🧰 Additional context used
🪛 LanguageTool
.changelog/unreleased/improvements/2121-commit-timeout.md

[uncategorized] ~1-~1: This word is normally spelled with a hyphen.
Context: * Hard code the mainnet consensus.timeout_commit ...

(HARD_CODE_COMPOUND)

🔇 Additional comments (5)
testutil/testnet.go (3)

1-3: LGTM!

The package name and import are appropriate for the functionality provided.


7-17: LGTM! Excellent documentation.

The documentation is comprehensive, explaining the purpose, context, and usage clearly. It also provides important information about why this utility exists instead of using t.Setenv.


18-28: 🛠️ Refactor suggestion

Consider error handling and concurrent test scenarios.

While the implementation is clean and follows Go patterns, there are a few considerations:

  1. The os.Unsetenv and os.Setenv calls can return errors that are currently ignored.
  2. Environment variables are process-wide, which could lead to race conditions in concurrent tests.

Let's check if there are any concurrent tests that might be affected:

Consider this enhanced implementation:

 func UnsetTestnetEnvVar() func() {
 	if origVal, ok := os.LookupEnv(TestnetEnvVar); ok {
-		os.Unsetenv(TestnetEnvVar)
+		if err := os.Unsetenv(TestnetEnvVar); err != nil {
+			panic(fmt.Sprintf("failed to unset %s: %v", TestnetEnvVar, err))
+		}
 		return func() {
-			os.Setenv(TestnetEnvVar, origVal)
+			if err := os.Setenv(TestnetEnvVar, origVal); err != nil {
+				panic(fmt.Sprintf("failed to restore %s: %v", TestnetEnvVar, err))
+			}
 		}
 	}
 	return func() {
-		os.Unsetenv(TestnetEnvVar)
+		_ = os.Unsetenv(TestnetEnvVar) // Ignore error on cleanup
 	}
 }

Also, consider adding a warning in the documentation about concurrent test usage:

 // Standard usage: defer testutil.UnsetTestnetEnvVar()()
+//
+// Note: This function modifies process-wide environment variables and should be used with
+// caution in concurrent tests. Consider using t.Setenv for concurrent test cases instead.
 func UnsetTestnetEnvVar() func() {
cmd/provenanced/cmd/pre_upgrade_test.go (1)

255-255: LGTM: Default timeout commit aligns with PR objective.

The default timeout commit value of 3.5s (3500ms) aligns with the PR objective to hard-code this value for mainnet nodes.

cmd/provenanced/cmd/root_test.go (1)

344-345: LGTM! Good test hygiene with environment cleanup.

The addition of testutil.UnsetTestnetEnvVar() cleanup is a good practice as it prevents test pollution by ensuring the PIO_TESTNET environment variable is reset after each test run, regardless of the test outcome.
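The cleanup pattern described here can be sketched in isolation. The helper below is a local stand-in that mirrors the version of UnsetTestnetEnvVar shown in the diff above (before the suggested error handling); testnetEnvVisible and checkWithCleanEnv are hypothetical names added only for illustration and are not part of the PR.

```go
package main

import "os"

// TestnetEnvVar matches the constant name and value quoted in the review.
const TestnetEnvVar = "PIO_TESTNET"

// UnsetTestnetEnvVar is a local stand-in for the testutil helper.
// It removes the variable and returns a func that puts things back.
func UnsetTestnetEnvVar() func() {
	if origVal, ok := os.LookupEnv(TestnetEnvVar); ok {
		os.Unsetenv(TestnetEnvVar)
		return func() {
			os.Setenv(TestnetEnvVar, origVal)
		}
	}
	return func() {
		os.Unsetenv(TestnetEnvVar)
	}
}

// testnetEnvVisible is a hypothetical stand-in for code under test
// that reads the environment variable.
func testnetEnvVisible() bool {
	_, ok := os.LookupEnv(TestnetEnvVar)
	return ok
}

// checkWithCleanEnv shows the "defer ...()()" pattern: the first call
// unsets the variable right away, and the returned func runs on exit.
func checkWithCleanEnv() bool {
	defer UnsetTestnetEnvVar()()
	return testnetEnvVisible()
}
```

Because the return expression is evaluated before the deferred restore runs, checkWithCleanEnv reports false even when PIO_TESTNET was set by the caller's shell. The variable is process-wide, so this sketch, like the original, assumes the calling test does not run in parallel with others that read it.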

iramiller
iramiller previously approved these changes Oct 22, 2024
…p-timeout-commit fields separately. Issue a warning if skip-timeout-commit is true. Issue a warning if the timeout commit is not exactly what we want it to be.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
cmd/provenanced/cmd/root.go (2)

334-346: Consider enhancing the validation scope.

The validation only occurs for the "pio-mainnet-1" chain ID. Consider:

  1. Adding validation for other production chain IDs that might exist.
  2. Adding a comment explaining why these specific values are required for mainnet.

Add a comment explaining the validation:

 if chainID == "pio-mainnet-1" {
+    // Validate consensus timeout settings for mainnet.
+    // These specific values are required to ensure consistent block times across all mainnet nodes.
     skipTimeoutCommit := cast.ToBool(appOpts.Get(config.ConsensusSkipTimeoutCommitKey))

334-346: Consider improving error handling.

Currently, errors are only logged. For critical consensus settings in mainnet, consider:

  1. Making these errors more prominent (e.g., panic after logging all validation errors).
  2. Adding metrics or alerts for monitoring these misconfigurations.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 50b40d2 and 4b30ae4.

📒 Files selected for processing (3)
  • cmd/provenanced/cmd/root.go (1 hunks)
  • cmd/provenanced/cmd/root_test.go (4 hunks)
  • cmd/provenanced/config/interceptor.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/provenanced/cmd/root_test.go
🧰 Additional context used
📓 Learnings (1)
cmd/provenanced/config/interceptor.go (1)
Learnt from: SpicyLemon
PR: provenance-io/provenance#2196
File: cmd/provenanced/config/interceptor.go:61-63
Timestamp: 2024-10-22T23:48:27.590Z
Learning: When setting `consensus.timeout_commit` for non-testnet environments in `cmd/provenanced/config/interceptor.go`, it's acceptable to override user configurations without additional documentation, as consistency across nodes is prioritized.
🔇 Additional comments (4)
cmd/provenanced/config/interceptor.go (2)

26-30: LGTM! Well-structured constant definitions.

The new constants are well-organized and clearly named, making the configuration keys and values easily maintainable.


66-69: LGTM! Configuration override implements the standardization objective.

The implementation correctly enforces the consensus timeout settings for non-testnet environments, which aligns with the goal of standardizing timing across nodes.

Let's verify the configuration override:

✅ Verification successful

Let me gather more information about the consensus timeout configuration to verify the values being set.

The previous search didn't show all references, so I'm expanding the search to find the constant definitions and all related configurations. This will help us verify if the values are consistently defined and applied.


Based on the gathered information, I can now provide the final verification response:

Configuration override values are correctly defined and consistently enforced

The verification confirms that:

  • The consensus timeout values are properly defined as constants:
    • ConsensusTimeoutCommitValue = "3.5s"
    • ConsensusSkipTimeoutCommitValue = "false"
  • These values are correctly set in the interceptor for non-testnet environments
  • The values are properly referenced in validation checks (cmd/provenanced/cmd/root.go)
  • The implementation aligns with the standardization goal for mainnet nodes

The implementation is working as intended, enforcing consistent consensus timeout settings across non-testnet nodes while allowing flexibility for testnet environments.
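As a rough illustration of the override summarized above, here is a minimal sketch that uses a plain map in place of the real configuration object. The constant values and the names ConsensusTimeoutCommitValue, ConsensusSkipTimeoutCommitValue, and ConsensusSkipTimeoutCommitKey are taken from this thread; ConsensusTimeoutCommitKey and applyConsensusOverrides are assumed names, and the actual interceptor presumably works against its own config types rather than a map.

```go
package main

// Keys and values as quoted in this thread.
// ConsensusTimeoutCommitKey is an assumed name for the timeout key.
const (
	ConsensusTimeoutCommitKey       = "consensus.timeout_commit"
	ConsensusSkipTimeoutCommitKey   = "consensus.skip_timeout_commit"
	ConsensusTimeoutCommitValue     = "3.5s"
	ConsensusSkipTimeoutCommitValue = "false"
)

// applyConsensusOverrides is a hypothetical helper. On non-testnet
// nodes it replaces whatever the config file said with the fixed
// values; on testnet nodes it leaves the settings untouched.
func applyConsensusOverrides(settings map[string]string, isTestnet bool) {
	if isTestnet {
		return
	}
	settings[ConsensusTimeoutCommitKey] = ConsensusTimeoutCommitValue
	settings[ConsensusSkipTimeoutCommitKey] = ConsensusSkipTimeoutCommitValue
}
```

Forcing the skip flag to false alongside the timeout follows the commit note in this PR: a 3.5s timeout would have no effect if the node were allowed to skip it.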

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that the consensus timeout settings are correctly applied
# Test: Check if the timeout values are consistently set across the codebase
echo "=== Checking consensus timeout references ==="
rg "consensus\.timeout_commit.*3\.5s" --type go

echo -e "\n=== Checking skip timeout references ==="
rg "consensus\.skip_timeout_commit.*false" --type go

Length of output: 373


Script:

#!/bin/bash
# Check the actual constant values and their usage
echo "=== Checking consensus timeout constant definitions ==="
rg "ConsensusTimeoutCommitValue|ConsensusSkipTimeoutCommitValue" -A 2 --type go

echo -e "\n=== Checking all consensus timeout related configurations ==="
rg "consensus\.timeout_commit" --type go -A 2

echo -e "\n=== Checking config interceptor implementation ==="
cat cmd/provenanced/config/interceptor.go

Length of output: 9211

cmd/provenanced/cmd/root.go (2)

334-339: LGTM: Skip timeout commit validation looks good.

The validation logic for consensus.skip_timeout_commit is well-implemented, with clear error messaging when the configuration doesn't match the expected value.


341-346: LGTM: Timeout commit validation looks good.

The validation logic for consensus.timeout_commit is well-implemented, with clear error messaging when the configuration doesn't match the expected value.

Let's verify that the expected timeout commit value is correctly defined:

✅ Verification successful

The search results show that the expected timeout commit value is defined as "3.5s" in the config package, which aligns with the validation being performed in the root.go file. Let me verify one more aspect to ensure the consistency of this configuration value across the codebase.


Timeout commit validation is correctly implemented and consistent

The verification confirms that:

  1. The expected timeout commit value "3.5s" is properly defined in cmd/provenanced/config/interceptor.go
  2. The value is consistently used across the codebase, including test cases
  3. The validation logic in root.go correctly uses this configuration value
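The two checks reviewed above can be sketched as a small standalone function. This is not the code from root.go: settingsWarnings is a hypothetical name, the function returns messages instead of logging them, and it takes the timeout as a string and parses it with time.ParseDuration, which is an assumption about how the comparison could be done. Only the chain ID "pio-mainnet-1", the two config keys, and the expected values come from this thread.

```go
package main

import (
	"fmt"
	"time"
)

const (
	mainnetChainID    = "pio-mainnet-1"
	wantTimeoutCommit = 3500 * time.Millisecond // 3.5s
)

// settingsWarnings is a hypothetical helper. It evaluates the skip
// flag and the timeout separately, so a node can get zero, one, or
// two warnings. Chains other than mainnet are never checked.
func settingsWarnings(chainID string, skipTimeoutCommit bool, timeoutCommit string) []string {
	if chainID != mainnetChainID {
		return nil
	}
	var warnings []string
	if skipTimeoutCommit {
		warnings = append(warnings,
			"consensus.skip_timeout_commit is true; expected false")
	}
	got, err := time.ParseDuration(timeoutCommit)
	switch {
	case err != nil:
		warnings = append(warnings,
			fmt.Sprintf("consensus.timeout_commit %q is not a valid duration: %v", timeoutCommit, err))
	case got != wantTimeoutCommit:
		warnings = append(warnings,
			fmt.Sprintf("consensus.timeout_commit is %s; expected %s", got, wantTimeoutCommit))
	}
	return warnings
}
```

Comparing parsed durations rather than raw strings means that "3500ms" and "3.5s" are treated as the same setting in this sketch.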
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the consensus timeout commit value definition
# Expected: Should find the value definition in the config package

rg -A 1 "ConsensusTimeoutCommitValue\s*=\s*" 

Length of output: 231


Script:

#!/bin/bash
# Let's verify if this timeout commit value is referenced or defined elsewhere in the codebase
# to ensure there are no conflicting definitions

rg -g '!*.md' -g '!*.txt' "3.5s" -A 2

Length of output: 527

@SpicyLemon SpicyLemon enabled auto-merge (squash) October 23, 2024 17:53
@SpicyLemon SpicyLemon merged commit ae45757 into main Oct 24, 2024
34 checks passed
@SpicyLemon SpicyLemon deleted the dwedul/2121-commit-timeout branch October 24, 2024 20:32
SpicyLemon added a commit that referenced this pull request Oct 24, 2024
* [2121]: Change the default consensus timeout value to 3.5 seconds.

* [2121]: Hard-code the consensus.timeout_commit value.

* [2121]: Fix TestIsTestnetFlagSet to not be affected by existing env vars.

* [2121]: Fix a couple unit tests that broke when I changed the default commit timeout.

* [2121]: Only hard-code the timeout commit on non-testnets.

* [2121]: Change the default back to 1.5s for faster default testnets.

* [2121]: Fix the TestPreUpgradeCmd that broke because of the hard-coded timeout commit.

* [2121]: Add some unit tests that make sure the consensus timeout commit value is behaving as expected.

* [2121]: Add changelog entry.

* [2121]: When forcing the timeout_commit to be 3.5 seconds, also force the skip flag to be false.

* [2121]: Update warnAboutSettings: Evaluate the timeout commit and skip-timeout-commit fields separately. Issue a warning if skip-timeout-commit is true. Issue a warning if the timeout commit is not exactly what we want it to be.
SpicyLemon added a commit that referenced this pull request Oct 24, 2024
…tion events), #2196 (timeout_commit), #2197 (recordspec cmd), #2198 (ParameterChangeProposal) #2199 (wasm build-address cmd). (#2200)

* Suppress scope value owner migration events. (#2195)

* Create a no-op event manager and use that during the metadata module migration.

* Do not suppress the events for a testnet upgrade since they were emitted when the migration ran on testnet.

* Add changelog entry.

* Update all the spec proto links to reference v1.20.0 (#2192)

* Update all the spec proto links to reference v1.20.0 (instead of 1.19.0).

* Add changelog entry.

* When prepping a release, combine the dependency bump changelog entries. (#2181)

* Add a note to get-dep-changes to alert folks that changing those formats might break other things.

* Create an awk script that will combine dependency changelog entries. Update prep-release to use it. Also apply a couple fixes that are already in the release branch (and will be in main shortly). Also tweak the step 4 and 5 names to provide more context, and fix the verbose output header when recombining the sections.

* Add changelog entry.

* Clarify the new comment in get-dep-changes.sh.

* Update stuff that uses or talks about RELEASE_NOTES.md because it should actually be RELEASE_CHANGELOG.md. The SDK uses _NOTES but only puts a blurb in there, so it's not a changelog. But we include a changelog, so it makes sense to keep it named that way.

* Fix the `query metadata recordspec` command when given a rec-spec-id. (#2197)

* [2148]: Fix the query metadata recordspec command to correctly use the RecordSpecification query (instead of RecordSpecificationsForContractSpecification) when provided a record specification id.

* [2148]: Add changelog entry.

* Fix decoding of gov props with a ParameterChangeProposal in them. (#2198)

* Write a unit test that fails to parse a gov proposal with a ParameterChangeProposal in it because that type isn't being registered anymore.

* Register the params module stuff with the codecs since there's some gov props with a ParameterChangeProposal in them.

* Add changelog entry.

* Hard-code consensus.timeout_commit to 3.5s for mainnet. (#2196)

* [2121]: Change the default consensus timeout value to 3.5 seconds.

* [2121]: Hard-code the consensus.timeout_commit value.

* [2121]: Fix TestIsTestnetFlagSet to not be affected by existing env vars.

* [2121]: Fix a couple unit tests that broke when I changed the default commit timeout.

* [2121]: Only hard-code the timeout commit on non-testnets.

* [2121]: Change the default back to 1.5s for faster default testnets.

* [2121]: Fix the TestPreUpgradeCmd that broke because of the hard-coded timeout commit.

* [2121]: Add some unit tests that make sure the consensus timeout commit value is behaving as expected.

* [2121]: Add changelog entry.

* [2121]: When forcing the timeout_commit to be 3.5 seconds, also force the skip flag to be false.

* [2121]: Update warnAboutSettings: Evaluate the timeout commit and skip-timeout-commit fields separately. Issue a warning if skip-timeout-commit is true. Issue a warning if the timeout commit is not exactly what we want it to be.

* Fix: Add node flag to WASM queries (build-address) (#2199)
Successfully merging this pull request may close these issues.

Centralize some of the consensus timing config fields.
4 participants