Skip to content
This repository has been archived by the owner on Dec 4, 2024. It is now read-only.

Discussion: MaxFaultyNodes and NumValid wrong math? #236

Closed
mrwillis opened this issue Nov 23, 2021 · 3 comments · Fixed by #238
Closed

Discussion: MaxFaultyNodes and NumValid wrong math? #236

mrwillis opened this issue Nov 23, 2021 · 3 comments · Fixed by #238
Assignees
Labels
documentation Improvements or additions to documentation

Comments

@mrwillis
Copy link
Contributor

mrwillis commented Nov 23, 2021

MaxFaultyNodes and NumValid wrong math?

Description

If I'm reading the spec correctly from ethereum/EIPs#650, the system can tolerate AT MOST F faulty nodes in a N validator nodes network, where N = 3F + 1 implies F = (N - 1)/3.

The function looks like this

// MaxFaultyNodes returns the maximum number of allowed faulty nodes, based on the current validator set
func (v *ValidatorSet) MaxFaultyNodes() int {
	// numberOfValidators / 3
	return int(math.Ceil(float64(len(*v))/3)) - 1
}

Shouldn't this be

// MaxFaultyNodes returns the maximum number of allowed faulty nodes, based on the current validator set
func (v *ValidatorSet) MaxFaultyNodes() int {
	// numberOfValidators / 3
	return int(math.Ceil(float64((len(*v)-1)/3)))
}

I am not sure if this should be a math.Ceil or math.Floor. I think intuitively it should be a ceiling.

Moreover this function

https://github.com/0xPolygon/polygon-sdk/blob/7f2e61d19b63bde38f52925dee4eefc47e03ddf5/consensus/ibft/state.go#L101

is 2F, but according to the spec it should be 2F + 1. Eventhough this looks wrong, it ends up being okay here because of the greater than https://github.com/0xPolygon/polygon-sdk/blob/7f2e61d19b63bde38f52925dee4eefc47e03ddf5/consensus/ibft/ibft.go#L627 and here https://github.com/0xPolygon/polygon-sdk/blob/7f2e61d19b63bde38f52925dee4eefc47e03ddf5/consensus/ibft/ibft.go#L632

but it looks incorrect here.

https://github.com/0xPolygon/polygon-sdk/blob/7f2e61d19b63bde38f52925dee4eefc47e03ddf5/consensus/ibft/ibft.go#L784

Believe it should be greater than, or just change the NumValid to be consistent with the spec.

Your environment

  • OS and version Ubuntu 20
  • version of the Polygon SDK
  • branch that causes this issue develop

Steps to reproduce

  • Tell us how to reproduce this issue
  • Where the issue is, if you know
@mrwillis mrwillis changed the title Discussion: MaxFaultyNodes wrong math? Discussion: MaxFaultyNodes and NumValid wrong math? Nov 23, 2021
@zivkovicmilos zivkovicmilos self-assigned this Nov 24, 2021
@zivkovicmilos zivkovicmilos linked a pull request Nov 24, 2021 that will close this issue
8 tasks
@zivkovicmilos
Copy link
Contributor

Hey @mrwillis,

As always, thank you for reaching out and for opening up a discussion on this topic.

I've went through the code and concluded that the implementation that we currently have in runRoundChangeState (that's causing the confusion with NumValid), although very clunky is indeed correct.

The node that sends out messages to other nodes (gossips them) will also have the same messages in its own msg queue.
This happens with all message types, apart from Preprepare messages.
Essentially, the check you've pointed out in runRoundChangeState will already consider the missing 1, since it was prepended previously when the sendRoundChange was gossiped (you can think of it as it starts counting from 1, instead of 0).

I've updated the in-code documentation on the PR #238 to reflect these changes.

@zivkovicmilos
Copy link
Contributor

zivkovicmilos commented Nov 24, 2021

Additionally, I've looked into the math behind MaxFaultyNodes, turns out it was okay, just that it was written differently than it should. I've opened up a small snippet illustrating the change - your hunch was right, it was indeed supposed to be a floor value with the -1 being inside the equation 🙂
https://goplay.tools/snippet/SBRHSMZoXSU

@mrwillis
Copy link
Contributor Author

Awesome thanks @zivkovicmilos

@zivkovicmilos zivkovicmilos added the documentation Improvements or additions to documentation label Nov 25, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants