Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Create config regardless of connection failures #179

Merged
merged 21 commits into from
Mar 26, 2024
Merged

Conversation

aleksander-vedvik
Copy link
Collaborator

A config can be created regardless of a connection to the nodes have failed or not. Instead the channel will try to connect (only once) to the node for each message.

The channel needs to know whether a connection has been established. If not, it will dial the node. Otherwise, it will try to create a stream to the node.

The channel will try to connect to the node for each message if a
connection has not yet been established or if the node has disconnected.
@aleksander-vedvik aleksander-vedvik linked an issue Mar 13, 2024 that may be closed by this pull request
@meling meling changed the title feat: A config can be feat: Create config regardless of connection failures Mar 19, 2024
channel.go Outdated Show resolved Hide resolved
Copy link
Member

@meling meling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this nice work! I've reviewed this without any experimentation; our current tests are passing. I would feel much more confident about approving this if you could add some tests that will fail on the current master branch, but passes with this PR.

Some of my comments are nits that aren't important for the semantics, but will help to improve code quality and style.

channel.go Outdated Show resolved Hide resolved
channel.go Outdated Show resolved Hide resolved
channel.go Outdated Show resolved Hide resolved
channel.go Outdated Show resolved Hide resolved
channel.go Outdated Show resolved Hide resolved
node.go Outdated Show resolved Hide resolved
node.go Outdated Show resolved Hide resolved
node.go Outdated Show resolved Hide resolved
node.go Outdated Show resolved Hide resolved
node.go Outdated Show resolved Hide resolved
@meling meling requested a review from johningve March 19, 2024 22:04
aleksander-vedvik and others added 7 commits March 20, 2024 00:01
The only code change is the move of "if conn == nil" from connect()
to tryConnect(). It seems more logical to have this "guard check"
at the top of a method, rather than right after starting the
sendMsgs goroutine. There should be no semantic differences here,
but who knows... this code is a bit intricate.

Also adds some doc comment above some (internal) functions, mainly
for documentation purposes and some language polish and typos.
Added a guardclause to prevent goroutines being started more than once.

Also, the channel will only lock when trying to recreate a
nodestream, making it possible to retry when sending messages.
@aleksander-vedvik
Copy link
Collaborator Author

aleksander-vedvik commented Mar 21, 2024

Have added a few tests and more documentation. There are now two ways in which the channel will try to reconnect to a node:

  1. If a connection has never been established: It will try to reconnect when sending a message. (see c.sendMsgs)
  2. If a connection has been successfully established: It will in addition to point 1 try to reconnect in the background by using a back off strategy. (see c.recvMsgs)

Had to add adjust when locking happened in c.reconnect() since sendMsgs and recvMsgs are running in two different goroutines. Now, it will only lock when creating the NodeStream.

Regarding the tests:
The last test is supposed to represent a server first being offline, then online, and then offline again. To do this, it is necessary to use unexported methods on the channel, meaning it is not possible to use "black box testing". As a result, I created some custom proto types in tests/mock which is not dependent on gorums (to prevent cyclic imports). However, I suspect that gorums uses a custom encoding for the proto messages (placed in the init() function in each generated gorums proto file), causing an error when unmarshalling a non-gorums proto message. I might be wrong in this assumption, but for now none of the test are actually checking the response at the server-side because it fails to unmarshal the message.

EDIT: The error is coming from encoding.go:113

channel.go Show resolved Hide resolved
channel.go Outdated Show resolved Hide resolved
backoffCfg := c.backoffCfg

var retries float64
for {
var err error

c.streamMut.Lock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some overlap in logic between this code and the tryConnect method. I wonder if we could make this into a separate method? It is not super obvious though. Still thinking about it.

channel_test.go Outdated Show resolved Hide resolved
channel_test.go Outdated Show resolved Hide resolved
testing_gorums.go Outdated Show resolved Hide resolved
@@ -0,0 +1,10 @@
syntax = "proto3";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider reusing tests/dummy/dummy.proto instead; feel free to add message types to that file. Alternatively, you may rename it to mock if that is more appropriate. You should add/update the tests/Makefile to compile the proto file if mock is the name we go with.

meling added a commit that referenced this pull request Mar 24, 2024
This will conflict with PR #179, but should be easy to resolve.
We can avoid passing the context to channel, since we anyway have
access to the relevant RawNode.
Since we now cancel pending messages, we have introduced a data
race that doesn't appear to have been triggered before.
@meling meling linked an issue Mar 26, 2024 that may be closed by this pull request
To avoid having to support another test function, let's make it
unexported for now. Currently, we only use it in one place; maybe
we can harmonize it with some other helper.
It does not appear to be needed.
@meling meling merged commit 8dd87ab into master Mar 26, 2024
3 checks passed
@meling meling deleted the connection branch March 26, 2024 19:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants