Fix bug in handling connection close #219

Closed
Tracked by #342 ...
mostafa opened this issue Mar 24, 2023 · 1 comment
Labels: bug, triage


mostafa commented Mar 24, 2023

Some errors are benign and should not force-close the client connection, but they currently do. For example, when there is a long-running query and the client (psql) sends a query cancellation request, the connection gets closed, leaving psql hanging.

gatewayd/network/proxy.go

Lines 353 to 373 in 2c79962

// The connection to the server is closed, so we MUST reconnect,
// otherwise the client will be stuck.
// TODO: Fix bug in handling connection close
// See: https://github.com/gatewayd-io/gatewayd/issues/219
if IsConnClosed(received, err) || IsConnTimedOut(err) {
    pr.logger.Debug().Fields(
        map[string]interface{}{
            "function": "proxy.passthrough",
            "local":    client.LocalAddr(),
            "remote":   client.RemoteAddr(),
        }).Msg("Client disconnected")
    client.Close()
    client = NewClient(pr.ctx, pr.ClientConfig, pr.logger)
    pr.busyConnections.Remove(gconn)
    if err := pr.busyConnections.Put(gconn, client); err != nil {
        span.RecordError(err)
        // This should never happen
        return err
    }
}
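
The excerpt above reacts to a closed connection and to a read timeout in the same way. As a rough sketch of how the two conditions could be told apart using only the Go standard library, consider the following. It is not GatewayD's `IsConnClosed` or `IsConnTimedOut` implementation; `readOutcome`, `classifyRead`, `readWithDeadline`, and the two demo helpers are hypothetical names, and `net.Pipe` stands in for a real database connection.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// readOutcome is a hypothetical classification of one read attempt.
type readOutcome int

const (
	outcomeData     readOutcome = iota // bytes were received
	outcomeTimedOut                    // deadline expired; the peer may still be alive
	outcomeClosed                      // the connection was closed
	outcomeOther                       // anything else
)

// classifyRead maps the result of a Read call to a readOutcome.
func classifyRead(n int, err error) readOutcome {
	var netErr net.Error
	switch {
	case err == nil && n > 0:
		return outcomeData
	case errors.As(err, &netErr) && netErr.Timeout():
		return outcomeTimedOut
	case errors.Is(err, io.EOF),
		errors.Is(err, io.ErrClosedPipe),
		errors.Is(err, net.ErrClosed):
		return outcomeClosed
	default:
		return outcomeOther
	}
}

// readWithDeadline performs a single read that gives up after timeout.
func readWithDeadline(conn net.Conn, timeout time.Duration) readOutcome {
	if err := conn.SetReadDeadline(time.Now().Add(timeout)); err != nil {
		return classifyRead(0, err)
	}
	buf := make([]byte, 64)
	n, err := conn.Read(buf)
	return classifyRead(n, err)
}

// silentPeerOutcome simulates a peer that is alive but sends nothing,
// for example a server that is busy with a long-running query.
func silentPeerOutcome() readOutcome {
	local, remote := net.Pipe()
	defer local.Close()
	defer remote.Close()
	return readWithDeadline(local, 20*time.Millisecond)
}

// closedPeerOutcome simulates a peer that has closed its end.
func closedPeerOutcome() readOutcome {
	local, remote := net.Pipe()
	defer local.Close()
	remote.Close()
	return readWithDeadline(local, time.Second)
}

func main() {
	fmt.Println("silent peer timed out:", silentPeerOutcome() == outcomeTimedOut)
	fmt.Println("closed peer detected:", closedPeerOutcome() == outcomeClosed)
}
```

With a distinction like this, a caller could choose to keep the connection on a timeout and reconnect only when the peer is really gone. Whether that is appropriate for GatewayD is a design decision this sketch does not settle.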

gatewayd/network/proxy.go

Lines 375 to 388 in 2c79962

// If the response is empty, don't send anything, instead just close the ingress connection.
// TODO: Fix bug in handling connection close
// See: https://github.com/gatewayd-io/gatewayd/issues/219
if received == 0 {
    pr.logger.Debug().Fields(
        map[string]interface{}{
            "function": "proxy.passthrough",
            "local":    client.LocalAddr(),
            "remote":   client.RemoteAddr(),
        }).Msg("No data to send to client")
    span.AddEvent("No data to send to client")
    span.RecordError(err)
    return err
}

gatewayd/network/server.go

Lines 255 to 273 in 2c79962

// Pass the traffic from the client to server and vice versa.
// If there is an error, log it and close the connection.
if err := s.proxy.PassThrough(gconn); err != nil {
    s.logger.Trace().Err(err).Msg("Failed to pass through traffic")
    span.RecordError(err)
    switch {
    case errors.Is(err, gerr.ErrPoolExhausted),
        errors.Is(err, gerr.ErrCastFailed),
        errors.Is(err, gerr.ErrClientNotFound),
        errors.Is(err, gerr.ErrClientNotConnected),
        errors.Is(err, gerr.ErrClientSendFailed),
        errors.Is(err, gerr.ErrClientReceiveFailed),
        errors.Is(err, gerr.ErrHookTerminatedConnection),
        errors.Is(err.Unwrap(), io.EOF):
        // TODO: Fix bug in handling connection close
        // See: https://github.com/gatewayd-io/gatewayd/issues/219
        return gnet.Close
    }
}
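
To illustrate the idea of closing only on errors that are really fatal, here is a minimal sketch that keeps the fatal errors in one table and treats everything else as benign. The sentinel errors, the `action` type, and `actionFor` are hypothetical stand-ins, not the `gerr` package or the `gnet` actions used above. Note that `errors.Is` follows the `Unwrap` chain on its own, so wrapped errors match without an explicit `Unwrap` call.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Hypothetical sentinel errors standing in for the gerr.* values quoted above.
var (
	errPoolExhausted    = errors.New("pool exhausted")
	errClientSendFailed = errors.New("client send failed")
	errReadTimedOut     = errors.New("read timed out") // assumed to be benign
)

// action is a hypothetical stand-in for the event-loop action returned to the server.
type action int

const (
	actionNone  action = iota // keep the client connection open
	actionClose               // close the client connection
)

// fatalErrors lists the errors that should force-close the connection.
var fatalErrors = []error{errPoolExhausted, errClientSendFailed, io.EOF}

// actionFor returns actionClose only for errors that match the fatal list.
func actionFor(err error) action {
	if err == nil {
		return actionNone
	}
	for _, fatal := range fatalErrors {
		if errors.Is(err, fatal) {
			return actionClose
		}
	}
	return actionNone
}

// wrap adds context to an error while keeping it matchable by errors.Is.
func wrap(err error) error {
	return fmt.Errorf("passthrough: %w", err)
}

func main() {
	fmt.Println("pool exhausted closes:", actionFor(wrap(errPoolExhausted)) == actionClose)
	fmt.Println("EOF closes:", actionFor(wrap(io.EOF)) == actionClose)
	fmt.Println("timeout closes:", actionFor(wrap(errReadTimedOut)) == actionClose)
}
```

Keeping the list in one place makes it easier to review which errors end a session, which is the question this issue raises.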

See: #32

Investigations

I have taken a few stabs at this, but so far I have not managed to fix it completely. It is a logical bug: the client connection (GatewayD => database) gets stuck waiting to read a response, while the actual client (client => GatewayD) sends a cancel request. The cancel request won't get through until the response has been read from the client connection and sent back to the client, which is what opens up the path for further requests. This is the expected behavior in a synchronous system. Still, the cancel request MUST work, otherwise we can't cancel queries in progress.
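
For background on the cancel request itself: in version 3.0 of the PostgreSQL frontend/backend protocol, a CancelRequest is a 16-byte message that the frontend sends on a newly opened connection in place of a startup message. It consists of the length (16), the request code 80877102, the backend process ID, and the secret key, all as big-endian 32-bit integers. The sketch below shows how a proxy might recognize such a message. The type and function names are hypothetical, and this is not the fix that went into GatewayD.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	cancelRequestLen  = 16       // total message length, including the length field
	cancelRequestCode = 80877102 // 1234 in the high 16 bits, 5678 in the low 16 bits
)

// cancelRequest holds the two fields that identify the backend to cancel.
type cancelRequest struct {
	ProcessID uint32
	SecretKey uint32
}

// parseCancelRequest reports whether msg is a protocol 3.0 CancelRequest
// and, if so, returns its fields.
func parseCancelRequest(msg []byte) (cancelRequest, bool) {
	if len(msg) != cancelRequestLen {
		return cancelRequest{}, false
	}
	if binary.BigEndian.Uint32(msg[0:4]) != cancelRequestLen {
		return cancelRequest{}, false
	}
	if binary.BigEndian.Uint32(msg[4:8]) != cancelRequestCode {
		return cancelRequest{}, false
	}
	return cancelRequest{
		ProcessID: binary.BigEndian.Uint32(msg[8:12]),
		SecretKey: binary.BigEndian.Uint32(msg[12:16]),
	}, true
}

// buildCancelRequest encodes a CancelRequest; used here only to exercise the parser.
func buildCancelRequest(processID, secretKey uint32) []byte {
	msg := make([]byte, cancelRequestLen)
	binary.BigEndian.PutUint32(msg[0:4], cancelRequestLen)
	binary.BigEndian.PutUint32(msg[4:8], cancelRequestCode)
	binary.BigEndian.PutUint32(msg[8:12], processID)
	binary.BigEndian.PutUint32(msg[12:16], secretKey)
	return msg
}

func main() {
	req, ok := parseCancelRequest(buildCancelRequest(4242, 99))
	fmt.Println(ok, req.ProcessID, req.SecretKey)
}
```

A proxy that recognizes this message could handle it separately from regular query traffic instead of queueing it behind a pending response.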

mostafa converted this from a draft issue Mar 24, 2023
mostafa self-assigned this Mar 24, 2023
mostafa added the bug Something isn't working label Mar 24, 2023
mostafa added this to the v0.6.x milestone Mar 24, 2023
mostafa moved this from 🆕 New to 📋 Backlog in GatewayD Core Public Roadmap Apr 29, 2023
mostafa moved this from 📋 Backlog to 🏗 In progress in GatewayD Core Public Roadmap May 19, 2023
mostafa moved this from 🏗 In progress to 📋 Backlog in GatewayD Core Public Roadmap May 24, 2023
mostafa added the triage Triage based on the content label Jun 9, 2023
mostafa removed this from the v0.6.x milestone Jun 9, 2023
mostafa mentioned this issue Oct 7, 2023

mostafa commented Oct 18, 2023

Fixed in #344.

mostafa closed this as completed Oct 18, 2023
mostafa moved this from 📋 Backlog to 🎉 Done in GatewayD Core Public Roadmap Oct 18, 2023
mostafa added this to the v0.8.x milestone Oct 18, 2023