Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sending Cease/Hard Reset notification #2800

Merged
merged 2 commits into from
Apr 29, 2024

Conversation

YutaroHayakawa
Copy link
Contributor

@YutaroHayakawa YutaroHayakawa commented Apr 27, 2024

Copying the commit message here:

Send Cease/Hard Reset notification for certain scenario when graceful
restart + notification support (RFC8538) are enabled. In this
implementation, we follow the suggestion of RFC8538 and map following
notification subcodes to Hard Reset subcode.

1. BGP_ERROR_SUB_MAXIMUM_NUMBER_OF_PREFIXES_REACHED

In this case, GoBGP is in the resource shortage and not working
properly. Thus, the peer should stop forwarding packet immediately.

2. BGP_ERROR_SUB_ADMINISTRATIVE_SHUTDOWN

This happens when the user uses DisablePeer API. This clearly indicates
user's intention of shutting down the session. Thus, we should send Hard
Reset.

3. BGP_ERROR_SUB_PEER_DECONFIGURED

This happens when the user uses DeletePeer API or StopBgp API or there's
an ASN mismatch found in the Open phase. The former two cases, the user
shows the intention to shutdown the session, so we should Hard Reset.
The latter case is not so obvious, but I think it's ok to do Hard Reset
because it is an unrecoverable error that cannot be solved without
user's involvement.

4. BGP_ERROR_SUB_HARD_RESET

This case currently doesn't exist, but obviously we should send Hard
Reset when someone explicitly specifies it.

The behavior for the remaining subcodes are unchanged. We may want to
expose a knob to adjust the behavior of BGP_ERROR_SUB_ADMINISTRATIVE_RESET
as suggested by RFC8538, but for this initial implementation, we kept it
as is.

Our use case for this is letting users of Cilium BGP Control Plane hard reset the session when they service out the node for maintenance. In the current GoBGP implementation, there's no way to perform a hard reset, so users must wait to shut down the node (shutdown the data plane on the node, in other words) until the graceful restart timer expires. Otherwise, it causes a traffic disruption.

One concern is that this is a breaking change for people who rely on any modified notification message to trigger a graceful restart procedure. I'm ok with hiding this feature behind a flag if it is harmful, but I guess the impact is limited because the document suggests sending SIGINT or SIGKILL to trigger graceful restart procedure. In those cases, GoBGP shutdown the session without any notification, so the procedure will be triggered anyways.

@YutaroHayakawa YutaroHayakawa marked this pull request as ready for review April 27, 2024 03:30
Send Cease/Hard Reset notification for certain scenario when graceful
restart + notification support (RFC8538) are enabled. In this
implementation, we follow the suggestion of RFC8538 and map following
notification subcodes to Hard Reset subcode.

1. BGP_ERROR_SUB_MAXIMUM_NUMBER_OF_PREFIXES_REACHED

In this case, GoBGP is in the resource shortage and not working
properly. Thus, the peer should stop forwarding packet immediately.

2. BGP_ERROR_SUB_ADMINISTRATIVE_SHUTDOWN

This happens when the user uses DisablePeer API. This clearly indicates
user's intention of shutting down the session. Thus, we should send Hard
Reset.

3. BGP_ERROR_SUB_PEER_DECONFIGURED

This happens when the user uses DeletePeer API or StopBgp API or there's
an ASN mismatch found in the Open phase. The former two cases, the user
shows the intention to shutdown the session, so we should Hard Reset.
The latter case is not so obvious, but I think it's ok to do Hard Reset
because it is an unrecoverable error that cannot be solved without
user's involvement.

4. BGP_ERROR_SUB_HARD_RESET

This case currently doesn't exist, but obviously we should send Hard
Reset when someone explicitly specifies it.

The behavior for the remaining subcodes are unchanged. We may want to
expose a knob to adjust the behavior of BGP_ERROR_SUB_ADMINISTRATIVE_RESET
as suggested by RFC8538, but for this initial implementation, we kept it
as is.

Signed-off-by: Yutaro Hayakawa <[email protected]>
draft-ietf-idr-bgp-gr-notification is promoted to RFC8538.

Signed-off-by: Yutaro Hayakawa <[email protected]>
@fujita fujita merged commit d0bf813 into osrg:master Apr 29, 2024
39 checks passed
@fujita
Copy link
Member

fujita commented Apr 29, 2024

Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants