
ingress: Add dynamic-config-manager enhancement #1687

Open · wants to merge 1 commit into base: master
Conversation

Contributor

@Miciah Miciah commented Sep 26, 2024

No description provided.

* enhancements/ingress/dynamic-config-manager.md: New file.

openshift-ci bot commented Sep 26, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from miciah. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


openshift-ci bot commented Sep 26, 2024

@Miciah: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/markdownlint | 15a3db2 | link | true | /test markdownlint |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

override:

```shell
oc -n openshift-ingress-operator patch ingresscontrollers/default --type=merge --patch='{"spec":{"unsupportedConfigOverrides":{"dynamicConfigManager":"false"}}}'
```
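If the override later needs to be cleared (for example, before an upgrade), a minimal sketch of the inverse operation might look like the following; this assumes that simply nulling out `unsupportedConfigOverrides` is acceptable for the cluster in question:

```shell
# Hypothetical inverse of the override above: clear unsupportedConfigOverrides entirely.
# With a JSON merge patch, setting a field to null removes it.
oc -n openshift-ingress-operator patch ingresscontrollers/default --type=merge \
  --patch='{"spec":{"unsupportedConfigOverrides":null}}'
```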
Contributor

For upgrades, I assume you need to remove the unsupported override and re-enable it once the upgrade is complete. However, this raises the question: if DCM is re-enabled, does that imply the cluster may be broken from an ingress perspective until the update is complete and DCM can be disabled again? Is that going to be an acceptable stance for production clusters?

Contributor Author

For upgrades, I assume you need to remove the unsupported override and re-enable it once the upgrade is complete.

We should, but currently do not, block upgrades when using unsupportedConfigOverrides (reported as https://issues.redhat.com/browse/OCPBUGS-41366).

However, this raises the question: if DCM is re-enabled, does that imply the cluster may be broken from an ingress perspective until the update is complete and DCM can be disabled again? Is that going to be an acceptable stance for production clusters?

I guess you're saying that unsupportedConfigOverrides isn't a good escape hatch after all, right? You can probably force an upgrade using oc adm --force, but I understand that --force is a last resort.

I suppose we could tell people to turn off the featuregate. Is that a reasonable escape hatch?

Contributor

I suppose we could tell people to turn off the featuregate. Is that a reasonable escape hatch?

Yes, I would say so.
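For illustration only, turning the gate off would presumably go through the cluster FeatureGate resource; a hedged sketch, assuming the gate is named `DynamicConfigurationManager` and that switching the cluster to a custom feature set is acceptable (it is itself an unsupported, one-way operation):

```shell
# Hypothetical sketch only: the gate name is an assumption, and changing
# spec.featureSet is irreversible and unsupported on production clusters.
oc patch featuregate/cluster --type=merge \
  --patch='{"spec":{"featureSet":"CustomNoUpgrade","customNoUpgrade":{"disabled":["DynamicConfigurationManager"]}}}'
```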

Contributor

candita commented Oct 9, 2024

/assign @alebedev87
/assign

@openshift-bot

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci bot added the lifecycle/stale label on Nov 7, 2024
Contributor

candita commented Nov 8, 2024

/remove-lifecycle stale

@openshift-ci bot removed the lifecycle/stale label on Nov 8, 2024
Comment on lines +28 to +35
OpenShift 4.18 enables Dynamic Config Manager with 1 pre-allocated server per
backend, without blueprint routes, and without any configuration options. The
goal is to deliver a minimum viable product on which to iterate. This MVP
provides marginal value by avoiding reloading HAProxy for a single scale-out
event or subsequent scale-in event for a route, at minimal development and
operational cost. More importantly, the MVP gives us CI signal, enables us to
work out defects in DCM, and gives us a starting point from which to enhance DCM
in subsequent OpenShift releases. In the future, we intend to extend DCM with
Contributor

Says the same thing, just reworded for clarity.

Suggested change
OpenShift 4.18 enables Dynamic Config Manager with 1 pre-allocated server per
backend, without blueprint routes, and without any configuration options. The
goal is to deliver a minimum viable product on which to iterate. This MVP
provides marginal value by avoiding reloading HAProxy for a single scale-out
event or subsequent scale-in event for a route, at minimal development and
operational cost. More importantly, the MVP gives us CI signal, enables us to
work out defects in DCM, and gives us a starting point from which to enhance DCM
in subsequent OpenShift releases. In the future, we intend to extend DCM with
OpenShift 4.18 delivers a Tech Preview version of Dynamic Config Manager as a minimum viable product that enables dynamic configuration without repeated HAProxy reloads. It will be offered with 1 pre-allocated server per backend, without blueprint routes, and without any configuration options. This MVP
provides marginal value by avoiding reloading HAProxy for a single scale-out
event or subsequent scale-in event for a route, at minimal development and
operational cost. More importantly, the MVP gives us CI signal, enables us to
work out defects in DCM, and gives us a starting point from which to enhance DCM
in subsequent OpenShift releases. In the future, we intend to extend DCM with

Contributor Author

To be clear, this isn't a special version of DCM; it's a specific configuration of DCM. However, I can add a mention that we are shipping this as Tech Preview in 4.18.

HAProxy process through a Unix domain socket. This means that no forking is
necessary to update configuration and effectuate the changes.

However, DCM requires some work before it can be enabled. DCM was first
Contributor

Suggested change
However, DCM requires some work before it can be enabled. DCM was first
DCM requires substantial work to develop and verify it in order to make it
viable. DCM was first

Contributor Author

We had to fix several defects in DCM itself before we were comfortable shipping it even in this MVP configuration, and even as a Tech Preview feature. I don't know whether it is considered "viable" at this point, or whether we would consider it "viable" if further testing reveals no new defects. How about this?

Suggested change
However, DCM requires some work before it can be enabled. DCM was first
However, DCM requires development work to fix several known issues as well as extensive testing in order to ensure that enabling it does not introduce regressions. DCM was first

Comment on lines +69 to +71
does not take full advantage of the capabilities of newer HAProxy versions. In
sum, DCM requires substantial work to develop and verify it in order to make it
viable.
Contributor

You can remove the last sentence if you change the first sentence.

Suggested change
does not take full advantage of the capabilities of newer HAProxy versions. In
sum, DCM requires substantial work to develop and verify it in order to make it
viable.
does not take full advantage of the capabilities of newer HAProxy versions.

Contributor Author

I was trying to follow the usual topic sentence, body, conclusion. Maybe it's a little too College English 101? I can tighten the paragraph up if it's too wordy.

Comment on lines +98 to +99
The degree to which DCM mitigates this issue is dependent on the nature of the
configuration changes: Some changes still require a configuration reload, but
Contributor

Suggested change
The degree to which DCM mitigates this issue is dependent on the nature of the
configuration changes: Some changes still require a configuration reload, but
The degree to which DCM mitigates this issue is dependent on the nature of the
configuration changes. Some changes will still require a configuration reload, but

Comment on lines +126 to +127
_As a project administrator, I want HAProxy to balance traffic evenly over old
and new pods when I scale my application up._
Contributor

I ❤️ the explanations in this section.

Is this also true?

Suggested change
_As a project administrator, I want HAProxy to balance traffic evenly over old
and new pods when I scale my application up._
_As a project administrator, I want HAProxy to balance traffic correctly over old
and new pods when I scale my application up or down._

Contributor Author

@Miciah Miciah Nov 22, 2024

Yeah, I would say it's still true. I prefer "evenly" because it is more evocative of the problem we are trying to avoid: an excessive share of the traffic going to the first server in the backend. However, I can write "correctly" if that is clearer. Adding "or down" does make sense. (I should probably make it "out or in" since we are talking about adding or removing pod replicas, not increasing or decreasing the resources available to each pod.)


Steps 5 and 8 are stretch goals and may be done in later OpenShift releases.
The other steps are hard requirements, and we intend to complete them in
OpenShift 4.18.
Contributor

Maybe add the GA step after this paragraph?

enhance DCM by using newer capabilities in HAProxy 2.y to add arbitrarily many
servers dynamically, add backends dynamically, and update certificates
dynamically.

Contributor

Isn't there anything DCM-related that we could add via the haproxy config template that might require an openshift/API change? Getting an API framework in place might be useful.

Are all the Unix socket commands listed in https://docs.haproxy.org/2.8/management.html#9.3 considered to be dynamic config options? So, not only can we add/del servers, but we could also add/del ACLs, maps, etc., and disable frontends, servers, and health checks, and set maxconns/rate-limits, etc.

Contributor Author

Isn't there anything DCM-related that we could add via the haproxy config template that might require an openshift/API change? Getting an API framework in place might be useful.

Ideally not. We want DCM to be as transparent and dynamic as possible, requiring no ahead-of-time configuration.

Are all the Unix socket commands listed in https://docs.haproxy.org/2.8/management.html#9.3 considered to be dynamic config options? So, not only can we add/del servers, but we could also add/del ACLs, maps, etc., and disable frontends, servers, and health checks, and set maxconns/rate-limits, etc.

Right, any configuration that the router performs using the Unix domain socket is dynamic configuration. We might indeed be able to use additional capabilities to avoid reloads in more cases.

Of course, we do need to do some cost-benefit analysis. It may be worth the engineering effort to implement dynamic configuration of, say, certificates, but not worth the effort to implement dynamic configuration of rate limiting, depending on how much effort is required and how often people actually update that particular kind of configuration.
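For a concrete sense of what those Unix socket commands look like in practice, here is a minimal sketch using socat; the socket path and the backend/frontend/server names are assumptions for illustration, not the router's actual configuration:

```shell
# Illustrative HAProxy Unix socket commands (names and socket path are assumptions).
SOCK=/var/lib/haproxy/run/haproxy.sock

echo "show servers state"                        | socat stdio "$SOCK"  # dump server slots per backend
echo "show backend"                              | socat stdio "$SOCK"  # list known backends
echo "disable server be_http:demo:app/server-1"  | socat stdio "$SOCK"  # take a server out without a reload
echo "set maxconn frontend public 20000"         | socat stdio "$SOCK"  # runtime tuning example
```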

https://gist.github.com/frobware/2b527ce3f040797909eff482a4776e0b for an
analysis of the potential memory impact for different choices of balancing
algorithm, number of threads, number of backends, and number of pre-allocated
server slots per backend.
Contributor

It would be nice to see a brief summary that shows how fast memory grows for a typical use case.

Contributor Author

Do you mean pulling in Andy's table, or do you mean gathering more data so we can plot some curves?

Contributor

Neither - I meant summarizing the data in the table. You could keep the reference to the data set.

### Graduation Criteria

We will introduce the feature as TechPreviewNoUpgrade in OpenShift 4.18, with
the goal of graduating it to GA in the same release. Further improvements to
Contributor

Is the "goal of graduating it to GA in the same release" still applicable given our 2025 priorities?

Contributor Author

Depends on how things go with the Tech Preview, and whether we discover any new major issues.


- OpenShift Enterprise 3.11 added DCM and the `ROUTER_HAPROXY_CONFIG_MANAGER` option ([release notes](https://docs.openshift.com/container-platform/3.11/release_notes/ocp_3_11_release_notes.html#ocp-311-haproxy-enhancements), [documentation](https://docs.openshift.com/container-platform/3.11/install_config/router/default_haproxy_router.html#using-the-dynamic-configuration-manager)).
- OpenShift Container Platform 4.9 added the `dynamicConfigManager` config override, default off ([openshift/cluster-ingress-operator@6a8516a](https://github.com/openshift/cluster-ingress-operator/pull/628/commits/6a8516ab247b00b87a5d7b32e20d4cffcefe1c0f)).
- OpenShift Container Platform 4.18 enables DCM by default.
Contributor

Not exactly, if we use a feature gate.

Contributor

@alebedev87 alebedev87 left a comment

LGTM, just some nit picks.

superseded-by:
---

# OpenShift Router Dynamic Config Manager
Contributor

I believe the full name of the feature is "Dynamic Configuration Manager".

Contributor Author

Fair enough. We use "dynamic config manager" in the source code and colloquially, but https://docs.openshift.com/container-platform/3.11/install_config/router/default_haproxy_router.html#using-the-dynamic-configuration-manager has "Dynamic Configuration Manager".

gigabytes per process, ultimately causing out-of-memory errors on the node host.

DCM addresses this issue by reducing the need to reload the configuration.
Instead, DCM configures HAProxy using its Unix domain control socket, which does
Contributor

The "control socket" term doesn't seem to relate to any HAProxy or router technology.

Suggested change
Instead, DCM configures HAProxy using its Unix domain control socket, which does
Instead, DCM configures HAProxy using its Runtime API, which does

Contributor Author

True, "control socket" is a term we use colloquially within the team, and it doesn't appear in the HAProxy documentation, which uses the term "stats socket". However, "stats socket" is a bit of a misnomer, I don't see "Runtime API" in https://docs.haproxy.org/2.8/management.html, and "Unix domain control socket" is descriptive of what the socket is, the capabilities of the socket, and our usage of it. Would it be all right by you if I continued using the term "control socket" but defined it up front in the EP?

Contributor

The Runtime API can be accessed via a UNIX socket or via IP:port. Ref: HAProxy Runtime API installation guide.

Contributor Author

It seems that "Runtime API" really means commands on the stats socket. However, "Runtime API" seems to be part of the nomenclature from the commercial HAProxy product documentation whereas the project does not use this nomenclature. Personally, I find the phrase "Runtime API" confusing and unnecessary.

Would you be all right with using the phrase "Unix Socket commands" from the HAProxy documentation, with a reference to the same?

Suggested change
Instead, DCM configures HAProxy using its Unix domain control socket, which does
Instead, DCM configures HAProxy using [Unix Socket commands](https://docs.haproxy.org/2.8/management.html#9.3), which does

1. Manually verify that the router functions and passes E2E tests with DCM enabled.
2. Add a tech-preview featuregate in openshift/api for DCM.
3. Update cluster-ingress-operator to enable DCM if the featuregate is enabled.
4. Re-enable old E2E tests in openshift/origin for DCM.
Contributor

Can this be done if DCM is in the tech-preview feature set?

Contributor Author

Yes. The openshift/origin tests create router pods directly, so the tests can enable DCM irrespective of the featuregate: https://github.com/openshift/origin/blob/203d2b124037a543d8719ba7e2a369f52f01686d/test/extended/router/config_manager.go#L328-L489
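As a rough illustration (namespace and deployment names are placeholders), a directly managed router can have DCM switched on by setting the env var that the template router has honored since 3.11; an operator-managed router would reconcile such a change away:

```shell
# Hypothetical example for a test-owned router deployment; "e2e-router" and
# "my-test-router" are placeholders, not real objects from this PR.
oc -n e2e-router set env deployment/my-test-router ROUTER_HAPROXY_CONFIG_MANAGER=true
```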


Dynamic Config Manager has some known defects:

- [OCPBUGS-7466 No prometheus haproxy metrics present for route created by dynamicConfigManager](https://issues.redhat.com/browse/OCPBUGS-7466)
Contributor

Is this still a risk considering the fact that we won't enable blueprint routes?

Contributor Author

Ah, right, OCPBUGS-7466 pertains to blueprint routes. We can note this as irrelevant for our supported configuration.

Dynamic Config Manager has some known defects:

- [OCPBUGS-7466 No prometheus haproxy metrics present for route created by dynamicConfigManager](https://issues.redhat.com/browse/OCPBUGS-7466)
- [OCPBUGSM-20868 Sticky cookies take long to start working if config manager is enabled](https://issues.redhat.com/browse/OCPBUGSM-20868)
Contributor

Interesting, essentially this is what I discovered too (fix PR).


#### Can we use HAProxy's control socket to add backends and certificates dynamically?

Answer: **?**.
Contributor

Backends - no. Certificates - seems like yes.
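For certificates specifically, the runtime commands that make dynamic updates look feasible are `set ssl cert` / `commit ssl cert`; a hedged sketch, with the certificate path, socket path, and file names as assumptions:

```shell
# Hypothetical sketch of updating an already-loaded certificate over the socket (HAProxy 2.2+).
# Paths and names are assumptions for illustration only.
SOCK=/var/lib/haproxy/run/haproxy.sock
CERT=/etc/haproxy/certs/example.pem

# Stage the new PEM payload for the existing certificate entry, then commit it.
printf 'set ssl cert %s <<\n%s\n\n' "$CERT" "$(cat new-example.pem)" | socat stdio "$SOCK"
echo "commit ssl cert $CERT" | socat stdio "$SOCK"
```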

Contributor Author

Yeah, it does seem that dynamic backends are a long-discussed but never-implemented possibility. Sadface.

Comment on lines +310 to +312
or for scale-out following scale-in. However, we found that DCM still requires
a reload for scale-in of a server that doesn't use a pre-allocated server slot.
(TODO: This needs to be double-checked.)
Contributor

@alebedev87 alebedev87 Nov 15, 2024

I did see stopped servers that were not pre-allocated. So, scale-ins should still be dynamic.

Contributor

@alebedev87 alebedev87 Nov 18, 2024

I tried to scale a deployment from 4 replicas (3 static slots UP + 1 dynamic slot UP) to 1. Here is the output of show servers state:

```
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord srv_use_ssl srv_check_port srv_check_addr srv_agent_addr srv_agent_port
17 be_http:test:httpd-24 1 pod:httpd-24-8554dcd85d-rrsrl:httpd-24:8080-tcp:10.128.2.15:8080 10.128.2.15 0 1 1 1 156 6 3 0 14 0 0 0 - 8080 - 0 0 - - 0
17 be_http:test:httpd-24 2 pod:httpd-24-8554dcd85d-nvnl5:httpd-24:8080-tcp:10.129.2.22:8080 10.129.2.22 0 1 1 1 9 6 3 0 14 0 0 0 - 8080 - 0 0 - - 0
17 be_http:test:httpd-24 3 pod:httpd-24-8554dcd85d-m2hbj:httpd-24:8080-tcp:10.131.0.16:8080 10.131.0.16 2 0 1 1 490 6 3 4 6 0 0 0 - 8080 - 0 0 - - 0
17 be_http:test:httpd-24 4 _dynamic-pod-1 10.129.2.23 0 5 1 1 234 6 3 0 14 0 0 0 - 8080 - 0 0 - - 0
```


```
1 pod:httpd-24-8554dcd85d-rrsrl:httpd-24:8080-tcp:10.128.2.15:8080 10.128.2.15	STOP	MAINT	1	CHECK CONFENABLEPAUSED
2 pod:httpd-24-8554dcd85d-nvnl5:httpd-24:8080-tcp:10.129.2.22:8080 10.129.2.22	STOP	MAINT	1	CHECK CONFENABLEPAUSED
3 pod:httpd-24-8554dcd85d-m2hbj:httpd-24:8080-tcp:10.131.0.16:8080 10.131.0.16	UP	UP	1	CHECK CONFENABLED
4 _dynamic-pod-1 10.129.2.23	STOP	MAINT	1	CHECK CONFENABLEPAUSED
```

*At the top is the raw output; at the bottom is a simplified, human-readable view of the raw data.

Same behavior if 4 replicas are enabled via 4 static slots (the dynamic slot is not used).

Contributor

Btw, even route deletion doesn't lead to a reload if DCM is enabled. All of the servers of the deleted backend will be disabled, but the backend will stay in the `show backend` output.

Contributor

This raises the question of whether we need to log additional operational debug information somewhere. With DCM enabled, the haproxy.config becomes stale, so relying solely on its contents during debugging could lead to misleading conclusions.

Contributor

Right. I had a user story for this but didn't have enough time to implement something for this iteration.

Contributor Author

I tried to scale a deployment from 4 replicas (3 pre-allocated slots UP + 1 dynamic slot UP) to 1. Here is the output of show servers state:

I'm confused. I've been saying "pre-allocated" to describe the slots that we pre-allocate in haproxy.config for DCM to use to add new server endpoints to a backend. What do you mean when you say "3 pre-allocated slots"? Are these 3 pre-allocated slots for DCM, or are these 3 server lines for endpoints that are defined in haproxy.config?

Are you saying that I can create a route with 3 endpoints (which get written to haproxy.config), and then scale in to 1 endpoint, and no reload is required?

If I have a route with 1 pod endpoint (written to haproxy.config), delete the pod, then recreate the pod, is a reload required?

Btw even the route deletion doesn't lead to a reload if DCM is enabled. All of the servers of the deleted backend will be disabled but the backend will stay in show backend output.

This is excellent news!

Can you reproduce these results with non-TLS, TLS passthrough, edge-terminated, and reencrypt routes? Since there are some nuances (server versus server-template), I'm wary of subtle differences in behavior for different route types.

Contributor

I'm confused. I've been saying "pre-allocated" to describe the slots that we pre-allocate in haproxy.config for DCM to use to add new server endpoints to a backend. What do you mean when you say "3 pre-allocated slots"? Are these 3 pre-allocated slots for DCM, or are these 3 server lines for endpoints that are defined in haproxy.config?

Sorry, my bad. I should have called them "static" slots. The pre-allocated ones are the dynamic slots.

Are you saying that I can create a route with 3 endpoints (which get written to haproxy.config), and then scale in to 1 endpoint, and no reload is required?

Right, if DCM is enabled.

If I have a route with 1 pod endpoint (written to haproxy.config), delete the pod, then recreate the pod, is a reload required?

No. The deletion of the pod will set the corresponding server to maintenance mode. The recreation of a pod will use the dynamic (pre-allocated) server slot to set the new IP.
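In socket terms, a hedged sketch of the two transitions being described (the backend, server, and address values are taken loosely from the example output above and are illustrative only):

```shell
# Hypothetical equivalents of what DCM issues over the socket; names and socket path are illustrative.
SOCK=/var/lib/haproxy/run/haproxy.sock

# Pod deleted: park its server slot in maintenance mode instead of reloading.
echo "set server be_http:test:httpd-24/pod:httpd-24-8554dcd85d-rrsrl:httpd-24:8080-tcp:10.128.2.15:8080 state maint" | socat stdio "$SOCK"

# Pod recreated: point a pre-allocated dynamic slot at the new endpoint and enable it.
echo "set server be_http:test:httpd-24/_dynamic-pod-1 addr 10.129.2.23 port 8080" | socat stdio "$SOCK"
echo "set server be_http:test:httpd-24/_dynamic-pod-1 state ready" | socat stdio "$SOCK"
```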
