Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

operator v1: add reproducer for drift detection with null #320

Closed
wants to merge 2 commits into from

Conversation

birdayz
Copy link
Contributor

@birdayz birdayz commented Nov 20, 2024

this is a reproducer that shows "null" is treated as a diff from null.

@@ -17,6 +17,28 @@ import (
"github.com/stretchr/testify/require"
)

func TestDiffWithNull(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test is expected to fail, currently it's a diff,keeping operator spinning forever.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall succeed now with the fix commit

@birdayz birdayz marked this pull request as ready for review November 20, 2024 13:21
@@ -145,6 +145,14 @@ func PropertiesEqual(
l logr.Logger, v1, v2 interface{}, metadata rpadmin.ConfigPropertyMetadata,
) bool {
log := l.WithName("PropertiesEqual")

// "null" (string) is the only way to define null as the desired value.
// Therefore, treat `"null"` as equal to `null` (only if `null`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rephrase? The end of the sense is miss leading

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apparently i did not save the file. updated the comment

@birdayz birdayz force-pushed the jb/cluster-config-drift-null branch from 09ecb1c to c2188e6 Compare November 20, 2024 13:39
@andrewstucki
Copy link
Contributor

So, QQ about this -- it seems to me like this is potentially working around the fact that we're populating our map[string]any from a map[string]string defined here:

AdditionalConfiguration map[string]string `json:"additionalConfiguration,omitempty"`

I'm wondering if it'd make more sense to just relax the type some and use some sort of schemaless field that we can unmarshal directly to a map[string]any? You can tell me "no" if it's too much work, but it'd be nice to see us not use "magic" strings to mean something other than a string.

@chrisseto
Copy link
Contributor

So there's another case were this will fail. I'm not entirely certain of how it happens but I've seen a cloud cluster have a numeric value specified in scientific notation (1.23e+5)

Is there a reason we can't call json/yaml.Unmarshal on each value and load them into a map[string]any? If we wanted to get really tricky, we could iterate over the admin schema and and unmarshal into the appropriate variable.

@birdayz
Copy link
Contributor Author

birdayz commented Nov 20, 2024

string does not work well with json unmarshal.

example: i set:

key: value 

In yaml.

json unmarshal does not work, because value is not valid JSON. "value" would be valid JSON, strings must always be quoted. it's not directly compatible

ofc, we could work around it, but i fear breaking stuff

@chrisseto i have seen scientific notation as well somewhere. i do not recall for sure how it happens, but we don't want it (anymore).. i would be glad if we rejected that honestly.

@chrisseto
Copy link
Contributor

I'm wondering if it'd make more sense to just relax the type some and use some sort of schemaless field that we can unmarshal directly to a map[string]any? You can tell me "no" if it's too much work, but it'd be nice to see us not use "magic" strings to mean something other than a string.

The problem with schemaless fields is that it makes the CRD "non-structural" which causes terraform's kubernetes_manifest resource to fallback to delete and recreate on any updates. The V2 CRD suffers from this problem but we've apparently side stepped it in V1.

Long term, @RafalKorepta and I have discussed having separate CRDs for different use cases. We'd keep a structural schema to play nice with terraform and then an unstructured/schemaless one for GitOps setups. One would of the two would be a compatibility layer that just creates the other.

json unmarshal does not work, because value is not valid JSON. "value" would be valid JSON, strings must always be quoted. it's not directly compatible

Seems like yaml.Unmarshal would work in that case? Bare strings would be handled, scientific notation would be handled, nulls would be handled.

ofc, we could work around it, but i fear breaking stuff

If we don't have tests in this repo that catch these issues we'll always be afraid of breaking stuff. Only one way to find out ;)

Refactoring all of this will be a good candidate to tackle with the merger work. We can punt until then.

@@ -17,6 +17,28 @@ import (
"github.com/stretchr/testify/require"
)

func TestDiffWithNull(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind copying the setup of a similar test that we have in operator/cmd/syncclusterconfig? We stand up a redpanda test container to make sure that the config is actually accepted by the redpanda instance. Way faster than a full k8s setup with much more validation (and no need to write out the ConfigPropertyMetadata manually)

Copy link
Contributor Author

@birdayz birdayz Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep will add this test, i am interested in a few corner cases


// "null" (string) is the only way to define null as the desired value.
// Therefore, treat `"null"` as equal to `null` (only if `"null"` is on the k8s side)
if metadata.Nullable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this fixes the diffing but are we going to have bugs with actually setting null values? If not does that mean we have duplicate code attempting to massage string values into the correct format? 😬

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we cannot set null because our type in the CRD is string, so it can never be an actual nil. well, at least have null on the CRD side of things. but i suspect there's some magic in redpanda to turn "null" into null - this stuff actually worked, "null" turned into null in redpanda.

definitely needs more test for sure, i will add them hopefully tomorrow

this stops operator from detecting non-existing drifts over and over
again, if "null" is used to set `null`. Since
spec.additionalConfiguration is map[string]string, string "null" is the
only way to set `null.`
@birdayz birdayz force-pushed the jb/cluster-config-drift-null branch from c2188e6 to b8a39e9 Compare November 21, 2024 17:38
@RafalKorepta
Copy link
Contributor

Fixed in #324

@RafalKorepta RafalKorepta deleted the jb/cluster-config-drift-null branch December 2, 2024 13:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants