Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tracking: CRD versioning macro #507

Closed
25 tasks done
Tracked by #361
fhennig opened this issue Jan 29, 2024 · 13 comments
Closed
25 tasks done
Tracked by #361

tracking: CRD versioning macro #507

fhennig opened this issue Jan 29, 2024 · 13 comments
Assignees
Labels

Comments

@fhennig
Copy link
Contributor

fhennig commented Jan 29, 2024

This tracks the implementation of a first working version of the #[versioned] macro. Further improvements will be done as part of our normal development workflow as well as alongside #642.

General

Currently, all our CRDs are defined by Rust structs. This works really well for multiple reasons:

  • Structs can directly be used in the operator code
  • Can be serialized in YAML for for use in Helm or other deployments mechanisms
  • Deserialize directly from Kubernetes API calls

This strict type safety ensures we build code which will never crash during runtime due to invalid types. The future introduction of CRD versions warrants the same strict typing for different version of the CRD. Additionally, we want to be able to upgrade (and downgrade) between versions in a controlled and automated manner. Rust already provides traits for conversion of types: From<T>, TryFrom<T> and friends. A goal of this implementation should be support for this native mechanism.

Requirements

  • Common mechanisms for CRD versioning must live in a common repository
  • Different versions of the same struct without much overhead. This overhead includes:
    • The need to copy and paste the same struct for multiple versions
    • Manually keeping track of changes between versions and how to properly convert between them
    • Conversion should be a simple one method call, no need to manually assign fields from the old struct to the new one
  • Support for versioned field changes:
    • Add a new field
    • Rename a field
    • Deprecate a field
  • Field changes can be annotated with notes, hints, etc...
  • The derive macro must surface meaningful error messages
  • Handle recursive structs and enums

Vision

Initial Sketch

Adding the Derive Macro

The derive macro uses a simple and fitting name. A few examples are: Versioned, Version and Update. The macro additionally uses an attribute to customize the code generation. The name of that attribute will be the same as the derive macro name, just all lowercase letters.

// The derive macro is used just like any other derive macro like Debug,
// Deserialize or JsonSchema
#[derive(Debug, Versioned)]
pub struct MyCrd {
    foo: String,
    bar: usize,
}

// The macro can also be used on enums
#[derive(Debug, Versioned)]
pub enum MyEnum {
    Foo,
    Bar,
}

The macro provides a mechanism to provide traits which should be derived on the generated structs/enums. Attributes like #[serde] or #[schemars] will be passed through as well.

#[derive(Debug, Versioned)]
#[versioned(derives(Debug, PartialEq))]
#[serde(rename_all = "camelCase")]
pub struct MyCrd {
    foo: String,
}

Adding Versions

Each distinct version of a struct/enum needs to be declared via the accommodating attribute. The version field can be used as often as one would like. Each definition creates a new versioned version of the base struct/enum. Each version name must be unique, enforced by the macro and surfaced with easy to understand error message.

(Optional) Because we plan to publish this crate publicly for the Rust community, we don't mandate the use of Kubernetes versions as version names. Instead, the validation should be as generic as possible to allow usages outside of the context of Kubernetes. Initially we might want to only support SemVer and Kubernetes version formats. Each version must provide an implementation for PartialEq and PartialOrd.

(Optional) Users can customize the visibility of the generated structs.

#[derive(Debug, Versioned)]
// Default visibility is `pub`
#[versioned(
    version(name = "v1alpha1"),
    version(name = "v1beta1", visibility = "pub(crate)")
)]
pub struct MyCrd {
    foo: String,
    bar: usize,
}

#[versioned(
    version(name = "v1alpha1"),
    version(name = "v1beta1", visibility = "pub(crate)")
)]
#[derive(Debug, Versioned)]
pub enum MyEnum {
    Foo,
    Bar,
}

Each added version will generate a new struct or enum specific for that version. A basic example without any fields/variants (and semantics around them) could look like this:

pub mod v1alpha1 {
    pub struct MyCrd {
        // Fields will be discussed in sections below
    }
}

pub(crate) mod v1beta1 {
    pub struct MyCrd {
        // Fields will be discussed in sections below
    }
}

// ################################################

pub mod v1alpha1 {
    pub enum MyEnum {
        // Variants will be discussed in sections below
    }
}

pub(crate) mod v1beta1 {
    pub enum MyEnum {
        // Variants will be discussed in sections below
    }
}

The naming scheme uses Rust modules. Different version of the struct would then be separated by modules, like v1alpha1::MyCrd and v1beta1::MyCrd. Additionally, the macro provides stable modules, which for example always point to the latest version of the struct/enum.

pub mod v1alpha1 {
    pub struct MyCrd;
}

pub mod v1beta1 {
    pub struct MyCrd;
}

// The operator can always use latest::MyCrd
pub mod latest {
    pub use v1beta1::*;
}

Deprecation of Versions

CRDs can mark particular versions as deprecated. This can be achieved by providing the deprecated flag. Versioned structs will additionally include Rust's #[deprecated] attribute.

#[derive(Debug, Versioned)]
#[versioned(
    version(name = "v1alpha1", deprecated),
    version(name = "v1beta1")
)]
pub struct MyCrd {
    foo: String,
    bar: usize,
}

The resulting generated code would look similar to this:

pub mod v1alpha1 {
    #[deprecated]
    pub struct MyCrd {
        // Fields will be discussed in sections below
    }
}

pub mod v1beta1 {
    pub struct MyCrd {
        // Fields will be discussed in sections below
    }
}

Field Attributes

Each field can be attached with additional (but optional) attributes. These attributes indicate when the fields were added, renamed, deprecated and removed. Any field which doesn't have a attribute attached is presumed to exist since the earliest version (version v1alpha1 in the example below). This feature is one reason why versions need to implement PartialEq and PartialOrd. Any attribute field cannot be combined with any other field if the same version is used.

Add Fields

This attribute marks fields as added in a specific version. This field will not appear in generated structs of earlier versions.

#[derive(Debug, Versioned)]
#[versioned(
    version(name = "v1alpha1"),
    version(name = "v1beta1")
)]
pub struct MyCrd {
    #[versioned(added(since = "v1beta1"))]
    foo: String,
    bar: usize,
}

The resulting generated code would look similar to this:

pub mod v1alpha1 {
    pub struct MyCrd {
        bar: usize,
    }
}

pub mod v1beta1 {
    pub struct MyCrdV1Beta1 {
        foo: String,
        bar: usize,
    }
}

Adding required fields in CRDs is against the Kubernetes guidelines. Instead, CRD developers should use optional fields with a sane default value. There will be multiple pieces of Rust code to make this work:

  • Each field type needs to implement Default.
  • A custom default value can be provided using #[versioned(added(default = default_fn))].
    This value will be used when using the From<T> trait for conversion between versions.

The From<Old> for New implementation will look similar to this:

Without Custom Default Value
impl From<v1alpha1::MyCrd> for v1beta1::MyCrd {
    fn from(my_crd: v1alpha1::MyCrd) -> Self {
        Self {
            foo: Default::default(),
            bar: my_crd.bar,
        }
    }
}
With Custom Default Value
impl From<v1alpha1::MyCrd> for v1beta1::MyCrd {
    fn from(my_crd: v1alpha1::MyCrd) -> Self {
        Self {
            foo: default_fn(),
            bar: my_crd.bar,
        }
    }
}

Rename Fields

This attribute marks fields as renamed in a specific version. This field will use the original name up to the version it was renamed in. The From<T> trait implementation will move the value from the old field to the new field.

#[derive(Debug, Versioned)]
#[versioned(
    version(name = "v1alpha1"),
    version(name = "v1beta1")
)]
pub struct MyCrd {
    #[versioned(renamed(since = "v1beta1", to = "baz"))]
    foo: String,
    bar: usize,
}

The resulting generated code would look similar to this:

pub mod v1alpha1 {
    pub struct MyCrd {
        foo: String,
        bar: usize,
    }
}

pub mod v1beta1 {
    pub struct MyCrd {
        baz: String,
        bar: usize,
    }
}

The From<Old> for New implementation will look similar to this:

impl From<v1alpha1::MyCrd> for v1beta1::MyCrd {
    fn from(my_crd: v1alpha1::MyCrd) -> Self {
        Self {
            baz: my_crd.foo,
            bar: my_crd.bar,
        }
    }
}

Deprecate Fields

This attribute marks fields as deprecated in a specific version. Each deprecated field will be included in later versions but renamed to deprecated_<FIELD_NAME>. They will additionally include Rust's #[deprecated] attribute to indicate the deprecation in the Rust code. Conversions from earlier versions will move the value of the field to the deprecated field automatically.

#[derive(Debug, Versioned)]
#[versioned(
    version(name = "v1alpha1"),
    version(name = "v1beta1")
)]
pub struct MyCrd {
    #[versioned(deprecated(since = "v1beta1"))]
    foo: String,
    bar: usize,
}

The resulting generated code would look similar to this:

pub mod v1alpha1 {
    pub struct MyCrd {
        foo: String,
        bar: usize,
    }
}

pub mod v1beta1 {
    pub struct MyCrd {
        #[deprecated]
        deprecated_foo: String,
        bar: usize,
    }
}

The From<Old> for New implementation will look similar to this:

impl From<v1alpha1::MyCrd> for v1beta1::MyCrd {
    fn from(my_crd: v1alpha1::MyCrd) -> Self {
        #[allow(deprecated)]
        Self {
            deprecated_foo: my_crd.foo,
            bar: my_crd.bar,
        }
    }
}

Full Example

#[derive(Debug, Versioned)]
#[versioned(
    version(name = "v1alpha1"),
    version(name = "v1beta1")
)]
pub struct MyCrd {
    #[versioned(added(since = "v1beta1"))]
    foo: String,

    #[versioned(renamed(since = "v1beta1", to = "new_bar"))]
    bar: usize,

    #[versioned(deprecated(since = "v1beta1"))]
    baz: bool,
    gau: usize,
}

pub mod v1alpha1 {
    pub struct MyCrd {
        bar: usize,
        baz: bool,
        gau: usize,
    }
}

pub mod v1beta1 {
    pub struct MyCrd {
        foo: Option<String>,
        new_bar: usize,
        deprecated_baz: bool,
        gau: usize,
    }
}

impl From<v1alpha1::MyCrd> for v1beta1::MyCrd {
    fn from(my_crd: v1alpha1::MyCrd) -> Self {
        #[allow(deprecated)]
        Self {
            foo: Default::default(),
            new_bar: my_crd.bar,
            deprecated_baz: my_crd.baz,
            gau: my_crd.gau,
        }
    }
}

impl From<v1beta1::MyCrd> for v1alpha1::MyCrd {
    fn from(my_crd: v1beta1::MyCrd) -> Self {
        #[allow(deprecated)]
        Self {
            bar: my_crd.new_bar,
            baz: my_crd.deprecated_baz,
            gau: my_crd.gau,
        }
    }
}

Kubernetes API Version Crate

Initial Sketch

We want to parse raw Kubernetes (API) version strings into Rust structs. This enables us to more easily work with these versions. It enables sorting, equality checking and ordering. It additionally enables us to provide better error messages in case of invalid versioning.

// (<GROUP>/)<VERSION>
pub struct ApiVersion {
    pub group: Option<String>,
    pub version: Version,
}

// v<MAJOR>(<LEVEL>)
pub struct Version {
    pub major: u64,
    pub level: Option<Level>,
}

// beta<LEVEL_VERSION>/alpha<LEVEL_VERSION>
pub enum Level {
    Beta(u64),
    Alpha(u64),
}

A PoC implementation for this already exists.

This is implemented. See https://github.com/stackabletech/operator-rs/tree/main/crates/k8s-version

Open Questions / Ideas

  1. We may even want to limit the visibility of deprecated fields using pub(crate).
  2. We might want to make the conversion mechanism configurable for added fields via the builder pattern.
    This can be done by implementing the From<Old> for New manually instead of using the automatically derived one.
    Users can opt out of the auto-generation by providing the #[versioned(version(name = "v1beta1", skip("from")))]

References

Selection of Crates

Research

Tasks

Preview Give feedback

Implementation

Tasks

Preview Give feedback
  1. 11 of 11
    demo-needed
    Techassi
  2. 0 of 5
    type/feature-improvement
    Techassi
  3. 0 of 5
    Techassi
  4. 0 of 5
    Techassi
  5. type/feature-new
    Techassi
  6. type/feature-new
    Techassi
  7. 1 of 3
    type/internal-improvement
    Techassi
  8. 0 of 5
    type/feature-new
    Techassi
  9. 0 of 5
    NickLarsenNZ
  10. type/feature-new
    Techassi
  11. 0 of 7
    NickLarsenNZ
  12. type/feature-new
    Techassi
  13. 0 of 4
    type/feature-new
    Techassi
  14. 0 of 4
    type/feature-new
    Techassi

Test Rounds

2024-09-06

#507 (comment)

Fixes

Preview Give feedback
  1. type/feature-new
    Techassi
  2. Techassi
  3. 0 of 3
    Techassi

2024-09-18

Fixes

Preview Give feedback
  1. 0 of 5
    Techassi
  2. Techassi
  3. 0 of 5
    type/feature-new
    Techassi

Acceptance criteria:

  • We know how to version our CRDs in Rust
  • We know how to write our conversion code between these versions
  • We have a PoC implementation/spike for a conversion webhook
  • We have a full-featured derive macro to handle CRD versioning
  • We have a general-purpose crate which can be used by others in the open-source Rust ecosystem
@fhennig
Copy link
Contributor Author

fhennig commented Jan 30, 2024

related: this ticket (#504) tracks a bunch of breaking changes that we want to do when we have CRD versioning in place

@soenkeliebau
Copy link
Member

Love the idea expressed in the description!!!

Just one question, will there be a way to express transformation logic?
For example consider this scenario (not saying this specific one makes sense, but for arguments sake..):

v1 has a field timeout_seconds and v2 deprecates this field and introduces a new field timeout_milliseconds with a default value of timeout_seconds * 1000.
Can we somehow allow specifying a conversion function to be called in the derive macro (can be at a much later stage, just wanted to consider it).

@Techassi
Copy link
Member

Techassi commented Mar 15, 2024

Can we somehow allow specifying a conversion function to be called in the derive macro (can be at a much later stage, just wanted to consider it).

I have some ideas for that in mind indeed. I will add a few more details on that! But I agree, this will probably be added at a later point in time.

Also see the second point in the Questions / Ideas section.

v1 has a field timeout_seconds and v2 deprecates this field and introduces a new field timeout_milliseconds with a default value of timeout_seconds * 1000.

Ideally we want to use Duration which supports human-readable duration strings. When using this type, fields are just named timeout. The field name itself doesn't carry any information about time granularity.

@Techassi
Copy link
Member

I updated the ticket and added "Idea 2" which dives into the customization of the conversion process.

@soenkeliebau
Copy link
Member

Disclaimer: Just to write it down somewhere, no need to respond, we can discuss next week.

How do we want to handle actual code that is currently in the crd module, haven't gone and checked, but I think there is a not insignificant amount of fn's in the crd modules that is needed for the operator to do its work.

Will the operators internally always work with the latest version and all actual code is moved to that module any time a new version is added?

@adwk67
Copy link
Member

adwk67 commented Mar 15, 2024

This is a very thorough overview - great work! Two things occurred to me:

  • in view of

Adding required fields in CRDs is against the Kubernetes guidelines. Instead, CRD developers should use optional fields with a sane default value

will we plan to move new optional fields out into being non-optional in a subsequent update, so that foo: Option<String> can eventually become foo: String? (if it implements Default that shouldn't present a problem?)

  • how will we deal with versioning of fields in the operator framework? Will we use a similar approach in operator-rs?

@maltesander
Copy link
Member

How do we want to handle actual code that is currently in the crd module, haven't gone and checked, but I think there is a not insignificant amount of fn's in the crd modules that is needed for the operator to do its work.

Unfortunately yes, i think we have to traitify the structs and each version of the struct will implement the trait in its own module or something.

Will the operators internally always work with the latest version and all actual code is moved to that module any time a new version is added?

I think the operators will work with the latest version, just keeping the old stuff for crd generation i guess.

how will we deal with versioning of fields in the operator framework? Will we use a similar approach in operator-rs?

Yeah everything that appears in the CRD must be versioned somehow. Not a big deal for standalone CRDs like AuthClass with their own version etc. but e.g. a change in the Role struct will result in a CRD version bump of the implementing operators.

@soenkeliebau
Copy link
Member

I think the operators will work with the latest version, just keeping the old stuff for crd generation i guess.

If the operator always gets the latest version, couldn't we.in theory noop all operations in older versions, as they shouldn't ever be called?

@maltesander
Copy link
Member

If the operator always gets the latest version, couldn't we.in theory noop all operations in older versions, as they shouldn't ever be called?

Actually yeah, we will only need the "implementation" for the version (whatever it will be) the operator is requesting.
Then its purely kept for crd generation purposes.

@Techassi Techassi moved this to Refinement Acceptance: Waiting for in Stackable Engineering Mar 20, 2024
@stefanigel
Copy link
Member

Hi, I read a lot of details, HOW CRD Versioning can be implemented, but nod description WHAT value the feature delivers.
What are the requirements? Which issues can be solved? (Ok, it versions CRDs, but it seems to be more than adding a version number ...) Probably you all know this, but others may not.
How would you explain an interested user, what this feature does for her?

@NickLarsenNZ
Copy link
Member

NickLarsenNZ commented May 23, 2024

Just regarding the modules being the "version", and containing the originally named structs inside it, eg:

pub mod v1alpha1 {
    pub struct MyCrd {
        bar: usize,
        baz: bool,
        gau: usize,
    }
}

I think that should be inverted, like:

pub mod my_crd {
    pub struct V1Alpha1 {
        bar: usize,
        baz: bool,
        gau: usize,
    }
}

I can't think of good wording, but here it is raw:

  • A v1alpha1 of a Foo has no relation to a v1alpha1 of a Bar.
  • A container Foo has a set of versions. A version does not have a set of containers (eg: Foo and Bar).

Related comment: stackabletech/operator-rs#793 (comment)

@Techassi
Copy link
Member

Techassi commented Sep 6, 2024

Results of a first (and quick) test round:

  • Version modules don't have access to imported (at the file root) items.
  • There are two version: v1 and v2. If a field changes the type in v2 and there is no action in v1, the inferred type in v1 is wrong.
  • The #[versioned()] macro needs to appear before other attributes / macros. Mention this in the docs.

Fixes for these issues are tracked in the tasklists above.

@Techassi
Copy link
Member

Closing this tracking issue as the initial implementation is finished. Tracking of CRD versioning moved to #642.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

No branches or pull requests

7 participants