Summary
Draft PR to discuss the possible Variant implementations. By implementing Variant, we can re-use a large portion of the logic for Dynamic, and then JSON. Partially resolves #1430.
Implementation
This implementation adds 3 major types to the module:
ColVariant - the column implementation for (de)serialization
Variant - a container to hold variant values (optional for (de)serialization)
VariantWithType - an extension of Variant, with the ability to provide a preferred type in cases where it is ambiguous to existing column type detection (such as Array(UInt8) vs String)
ColVariant
Serialization
When values are appended via col.AppendRow(), the type of the input v interface{} is checked. If it is nil, a Null discriminator is appended. If it is a VariantWithType, then the specified column type will be appended along with its matching discriminator. The underlying column's AppendRow function is re-used so that we don't need to re-implement its logic.
As a catch-all, the input value will be tested against each column type until one succeeds. For example, Variant(Bool, Int64, String) will try to append as bool, int64, then string. If a value does not fit into any column type, an error is returned.
Sometimes types will conflict. Because the types within the variant are sorted alphabetically, Array(UInt8) would be used before String, since Array allows for string input. I have researched different solutions to this, including a type priority system, but it would be complex to implement. For now it is easiest to let the user simply input NewVariantWithType(int64(42), "Int64") or NewVariant(int64(42)).WithType("Int64") if they want a specific type within the variant. For complex types like maps, reflection will be used if a type isn't specified.
After all rows are appended, the Native format is used to serialize the data into the buffer: first the serializationVersion, then the uint8 array of discriminators, and then each column's Encode function is re-used as usual (similar to Tuple).
Deserialization
The Native format deserializes the discriminators and builds a set of offsets for each column. This allows for storing multiple columns with mixed lengths. When the user wants to read a row, we can index into the correct row of each column to get the corresponding type.
In practice this looks like this:
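As a rough, self-contained model of the offset bookkeeping described above (not the driver's actual code), the sketch below derives a per-row offset from the discriminators and uses it to read a row back as an untyped value, much like scanning into a Variant. The names variantReader, newVariantReader and Row are hypothetical, and the NULL discriminator value of 255 is an assumption of this sketch.

```go
package main

import "fmt"

// nullDiscriminator marks a NULL row. The value 255 is an assumption of this sketch.
const nullDiscriminator uint8 = 255

// variantReader is a hypothetical, simplified stand-in for the column reader.
type variantReader struct {
	discriminators []uint8 // one entry per row: which sub-column holds the value
	offsets        []int   // one entry per row: index inside that sub-column
	columns        [][]any // decoded sub-columns, each with its own length
}

// newVariantReader walks the discriminators once and records, for every row,
// how many earlier rows were stored in the same sub-column.
func newVariantReader(discriminators []uint8, columns [][]any) *variantReader {
	next := make([]int, len(columns))
	offsets := make([]int, len(discriminators))
	for row, d := range discriminators {
		if d == nullDiscriminator {
			offsets[row] = -1 // NULL rows occupy no slot in any sub-column
			continue
		}
		offsets[row] = next[d]
		next[d]++
	}
	return &variantReader{
		discriminators: discriminators,
		offsets:        offsets,
		columns:        columns,
	}
}

// Row returns the value stored at the given row, or nil for a NULL row.
func (r *variantReader) Row(row int) any {
	d := r.discriminators[row]
	if d == nullDiscriminator {
		return nil
	}
	return r.columns[d][r.offsets[row]]
}

func main() {
	// Variant(Bool, Int64, String): discriminator 0 = Bool, 1 = Int64, 2 = String.
	columns := [][]any{
		{true},
		{int64(42), int64(7)},
		{"hello"},
	}
	r := newVariantReader([]uint8{1, 2, nullDiscriminator, 0, 1}, columns)
	for i := range r.discriminators {
		fmt.Printf("row %d: %v\n", i, r.Row(i))
	}
}
```

The sub-columns here have lengths 1, 2 and 1 while the variant itself has 5 rows, which is the mixed-length situation the offsets are meant to handle.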
Or, if you know your types ahead of time, you can also scan directly into a variable of that type:
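A minimal sketch of how a direct scan could delegate to the sub-column that holds the row. All types here (int64Column, stringColumn, variantColumn, rowScanner) are hypothetical stand-ins written for illustration, and the behavior on a type mismatch is an assumption of this sketch rather than something the PR specifies. NULL rows are left out to keep it short.

```go
package main

import "fmt"

// rowScanner is the only behavior this sketch needs from a sub-column.
type rowScanner interface {
	ScanRow(dest any, row int) error
}

// int64Column is a hypothetical sub-column holding Int64 values.
type int64Column struct{ values []int64 }

func (c *int64Column) ScanRow(dest any, row int) error {
	p, ok := dest.(*int64)
	if !ok {
		return fmt.Errorf("cannot scan Int64 into %T", dest)
	}
	*p = c.values[row]
	return nil
}

// stringColumn is a hypothetical sub-column holding String values.
type stringColumn struct{ values []string }

func (c *stringColumn) ScanRow(dest any, row int) error {
	p, ok := dest.(*string)
	if !ok {
		return fmt.Errorf("cannot scan String into %T", dest)
	}
	*p = c.values[row]
	return nil
}

// variantColumn forwards ScanRow to whichever sub-column the row's
// discriminator points at, using the row's offset inside that sub-column.
type variantColumn struct {
	discriminators []uint8
	offsets        []int
	columns        []rowScanner
}

func (v *variantColumn) ScanRow(dest any, row int) error {
	d := v.discriminators[row]
	return v.columns[d].ScanRow(dest, v.offsets[row])
}

func main() {
	col := &variantColumn{
		discriminators: []uint8{0, 1},
		offsets:        []int{0, 0},
		columns: []rowScanner{
			&int64Column{values: []int64{42}},
			&stringColumn{values: []string{"hello"}},
		},
	}

	var n int64
	fmt.Println(col.ScanRow(&n, 0), n) // row 0 holds an Int64, so this succeeds
	fmt.Println(col.ScanRow(&n, 1))    // row 1 holds a String, so this returns an error
}
```

In this model a destination of the wrong type only fails on the rows whose stored type differs, which illustrates why an untyped container is the safer default when a column mixes types.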
This pattern works by simply calling the underlying column's ScanRow function. It is safest to scan into Variant, however.
If you need to switch types on Variant for your own type detection, you can use variantRow.Any() or variantRow.Interface() to return any / interface{} respectively (provided both for preferred semantics).
Variant
Variant is simply a wrapper around any. It implements stdlib sql interfaces such as driver.Valuer and sql.Scanner. It also has convenience functions for primitives such as Int64. If you need to access the underlying value you can use Any(). This type can be constructed with the NewVariant(v) function.
The Variant type should be used in structs and when scanning from ColVariant. It can also be used for insertion, although VariantWithType may be required if there's overlap between types.
VariantWithType
VariantWithType is the same as Variant, but with a string included to specify the preferred type. You can use this for insertion when the Variant column has types that overlap. For example, if you had Variant(Array(UInt8), String), a Go string would be inserted as an Array(UInt8). If you wanted to force this to be a ClickHouse String, you could use NewVariantWithType(v, "String") to provide the preferred type. If the preferred type is not present in the Variant, the row will fail to append to the block. Types can be added on an existing Variant by calling exampleVariant.WithType(t string), which will return a new VariantWithType.
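To make the preferred-type behavior concrete, here is a small self-contained model of the append path for Variant(Array(UInt8), String). It is a sketch under stated assumptions, not the PR's code: variantWithType, subColumn, variantColumn and the accepts callbacks are hypothetical, and the NULL discriminator value of 255 is assumed.

```go
package main

import "fmt"

// nullDiscriminator marks a NULL row. The value 255 is an assumption of this sketch.
const nullDiscriminator uint8 = 255

// variantWithType pairs a value with the name of its preferred type.
type variantWithType struct {
	value         any
	preferredType string
}

// subColumn is a simplified sub-column: a type name, a check for which Go
// values it can hold, and the values appended so far.
type subColumn struct {
	typeName string
	accepts  func(v any) bool
	values   []any
}

// variantColumn keeps its sub-columns sorted alphabetically by type name.
type variantColumn struct {
	columns        []*subColumn
	discriminators []uint8
}

// appendTo stores v in sub-column i and records the matching discriminator.
func (c *variantColumn) appendTo(i int, v any) error {
	col := c.columns[i]
	if !col.accepts(v) {
		return fmt.Errorf("value of type %T does not fit %s", v, col.typeName)
	}
	col.values = append(col.values, v)
	c.discriminators = append(c.discriminators, uint8(i))
	return nil
}

// AppendRow handles nil, then an explicit preferred type, then the catch-all.
func (c *variantColumn) AppendRow(v any) error {
	if v == nil {
		c.discriminators = append(c.discriminators, nullDiscriminator)
		return nil
	}
	if vt, ok := v.(variantWithType); ok {
		for i, col := range c.columns {
			if col.typeName == vt.preferredType {
				return c.appendTo(i, vt.value)
			}
		}
		return fmt.Errorf("type %q is not present in the variant", vt.preferredType)
	}
	for i, col := range c.columns {
		if col.accepts(v) {
			return c.appendTo(i, v)
		}
	}
	return fmt.Errorf("value of type %T does not fit any variant type", v)
}

// newExampleColumn models Variant(Array(UInt8), String).
func newExampleColumn() *variantColumn {
	isString := func(v any) bool { _, ok := v.(string); return ok }
	isBytesOrString := func(v any) bool {
		if _, ok := v.([]uint8); ok {
			return true
		}
		return isString(v) // the array column also takes string input
	}
	return &variantColumn{columns: []*subColumn{
		{typeName: "Array(UInt8)", accepts: isBytesOrString},
		{typeName: "String", accepts: isString},
	}}
}

func main() {
	col := newExampleColumn()
	fmt.Println(col.AppendRow("hello"))                                   // lands in Array(UInt8)
	fmt.Println(col.AppendRow(variantWithType{"hello", "String"}))        // forced into String
	fmt.Println(col.AppendRow(nil))                                       // NULL row
	fmt.Println(col.AppendRow(variantWithType{int64(42), "Int64"}))       // error: type not present
	fmt.Println(col.discriminators)
}
```

The failed append leaves the discriminators untouched, so the column stays consistent with its sub-columns after an error.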