Auto-resolve schema by confluent magic bytes #302

Yerachmiel-Feltzman · 2022-06-19T08:51:42Z

Hi,

Is there a way to automatically resolve the schema from the header of each message, i.e. the leading magic byte followed by the schema ID for that message?

As we all know, Confluent Avro prepends the schema ID to the message, so each message carries its own schema ID.
ABRiS adds this header (the magic byte and the schema ID) when encoding data, in order to support the Confluent format.
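
For reference, the Confluent wire format is one magic byte (always 0), then a 4-byte big-endian schema ID, then the Avro binary payload. Below is a minimal sketch of reading that header with only the Python standard library. The function name `parse_confluent_header` is made up for illustration and is not part of ABRiS.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # the Confluent wire format starts with a zero byte


def parse_confluent_header(message: bytes) -> Tuple[int, bytes]:
    """Split a Confluent-framed message into (schema_id, avro_payload)."""
    if len(message) < 5:
        raise ValueError("message too short for a Confluent header")
    # ">bI" = big-endian, 1 signed byte (magic) + 4-byte unsigned int (schema ID)
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[5:]


# Hypothetical example: schema ID 42 followed by two payload bytes
framed = b"\x00" + (42).to_bytes(4, "big") + b"\x02\x06"
print(parse_confluent_header(framed))
```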

However, when decoding Confluent-encoded messages, we must manually pass the schema configuration beforehand, making it hard to support cases where we can receive messages with different schemas (think of record strategy or even schema evolution).

I could implement a solution that manually parses the header of each message and dynamically constructs the schema configuration for each message (or for each group of messages, to avoid creating millions of config objects).
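
Roughly, the idea would look like the sketch below: bucket the raw messages by schema ID and build one configuration per ID rather than per message. This is only an illustration in plain Python. `config_for` is a hypothetical stand-in for whatever builds the real decoder configuration, and `frame` is a helper used only to make the example runnable.

```python
from collections import defaultdict
from functools import lru_cache
from typing import Dict, Iterable, List


def schema_id_of(message: bytes) -> int:
    """Read the 4-byte big-endian schema ID that follows the magic byte."""
    if len(message) < 5 or message[0] != 0:
        raise ValueError("not a Confluent-framed message")
    return int.from_bytes(message[1:5], "big")


@lru_cache(maxsize=None)
def config_for(schema_id: int) -> dict:
    """Hypothetical placeholder for building a per-schema decoder config.

    The cache means the config is built once per schema ID,
    not once per message.
    """
    return {"schema_id": schema_id}


def group_by_schema(messages: Iterable[bytes]) -> Dict[int, List[bytes]]:
    """Bucket raw messages by the schema ID found in their header."""
    groups: Dict[int, List[bytes]] = defaultdict(list)
    for message in messages:
        groups[schema_id_of(message)].append(message)
    return dict(groups)


def frame(schema_id: int, payload: bytes) -> bytes:
    """Example-only helper: wrap a payload in a Confluent header."""
    return b"\x00" + schema_id.to_bytes(4, "big") + payload


batch = [frame(1, b"a"), frame(2, b"b"), frame(1, b"c")]
for schema_id, group in group_by_schema(batch).items():
    print(schema_id, len(group), config_for(schema_id))
```

Each group would then still need to be decoded separately, since the groups have different schemas.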

Is there something like this out of the box that I am missing?

Thank you very much.

cerveada (Maintainer) · 2022-06-20T06:51:48Z

Hello,

A Spark DataFrame has a single Spark schema (data type) of its own, so you cannot store rows with different schemas in one DataFrame.
Schema evolution is supported. What you provide is the reader schema, and ABRiS is able to obtain the writer schema from the ID in the payload and convert the received data automatically according to the schema evolution rules.
Also see: multiple-schemas-in-one-topic
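
As a rough illustration of the evolution rules (this is not ABRiS code), Avro's schema resolution ignores fields that exist only in the schema the data was written with, and fills fields that exist only in the expected schema from their defaults. The sketch below covers only top-level field addition and removal. The names `resolve_record`, `decoded`, and `expected_schema` are made up for the example.

```python
from typing import Any, Dict

_NO_DEFAULT = object()  # sentinel: tells "no default" apart from a default of None


def resolve_record(decoded: Dict[str, Any],
                   expected_schema: Dict[str, Any]) -> Dict[str, Any]:
    """Project a decoded record onto the schema the consumer expects.

    Simplified: handles only top-level field addition and removal.
    """
    resolved = {}
    for field in expected_schema["fields"]:
        name = field["name"]
        if name in decoded:
            resolved[name] = decoded[name]
        else:
            default = field.get("default", _NO_DEFAULT)
            if default is _NO_DEFAULT:
                raise ValueError(f"field {name!r} is missing and has no default")
            resolved[name] = default
    # Fields present only in the decoded record are dropped.
    return resolved


# Hypothetical schemas: the producer wrote "legacy_flag", the consumer expects "email"
expected = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
}
record = resolve_record({"id": 7, "legacy_flag": True}, expected)
print(record)
```

The result always has the shape of the expected schema, which is why a single DataFrame schema is enough in the evolution case.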

> I could implement a solution that manually parses the header of each message and dynamically constructs the schema configuration for each message (or for each group of messages, to avoid creating millions of config objects).

Yes, but how would you represent that in Spark? Since each row would then have a different data type, the rows cannot be in one DataFrame.
