Schema reference support #239

Open
dtoppani-twist opened this issue Jan 24, 2023 · 3 comments
Labels
enhancement New feature or request

Comments


dtoppani-twist commented Jan 24, 2023

Assuming an existing schema like:

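from dataclasses import dataclass
from dataclasses_avroschema import AvroModel

@dataclass
class Child(AvroModel):
    name: str
    class Meta:
        namespace = 'com.test'

and using this schema like:

@dataclass
class Parent(AvroModel):
    name: str
    child: Child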

When calling Parent.avro_schema(), the full child schema is included in the parent schema. I don't see an option to provide Child as a known schema so Parent would use com.test.Child as the type instead.

For example, being able to call Parent.avro_schema(known_types=[Child]) would allow using schema references that have previously been registered in a platform such as Confluent.
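To make the request concrete, here is a minimal sketch of the transformation a known_types option might perform. It post-processes a plain schema dict and does not use dataclasses_avroschema at all; the helper names use_references and fullname are hypothetical, and the parent_schema dict is hand-written to approximate what Parent.avro_schema() inlines today.

```python
NAMED_TYPES = ("record", "enum", "fixed")


def fullname(schema, enclosing_namespace=None):
    """Return the Avro fullname of a named schema."""
    name = schema["name"]
    if "." in name:
        return name
    namespace = schema.get("namespace", enclosing_namespace)
    return f"{namespace}.{name}" if namespace else name


def use_references(node, known, namespace=None, is_root=True):
    """Hypothetical helper: swap nested definitions of known types for name references."""
    if isinstance(node, list):  # union branches or a record's field list
        return [use_references(item, known, namespace, False) for item in node]
    if not isinstance(node, dict):
        return node  # primitive type name or an existing reference
    if node.get("type") in NAMED_TYPES:
        full = fullname(node, namespace)
        if full in known and not is_root:
            return full  # already registered elsewhere, so refer to it by name
        namespace = full.rpartition(".")[0] or None
    return {
        key: (use_references(value, known, namespace, False)
              if key in ("type", "items", "values", "fields") else value)
        for key, value in node.items()
    }


# Hand-written approximation of the schema with Child fully inlined (assumption).
parent_schema = {
    "type": "record",
    "name": "Parent",
    "fields": [
        {"name": "name", "type": "string"},
        {
            "name": "child",
            "type": {
                "type": "record",
                "name": "Child",
                "namespace": "com.test",
                "fields": [{"name": "name", "type": "string"}],
            },
        },
    ],
}

referenced = use_references(parent_schema, known={"com.test.Child"})
```

Under these assumptions, the child field of referenced carries the string com.test.Child as its type instead of the full record definition.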

@marcosschroh
Owner

Hi @dtoppani-twist

Thanks for your suggestion. Can you provide the final schema using Confluent? Would the child field be {"name": "child", "type": "com.test.Child"}?

marcosschroh added the enhancement label Apr 19, 2024
@Panaetius

It would also be nice to support this when going from schemas to Python classes.

from dataclasses_avroschema import ModelGenerator, ModelType

schemas = [
	{
		"type": "record",
		"name": "User",
		"namespace": "com.example",
		"fields": [
			{
				"name":"name",
				"type": "string"
			},
			{
				"name": "address",
				"type": "com.example.Address"
			}
		]
	},
	{
		"type": "record",
		"name": "Address",
		"namespace": "com.example",
		"fields": [
			{
				"name": "city",
				"type": "string"
			}
		]
	}
]


model_generator = ModelGenerator()
result = model_generator.render_module(
    schemas=schemas, model_type=ModelType.PYDANTIC.value 
)

with open("models.py", "w") as f:
    f.write(result)

should generate something like

class Address(BaseModel):
    city: str

    class Meta:
        namespace = 'com.example'

class User(BaseModel):
    name: str
    address: Address

    class Meta:
        namespace = 'com.example'

but gives an error like:

  File "fastavro/_schema.pyx", line 162, in fastavro._schema.parse_schema
  File "fastavro/_schema.pyx", line 173, in fastavro._schema.parse_schema
  File "fastavro/_schema.pyx", line 407, in fastavro._schema._parse_schema
  File "fastavro/_schema.pyx", line 475, in fastavro._schema.parse_field
  File "fastavro/_schema.pyx", line 267, in fastavro._schema._parse_schema
fastavro._schema_common.UnknownType: com.example.Address

On the fastavro side this is supported by passing a dictionary that maps fully-qualified names to schemas as the named_schemas argument of parse_schema, but more work is probably needed to fully support generation in such a scenario.
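Until something like that is supported, one possible workaround is to resolve the references before handing the schemas to the generator. The sketch below uses only the standard library: it builds a dictionary from fully-qualified names to schemas and inlines the first reference to each one. The helper names index_by_fullname and inline_references are hypothetical, and the sketch assumes every top-level record declares its namespace and that references are written fully qualified.

```python
PRIMITIVES = ("null", "boolean", "int", "long", "float", "double", "bytes", "string")


def index_by_fullname(schemas):
    """Map each top-level schema's fully-qualified name to its definition."""
    return {f'{s["namespace"]}.{s["name"]}': s for s in schemas}


def inline_references(node, named, defined):
    """Replace the first reference to each named schema with its definition.

    `defined` collects the names already defined in the output, because Avro
    allows a named type to be defined only once per schema.
    """
    if isinstance(node, str):
        if node in PRIMITIVES or node not in named or node in defined:
            return node  # primitive, unknown name, or already defined
        return inline_references(named[node], named, defined)
    if isinstance(node, list):  # union branches or a record's field list
        return [inline_references(item, named, defined) for item in node]
    if isinstance(node, dict):
        if node.get("type") == "record" and "namespace" in node:
            defined.add(f'{node["namespace"]}.{node["name"]}')
        return {
            key: (inline_references(value, named, defined)
                  if key in ("type", "items", "values", "fields") else value)
            for key, value in node.items()
        }
    return node


user = {
    "type": "record",
    "name": "User",
    "namespace": "com.example",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "address", "type": "com.example.Address"},
    ],
}
address = {
    "type": "record",
    "name": "Address",
    "namespace": "com.example",
    "fields": [{"name": "city", "type": "string"}],
}

named = index_by_fullname([user, address])
resolved_user = inline_references(named["com.example.User"], named, set())
```

Here resolved_user contains the Address record in place of the com.example.Address reference, so it no longer depends on a schema defined elsewhere. Whether the generator then emits the classes as hoped is not verified here.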

An additional caveat is that the same type name can appear more than once under different namespaces.

E.g.

	{
		"type": "record",
		"name": "User",
		"namespace": "com.example.v1",
		"fields": [...]
	},
	{
		"type": "record",
		"name": "User",
		"namespace": "com.example.v2",
		"fields": [...]
	}

Generating a single module from these would produce two conflicting classes named User. One way around that would be to generate multiple modules or folders based on the namespace.
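The namespace-based split could be sketched roughly as follows. This only groups the schemas and proposes a file path per namespace; the helper names group_by_namespace and module_path and the com/example/v1/models.py layout are assumptions, not something the library provides. Each group could then be rendered into its own module.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_namespace(schemas):
    """Group schemas by namespace so same-named records land in different modules."""
    groups = defaultdict(list)
    for schema in schemas:
        name = schema["name"]
        if "." in name:  # a dotted name carries its own namespace
            namespace = name.rpartition(".")[0]
        else:
            namespace = schema.get("namespace", "")
        groups[namespace].append(schema)
    return dict(groups)


def module_path(namespace, filename="models.py"):
    """Hypothetical layout: one folder per namespace segment."""
    if not namespace:
        return filename
    return str(PurePosixPath(*namespace.split("."), filename))


schemas = [
    {"type": "record", "name": "User", "namespace": "com.example.v1", "fields": []},
    {"type": "record", "name": "User", "namespace": "com.example.v2", "fields": []},
]

groups = group_by_namespace(schemas)
paths = {namespace: module_path(namespace) for namespace in groups}
```

With this layout the two User records end up in com/example/v1/models.py and com/example/v2/models.py, so the class names no longer collide.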

@marcosschroh
Owner

Hey, thanks for the comment. I will try to work on it.
