Attach additional metadata to dynamic partitions #13878
f-tremblay started this conversation in Ideas
Replies: 2 comments 4 replies
-
@spenczar also had a similar request
2 replies
-
@sryza Do you know if this is possible now?
2 replies
-
When creating runs for dynamic partitions, there is currently no way to define additional metadata other than the partition_key.
For some use cases, a single value is not enough to define a partition.
An example is dynamic time ranges: each partition is defined by an arbitrary start_datetime and end_datetime (plus the booleans include_start and include_end). Other examples are car make and model, or ML experiment parameters.
In these situations, the user is forced to concatenate the values (e.g. partition_key = f"{metadata_value_1}--{metadata_value_2}--{metadata_value_3}") and parse them inside the asset to extract them again. This workaround is neither convenient (you have to handle the parsing yourself) nor robust (there is no validation of types, missing values, etc.).
Another workaround would be to define a Pydantic model to represent the partition metadata, and use (de)serialization like my_partition_key = str(my_model.json()) and my_model = MyModel.parse_raw(my_partition_key). This workaround is more convenient and robust. However, it produces long partition names, which might not be ideal in the UI.
Ideally, Dagster would provide support for attaching additional metadata to a dynamic partition (or to the RunRequest).
This way, the partition_key would not have to be overloaded with a very long value (as with workaround #2), and the additional metadata values could be displayed and plotted in the Dagit partitions tab.
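To make the trade-off between the two workarounds concrete, here is a minimal sketch using only the standard library. The TimeRange class and the helper function names are hypothetical, and a dataclass with json stands in for the Pydantic model mentioned above, so this is not the exact Pydantic round trip.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime

SEP = "--"  # delimiter taken from the f-string example in the post


@dataclass(frozen=True)
class TimeRange:
    """Hypothetical partition definition; field names follow the time-range example."""
    start_datetime: datetime
    end_datetime: datetime
    include_start: bool = True
    include_end: bool = False


def to_concat_key(r: TimeRange) -> str:
    # Workaround 1: join every value into a single delimited string.
    return SEP.join([
        r.start_datetime.isoformat(),
        r.end_datetime.isoformat(),
        str(r.include_start),
        str(r.include_end),
    ])


def from_concat_key(key: str) -> TimeRange:
    # Parsing is manual: field order and types live only in this function.
    # A key with the wrong number of parts raises ValueError on unpacking.
    start, end, inc_start, inc_end = key.split(SEP)
    return TimeRange(
        datetime.fromisoformat(start),
        datetime.fromisoformat(end),
        inc_start == "True",
        inc_end == "True",
    )


def to_json_key(r: TimeRange) -> str:
    # Workaround 2 (stand-in for the Pydantic model): serialize to JSON.
    payload = asdict(r)
    payload["start_datetime"] = r.start_datetime.isoformat()
    payload["end_datetime"] = r.end_datetime.isoformat()
    return json.dumps(payload)


def from_json_key(key: str) -> TimeRange:
    # Field names travel with the key, so a missing field raises KeyError.
    data = json.loads(key)
    return TimeRange(
        start_datetime=datetime.fromisoformat(data["start_datetime"]),
        end_datetime=datetime.fromisoformat(data["end_datetime"]),
        include_start=data["include_start"],
        include_end=data["include_end"],
    )


week = TimeRange(datetime(2024, 1, 1), datetime(2024, 1, 8, 12, 30))
concat_key = to_concat_key(week)
json_key = to_json_key(week)
print(concat_key)
print(json_key)
```

Both keys round-trip to the same object, but the JSON key is noticeably longer, which is the UI concern raised above.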
RunRequest would take an additional argument (e.g. a dict, a dataclass or a Pydantic model) for metadata, and it would be accessible from the asset context (e.g. context.partition_metadata or context.run_config). When creating RunRequests (e.g. from a sensor), a partition would still be defined with a partition_key, and could also take an optional metadata parameter (or maybe run_config could be used?).
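As a rough illustration of the requested behaviour, the sketch below mocks the idea in plain Python. None of it is Dagster's API: SketchRunRequest, SketchContext, launch, and the metadata and partition_metadata names are all hypothetical stand-ins for the proposed RunRequest argument and context attribute. The car values are made up, following the make-and-model example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class SketchRunRequest:
    """Hypothetical stand-in for RunRequest with the proposed metadata argument."""
    partition_key: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class SketchContext:
    """Hypothetical stand-in for the asset context exposing partition_metadata."""
    partition_key: str
    partition_metadata: Dict[str, Any]


def car_sensor() -> List[SketchRunRequest]:
    # The partition key stays short; structured values travel alongside it.
    return [
        SketchRunRequest("car-001", {"make": "Toyota", "model": "Corolla"}),
        SketchRunRequest("car-002", {"make": "Honda", "model": "Civic"}),
    ]


def car_asset(context: SketchContext) -> str:
    # No string parsing: values are read directly from the metadata.
    meta = context.partition_metadata
    return f"{meta['make']} {meta['model']}"


def launch(
    requests: List[SketchRunRequest],
    asset: Callable[[SketchContext], str],
) -> Dict[str, str]:
    # Toy runner: hands each request's metadata to the asset via its context.
    return {
        r.partition_key: asset(SketchContext(r.partition_key, r.metadata))
        for r in requests
    }


results = launch(car_sensor(), car_asset)
print(results)
```

With this shape, the key shown in the UI stays readable while the asset still receives typed, named values.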