Problem Statement
Our compile API tends to limit its scope to the conf in question. This has the negative effect of encouraging append-only materialized configs, and it makes deprecating group bys quite a challenge. Deleting a group by can affect joins that are already materialized, and because users are expected to handle deletion themselves without going through the API, the process is prone to mistakes.
Similarly, when users build a new group by on a source already used by other group bys, they commonly need similar job tuning settings (such as a minimum number of cores to make sure the streaming job processes all partitions). Pointing users to similar group bys during compile could reduce duplication of group bys, or even improve feature discoverability.
Requirements
A delete mode for group bys that stops uploads and streaming jobs whose output is no longer consumed.
A simple reference to similar group bys (i.e. group bys with the same source table and the same key) when developing a group by.
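As an illustration of the second requirement, the sketch below groups group bys by source table and key set and looks up the ones that match a new candidate. It is a minimal sketch, not the compiler's implementation: `GroupByInfo`, `index_by_source_and_keys`, `similar_group_bys`, and all table and group by names are hypothetical stand-ins for whatever the compiled objects actually expose.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupByInfo:
    """Hypothetical, simplified view of a compiled group by."""
    name: str
    source_table: str
    keys: tuple


def index_by_source_and_keys(group_bys):
    """Map (source table, sorted keys) to the names of group bys sharing them."""
    index = defaultdict(list)
    for gb in group_bys:
        index[(gb.source_table, tuple(sorted(gb.keys)))].append(gb.name)
    return index


def similar_group_bys(candidate, existing):
    """Return names of existing group bys with the candidate's source table and keys."""
    index = index_by_source_and_keys(existing)
    signature = (candidate.source_table, tuple(sorted(candidate.keys)))
    return [name for name in index.get(signature, []) if name != candidate.name]


# Made-up example data: two existing group bys share the candidate's source and key.
existing = [
    GroupByInfo("team_a.purchases_v1", "data.purchases", ("user_id",)),
    GroupByInfo("team_b.purchases_by_user", "data.purchases", ("user_id",)),
    GroupByInfo("team_a.returns_v1", "data.returns", ("user_id",)),
]
candidate = GroupByInfo("team_c.purchases_new", "data.purchases", ("user_id",))
similar = similar_group_bys(candidate, existing)
```

During compile, a non-empty `similar` list could be printed as a hint so the author can reuse an existing group by or copy its job tuning settings.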
Verification
Unit tests
Approach
One approach is to collect the state of the production folder before compiling the conf in question.
This makes it simple to point out similar group bys, because objects can be linked together during compilation. It could also be the basis for an explore.py-like API, or even the backend for a Python API discoverability UI. Linking objects would also make the delete mode fairly trivial. This is probably worth another CHIP, but a compiler that links the objects would allow decomposing a feature name into its first principles.
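To make the linking idea concrete, here is a minimal sketch of a reverse index from group bys to the joins that consume them, and a delete check built on top of it. The input shape (a dict of join name to group by names) and the functions `link_joins_to_group_bys` and `check_delete` are assumptions for illustration only; the real compiler would read this information from the compiled objects in the production folder.

```python
from collections import defaultdict


def link_joins_to_group_bys(joins):
    """Build a reverse index: group by name -> names of joins that reference it.

    `joins` maps a join name to the list of group by names it uses
    (a hypothetical, simplified stand-in for compiled join configs).
    """
    consumers = defaultdict(set)
    for join_name, group_by_names in joins.items():
        for gb_name in group_by_names:
            consumers[gb_name].add(join_name)
    return consumers


def check_delete(group_by_name, joins):
    """Return (ok, blocking_joins); ok is True only if no join uses the group by."""
    consumers = link_joins_to_group_bys(joins)
    blocking = sorted(consumers.get(group_by_name, set()))
    return (not blocking, blocking)


# Made-up production state: one group by is used by two joins, another by none.
production_joins = {
    "team_a.checkout_join": ["team_a.purchases_v1", "team_a.returns_v1"],
    "team_b.risk_join": ["team_a.purchases_v1"],
}
ok_unused, blockers_unused = check_delete("team_c.legacy_v0", production_joins)
ok_used, blockers_used = check_delete("team_a.purchases_v1", production_joins)
```

With an index like this, a delete mode could refuse to remove a group by while any join still references it and list the blocking joins for the user.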
Alternative approaches - and the reason for discarding those approaches.
User API (when required)
To delete: compile.py --conf production/group_bys/... --delete
Planning
TBD
cristianfr changed the title from "[WIP][CHIP] compile.py [delete mode + similar group bys]" to "[WIP][CHIP] compile.py improvements: object linking" on Feb 28, 2024