As a first solution for #466, we need to force users to provide columnStats when indexing tables with any of the following characteristics:
Underlying data source changes constantly.
DataFrame contains non-deterministic columns to index.
DataFrame contains non-deterministic predicates.
Explicit columnStats provide the data's min/max values before the DataFrame analysis. Without them, the DataFrame is loaded twice for indexing, which can produce inconsistent results in any of the above use cases.
The idea is to require the user to specify columnStats explicitly when the query source is non-deterministic.
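A rough sketch of what such a check could look like, in plain Python so it runs without Spark. Everything here is hypothetical and only illustrates the rule described above: `SourceTraits`, `validate_column_stats`, and the `<column>_min` / `<column>_max` key layout inside the `columnStats` JSON are assumptions, not the project's actual implementation. How the three traits would be detected is exactly the open question of this issue, so they are passed in as plain flags.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceTraits:
    """Hypothetical flags for the three cases listed in this issue."""
    source_changes: bool = False
    nondeterministic_columns: bool = False
    nondeterministic_predicates: bool = False

    @property
    def is_nondeterministic(self) -> bool:
        # Any single trait is enough to make two loads of the data disagree.
        return (
            self.source_changes
            or self.nondeterministic_columns
            or self.nondeterministic_predicates
        )


def validate_column_stats(traits, columns_to_index, options):
    """Raise ValueError if a non-deterministic source lacks explicit columnStats.

    `options` is a plain dict of writer options; `columnStats` is assumed to be
    a JSON string with `<column>_min` and `<column>_max` keys.
    """
    if not traits.is_nondeterministic:
        return  # Deterministic source: inferring min/max is safe.

    raw = options.get("columnStats")
    if raw is None:
        raise ValueError(
            "columnStats must be provided when indexing a non-deterministic source"
        )

    stats = json.loads(raw)
    # Every indexed column needs both bounds, otherwise they would be inferred.
    missing = [
        f"{column}_{bound}"
        for column in columns_to_index
        for bound in ("min", "max")
        if f"{column}_{bound}" not in stats
    ]
    if missing:
        raise ValueError(f"columnStats is missing entries: {', '.join(missing)}")
```

Failing early with an explicit error, instead of silently inferring min/max, keeps the behavior predictable: the write either uses the bounds the user supplied or does not start at all.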
Before: analyze to what extent it is possible to know in advance whether a column or query is deterministic.
osopardo1 changed the title from "Error control + enforce columnStats when indexing non-deterministic or source-changing DataFrames" to "Error control when indexing non-deterministic or source-changing DataFrames" on Dec 4, 2024
osopardo1 changed the title from "Error control when indexing non-deterministic or source-changing DataFrames" to "Error control when indexing Non-Deterministic Source Queries" on Dec 10, 2024