-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update semantic model length validation for literal retrieval #72
Conversation
@@ -19,17 +72,40 @@ def validate_context_length(model: ProtoMsg, throw_error: bool = False) -> None: | |||
""" | |||
|
|||
model.ClearField("verified_queries") | |||
_truncate_sample_values(model) | |||
num_search_services = _count_search_services(model) | |||
|
|||
yaml_str = proto_to_yaml(model) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if this'll happen often: but i don't recall we strip any whitespaces during loading yaml/convert to proto; Shall we add stripping for whitespaces?
We can potentially add to proto field options like what Daniel had here: https://github.com/snowflakedb/cortex/pull/114221/files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if whitespace is a problem, have you seen any issues with it? For yaml -> proto, I don't know of strictyaml
not being robust to extra white space. For proto -> yaml, I don't see where we would output excess whitespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, probably strictyaml
handles that. Can you help add a todo at the top of this file just reminder to keep an eye on the white spaces?
Now that we can do literal retrieval based on sample values, we no longer will show all sample values to the model by default. Update the model length verification to take this into account.