Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update semantic model length validation for literal retrieval #72

Merged
merged 5 commits into from
Jul 2, 2024

Conversation

sfc-gh-nlimtiaco
Copy link
Collaborator

Now that we can do literal retrieval based on sample values, we no longer will show all sample values to the model by default. Update the model length verification to take this into account.

@@ -19,17 +72,40 @@ def validate_context_length(model: ProtoMsg, throw_error: bool = False) -> None:
"""

model.ClearField("verified_queries")
_truncate_sample_values(model)
num_search_services = _count_search_services(model)

yaml_str = proto_to_yaml(model)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if this'll happen often: but i don't recall we strip any whitespaces during loading yaml/convert to proto; Shall we add stripping for whitespaces?
We can potentially add to proto field options like what Daniel had here: https://github.com/snowflakedb/cortex/pull/114221/files

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if whitespace is a problem, have you seen any issues with it? For yaml -> proto, I don't know of strictyaml not being robust to extra white space. For proto -> yaml, I don't see where we would output excess whitespace.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, probably strictyaml handles that. Can you help add a todo at the top of this file just reminder to keep an eye on the white spaces?

@sfc-gh-nlimtiaco sfc-gh-nlimtiaco merged commit 139f2bb into main Jul 2, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants