Fix issues in ResponseMatching operator #589
Conversation
Could you also let me know how I can update the same here: https://docs.haystack.deepset.ai/docs/uptrainevaluator
Many thanks for the PR, @Dominastorm. It seems to fix the issue with the response matching metric, but I'm unfortunately still seeing some other metrics fail with the following error.
Log:
Is this something you can look into as well? We can extend the scope of this PR to include them, for what it's worth. On a related note, can/does UpTrain version its API endpoints? That would help prevent such retroactive failures in the future.
The issue was that […] was causing the evaluations to fail. Changing it to […] resolved this. We do not currently version our endpoints, but that's a good suggestion and we will look into it!
I was very surprised that changing those lines fixed the issue, because none of those tests were actually invoking your client to begin with (also, in some cases, a mock client/backend is being used). Digging in a bit more, I discovered that […]. This is very problematic. Is there a good reason why this is being done? The code shouldn't be modifying the user/execution environment unless it has an extremely good reason to do so, and definitely not silently. It's very insidious behaviour, as it can break other code that's executing side-by-side in a hard-to-debug manner. Speculating here, but maybe this code was primarily written to be executed in your server/deployment environment, where making such modifications is handled gracefully? Furthermore, all of this happens even before your client is invoked, which also violates the principle of least surprise. In any event, this appears to be the underlying issue behind the above failures. However, I'm still stumped as to why only some of the metrics fail during the integration test even when the API token is junk. Could you shed some light on this?
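The anti-pattern being discussed (a library silently overwriting the caller's environment variables at import/setup time) can be sketched generically. The function names and the `sk-placeholder` value below are hypothetical illustrations, not the library's actual code:

```python
import os

# Hypothetical anti-pattern: silently mutating the caller's environment.
# Every other library sharing this process now sees the overwritten key,
# which is exactly the hard-to-debug side effect described above.
def configure_badly():
    os.environ["OPENAI_API_KEY"] = "sk-placeholder"

# Safer alternative: accept configuration explicitly and fall back to
# reading (not writing) the environment, leaving os.environ untouched.
def configure_safely(api_key=None):
    return api_key if api_key is not None else os.environ.get("OPENAI_API_KEY")
```

The safe variant keeps configuration local to the library, so side-by-side code and mock-backed tests are unaffected.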
You are correct. This was being done as it is handled gracefully on our servers. The removal of the assignments from the Settings class is in our pipeline. However, since it is a breaking change, it will require extensive testing from our end to make sure that it does not cause any of our current code to fail.
This happened because we didn't have a check for valid OpenAI keys; we have added one in our latest release. The four evaluations that failed gave an Authentication Error. The rest, however, returned dictionaries with None values. This was an oversight on our end, which we have now resolved. With the current code, all of them should throw an error when an invalid key is passed.

This also brings up a shortcoming of the tests being performed: they are not checking whether the result contains valid data, only whether the data type is correct. So that is another change that needs to be made.

As for upgrading to the latest version of UpTrain, I will make those changes as well. Upgrading to v0.6.9 should primarily change two things: […]
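The test shortcoming noted above (type checks passing even when every score is None) could be tightened with value-level assertions. This is a hedged sketch, not the repository's actual test code; the metric name in the sample data is illustrative:

```python
def check_scores(results):
    """Fail if any metric score is missing or out of range,
    not merely if the container has the wrong type."""
    for row in results:
        assert isinstance(row, dict)
        for metric, score in row.items():
            # A bare type check would happily accept a dict of None values.
            assert score is not None, f"metric {metric!r} returned no score"
            assert 0.0 <= score <= 1.0, f"metric {metric!r} out of range: {score}"
```

With this check, an invalid API key that silently yields `{"score_...": None}` fails the test instead of slipping through.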
I will make the above changes and support the latest version of UpTrain. We will take up resolving the environment variables at a later stage, once we ensure that it is stable on our end.
Hey @Dominastorm. I'm wondering if there is any update on this?
Hey @shadeMe, you can merge it now. It will work for uptrain>=0.6.13
Closing as the ownership of this integration has been transferred over to UpTrain. |
Hello everyone,
Apologies for the inconvenience experienced with #564 and #571.
In this pull request, I've included `question` as a mandatory column for the ResponseMatching operator. This update addresses the challenges encountered with uptrain==0.5.0. We are currently in the process of making further modifications, so we think it best to support the latest version of uptrain later. However, for the time being, this change should resolve the issues. Thanks!