proposal: rag evaluation results presentation #7462
We're almost there. 👍 I am not sure about the idea of passing other in comparative_summary(self, other: "RAGPipelineEvaluation"), but we can iterate on it later. I wouldn't block the proposal because of that. Better to start with the implementation.
One decision that should be made when implementation of this proposal starts is whether we want to rely on pandas. As part of that, we should check whether the Document class should depend on pandas. We could definitely implement the new eval features without pandas and offer a single data export option that makes it easy for advanced users to use pandas if they want to.
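To illustrate the pandas-free option with a single export path, here is a minimal sketch. It is not the proposal's API: the class name EvaluationResultSketch, its methods, and the exact_match metric are all hypothetical and only show the shape of the idea. Results live in plain Python structures, aggregation uses the standard library, and pandas is imported only inside the export method.

```python
import csv
import io
from typing import Any, Dict, List


class EvaluationResultSketch:
    """Hypothetical container; not part of the proposal or of any library API."""

    def __init__(self, rows: List[Dict[str, Any]]):
        # Plain list of dicts: one dict per evaluated query.
        self.rows = rows

    def mean(self, metric: str) -> float:
        # Aggregate a metric with plain Python, no pandas needed.
        values = [row[metric] for row in self.rows]
        return sum(values) / len(values)

    def to_csv(self) -> str:
        # Stdlib-only export that any tool, including pandas, can read back.
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=list(self.rows[0].keys()))
        writer.writeheader()
        writer.writerows(self.rows)
        return buffer.getvalue()

    def to_pandas(self):
        # pandas is imported lazily, so it stays an optional dependency.
        import pandas as pd

        return pd.DataFrame(self.rows)


# Made-up example data, for illustration only.
result = EvaluationResultSketch(
    [
        {"query": "q1", "exact_match": 1.0},
        {"query": "q2", "exact_match": 0.0},
    ]
)
print(result.mean("exact_match"))
```

With this shape, users who never call to_pandas do not need pandas installed, while advanced users get a DataFrame in one call.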
One change request: could you map the user stories directly to the methods, please? For example, that mapping should explain when the user uses find_thresholds for one of the stories from the issue.
I left some minor comments in the proposal too.
Co-authored-by: Madeesh Kannan <[email protected]>
Just a couple of minor changes before it's good to merge (from my side) 🎉
Co-authored-by: Madeesh Kannan <[email protected]>
@julian-risch @mrm1001 do you want to add or suggest anything else? If not, I will merge it.
Proposal for a presentation of RAG pipeline evaluation results