Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: rag evaluation results presentation #7462

Merged
merged 18 commits into from
Apr 8, 2024
Merged

Conversation

davidsbatista
Copy link
Contributor

@davidsbatista davidsbatista commented Apr 4, 2024

Proposal for a presentation of RAG pipeline evaluation results

@davidsbatista davidsbatista requested review from a team as code owners April 4, 2024 07:52
@davidsbatista davidsbatista requested review from dfokina and shadeMe and removed request for a team April 4, 2024 07:52
@github-actions github-actions bot added proposal 2.x Related to Haystack v2.0 labels Apr 4, 2024
@shadeMe shadeMe added the ignore-for-release-notes PRs with this flag won't be included in the release notes. label Apr 4, 2024
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
Copy link
Member

@julian-risch julian-risch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're almost there. 👍 I am not sure about the idea of passing other incomparative_summary(self, other: "RAGPipelineEvaluation"): but we can iterate on it later. I wouldn't block the proposal because of that. Better to start with the implementation.
One decision that should be made when implementation of this proposal starts is whether we want to rely on pandas or not. We should check whether we want the Document class to rely on pandas or not for that. We could definitely implement the new eval features without pandas and have one data export option that makes it easy for advanced users to use pandas if they want to.
One change request: could you map the user stories directly to the methods please? For example, that mapping should explain when the user uses find_thresholds for one of the stories from the issue.
I left some minor comments in the proposal too.

proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
@davidsbatista
Copy link
Contributor Author

We could definitely implement the new eval features without pandas and have one data export option that makes it easy for advanced users to use pandas if they want to
@julian-risch I agree with you, I would make clear that we don't want pandas, and have JSON that can easily be transformed into a pandas

Copy link
Contributor

@shadeMe shadeMe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of minor changes before it's good to merge (from my side) 🎉

proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
proposals/text/7462-rag-evaluation.md Outdated Show resolved Hide resolved
@davidsbatista
Copy link
Contributor Author

@julian-risch @mrm1001 do you want to add, suggest anything else? if not I will merge it

@davidsbatista davidsbatista merged commit 1b10a83 into main Apr 8, 2024
7 checks passed
@davidsbatista davidsbatista deleted the rag-dataset-eval branch April 8, 2024 16:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.x Related to Haystack v2.0 ignore-for-release-notes PRs with this flag won't be included in the release notes. proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants