diff --git a/docs/changelog/main.mdx b/docs/changelog/main.mdx
index f7dc41850c..41368020fd 100644
--- a/docs/changelog/main.mdx
+++ b/docs/changelog/main.mdx
@@ -2,9 +2,38 @@
 title: "Changelog"
 ---
+## v0.8.0 - Revamping evaluation
+*22nd January 2024*
+
+We've spent the past month re-engineering our evaluation workflow. Here's what's new:
+
+**Running Evaluations**
+
+1. Simultaneous Evaluations: You can now run multiple evaluations for different app variants and evaluators concurrently.
+
+
+
+2. Rate Limit Parameters: Specify rate limit parameters for evaluations and retries to get reliable results without exceeding OpenAI rate limits.
+
+
+
+3. Reusable Evaluators: Configure evaluators such as similarity match, regex match, or AI critique once and reuse them across multiple evaluations.
+
+
+
+**Evaluation Reports**
+
+1. Dashboard Improvements: We've upgraded the dashboard interface to display evaluation results more clearly. You can now filter and sort results by evaluator, test set, and outcome.
+
+
+
+2. Comparative Analysis: Select multiple evaluation runs and view the results of different LLM applications side by side.
+
+
+
 ## v0.7.1 - Adding Cost and Token Usage to the Playground
-*12th January 2023*
+*12th January 2024*
 This change requires you to pull the latest version of the agenta platform if you're using the self-serve version.
diff --git a/docs/images/changelog/eval_1.png b/docs/images/changelog/eval_1.png
new file mode 100644
index 0000000000..fc83d2d324
Binary files /dev/null and b/docs/images/changelog/eval_1.png differ
diff --git a/docs/images/changelog/eval_2.png b/docs/images/changelog/eval_2.png
new file mode 100644
index 0000000000..4896269cd4
Binary files /dev/null and b/docs/images/changelog/eval_2.png differ
diff --git a/docs/images/changelog/eval_3.png b/docs/images/changelog/eval_3.png
new file mode 100644
index 0000000000..46d74d8fe3
Binary files /dev/null and b/docs/images/changelog/eval_3.png differ
diff --git a/docs/images/changelog/eval_4.png b/docs/images/changelog/eval_4.png
new file mode 100644
index 0000000000..4f8f4796d4
Binary files /dev/null and b/docs/images/changelog/eval_4.png differ
diff --git a/docs/images/changelog/eval_5.png b/docs/images/changelog/eval_5.png
new file mode 100644
index 0000000000..1b7dc9064d
Binary files /dev/null and b/docs/images/changelog/eval_5.png differ