Merge pull request #1269 from Agenta-AI/docs/changelog
changelog 0.8.0
mmabrouk authored Jan 24, 2024
2 parents 11a37ff + 0153b8e commit 03f0d8d
Showing 6 changed files with 30 additions and 1 deletion.
31 changes: 30 additions & 1 deletion docs/changelog/main.mdx
@@ -2,9 +2,38 @@
title: "Changelog"
---

## v0.8.0 - Revamping evaluation
*22nd January 2024*

We've spent the past month re-engineering our evaluation workflow. Here's what's new:

**Running Evaluations**

1. Simultaneous Evaluations: You can now run multiple evaluations for different app variants and evaluators concurrently.

<img height="600" src="/images/changelog/eval_1.png" />

2. Rate Limit Parameters: Specify rate limit parameters for evaluations and retries to get reliable results without exceeding OpenAI rate limits.

<img height="600" src="/images/changelog/eval_2.png" />

3. Reusable Evaluators: Configure evaluators such as similarity match, regex match, or AI critique and use them across multiple evaluations.

<img height="600" src="/images/changelog/eval_3.png" />
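
The retry behavior that rate limit parameters control (item 2 above) can be illustrated with a minimal Python sketch. This is not agenta's implementation: `RateLimitError`, `call_with_retries`, and the parameter names `max_retries`, `retry_delay`, and `backoff` are hypothetical names chosen for illustration, and the actual settings exposed in the evaluation UI may differ.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's 'too many requests' error."""


def call_with_retries(
    call: Callable[[], T],
    max_retries: int = 3,
    retry_delay: float = 1.0,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on RateLimitError with a growing delay.

    All parameter names are assumptions made for this sketch.
    """
    delay = retry_delay
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            sleep(delay)      # wait before the next attempt
            delay *= backoff  # wait longer after each failure
    raise AssertionError("unreachable")
```

Passing `sleep` as an argument keeps the sketch easy to test without real waiting. A larger `retry_delay` or `backoff` trades evaluation speed for a lower chance of hitting the provider's limit again.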

**Evaluation Reports**

1. Dashboard Improvements: We've upgraded our dashboard interface to better display evaluation results. You can now filter and sort results by evaluator, test set, and outcomes.

<img height="600" src="/images/changelog/eval_4.png" />

2. Comparative Analysis: Select multiple evaluation runs and view the results of different LLM applications side by side.

<img height="600" src="/images/changelog/eval_5.png" />

## v0.7.1 - Adding Cost and Token Usage to the Playground

*12th January 2023*
*12th January 2024*
<Warning> This change requires you to pull the latest version of the agenta platform if you're using the self-serve version.</Warning>

<img height="600" src="/images/changelog/screenshot_cost_and_token_usage.png" />
Binary file added docs/images/changelog/eval_1.png
Binary file added docs/images/changelog/eval_2.png
Binary file added docs/images/changelog/eval_3.png
Binary file added docs/images/changelog/eval_4.png
Binary file added docs/images/changelog/eval_5.png
