Release v0.23.0 · Agenta-AI/agenta

What's Changed

Test: modify single model evaluation scores by @ashrafchowdury in #1981
Test: upload testset as CSV by @ashrafchowdury in #1979
fix(backend): fix JSON evaluator description by @mmabrouk in #1990
Test: save single model testset by @ashrafchowdury in #1980
fix(tool): AGE-612 fix dependencies with langchain_community by @mmabrouk in #1992
AGE-486 - SDK v3 - Configuration management with Pydantic by @mmabrouk in #1961
feat(backend): AGE-276 and AGE-471 Improves LLM-as-a-judge reliability by @mmabrouk in #1938
Bump versions by @github-actions in #1996

Full Changelog: v0.22.0...v0.23.0