v0.2.3
What's Changed
- feat: measure cost of testset generator by @jjmachan in #1560
- docs: added luka's blogs by @jjmachan in #1554
- Fix: add LLMContextPrecisionWithReference to __init__.py by @licux in #1561
- chore: add verbose option (V=1) to make targets by @trevorbowen in #1556
- add embeddings to TestsetGenerator by @hunter-walden2113 in #1562
- fixed verb agreement @ available_metrics by @gabrielhomsi in #1574
- Fix: Limit number of retries for parse failures by @rskew in #1569
- Fix: reference key error in LLMContextPrecisionWithoutReference by @xizhou-vw in #1570
- Updated return type description for evaluate function by @taihim in #1579
- Feat: add multimodal eval support by @Yunnglin in #1559
- fix: add reference_topics as default required columns in TopicAdherenceScore #1564 by @luqmansen in #1566
- fix: add reference tool call to required cols by @shahules786 in #1580
- Improve efficiency in factual correctness for precision mode by @Jeff-67 in #1578
- This commit implements the F-beta score metric by @Yuri-Albuquerque in #1543
- fix: agent goal accuracy by @shahules786 in #1583
- chores: fix pypi rendering by @shahules786 in #1581
- fix: typo: ROUGE is a metric, ROGUE is a scoundrel by @ahgraber in #1585
New Contributors
- @trevorbowen made their first contribution in #1556
- @hunter-walden2113 made their first contribution in #1562
- @gabrielhomsi made their first contribution in #1574
- @rskew made their first contribution in #1569
- @xizhou-vw made their first contribution in #1570
- @taihim made their first contribution in #1579
- @luqmansen made their first contribution in #1566
- @Yuri-Albuquerque made their first contribution in #1543
Full Changelog: v0.2.2...v0.2.3