
Optimization of non-GPT 4 major model outputs #548

Closed

warlockedward opened this issue Mar 29, 2024 · 1 comment

Comments

@warlockedward
I am currently trying to use the Qwen1.5-72B, deepseek-33b, and mixtral-8x7b models to drive Mentat, but the answers they give always contain errors: misunderstandings and inaccurate code modifications. I'm not sure what is causing them. Is there any planned support for targeting non-GPT-4 models? Thank you very much.

@biobootloader
Member

In our testing, no models other than GPT-4 and Claude 3 Opus can handle the complex edit format that Mentat requires.

We do have some changes coming that might make things easier for local models, though. This experiment is a step in that direction: #530
