From 6862f88c8cfc41e05a42500b4aed60b059a82fcd Mon Sep 17 00:00:00 2001
From: IcyFeather <mengzhuo.happy@gmail.com>
Date: Fri, 19 Jul 2024 11:17:03 +0800
Subject: [PATCH] update llm benchmark proposal

Signed-off-by: IcyFeather <mengzhuo.happy@gmail.com>
---
 .../scenarios/llm-benchmarks/llm-benchmarks.md         | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/proposals/scenarios/llm-benchmarks/llm-benchmarks.md b/docs/proposals/scenarios/llm-benchmarks/llm-benchmarks.md
index 5689a530..57b31349 100644
--- a/docs/proposals/scenarios/llm-benchmarks/llm-benchmarks.md
+++ b/docs/proposals/scenarios/llm-benchmarks/llm-benchmarks.md
@@ -409,6 +409,8 @@ BenchMark-related information and data should be stored separately to stay stable
 }
 ```
 
+If other prompt information is needed, it can be added here as well.
+
 Whether ZeroShot, OneShot, or FewShot is used, each one works by adding entries to the chat message history; each model can implement this part itself.
 
 chat history:
@@ -450,13 +452,13 @@ print(tokenizer.decode(tokenized_chat[0]))
 
 ## Schedule
 
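-- June to mid-July
+- **June to mid-July**
 Integrate the OpenCompass project and implement an LLM single task learning example on Ianvs
-- Mid-July to mid-August
+- **Mid-July to mid-August**
 Using a government-affairs dataset as an example, build a test set, test metrics, a test environment, and a usage guide
-- Mid-August to mid-September
+- **Mid-August to mid-September**
 Optimize evaluation on Ianvs and implement features such as task monitoring and visualization
-- Mid-September to the end of September
+- **Mid-September to the end of September**
 If time and capacity allow, implement test suites for industrial/medical large models, including metrics and examples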