[Feat] Support Knowledge-based Retriever #348
Open
+421
−0
Thank you very much for your contributions to the community. The open-compass/opencompass project is truly outstanding, and I plan to build further research on top of it.
In this Pull Request, I introduce a new `KnowledgeRetriever` that builds upon LangChain [Code] to incorporate a knowledge base into LLM evaluation. This feature achieves the following:
- `knowledge_docs` parameter in `infer_cfg`.
- `retrieve_keys` in `infer_cfg`.
- `opencompass/openicl/icl_retriever/icl_knowledge_retriever.py`, adding just a new `icl_retriever`.

The configs and methods in this PR follow the existing framework, with clear comments and a consistent coding style. This PR adds a LangChain option to the retrievers, which significantly reduces LLM hallucination on some test questions. It also makes it possible to evaluate an LLM's ability to summarize relevant existing knowledge.
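To illustrate how the two settings could fit together, here is a minimal, self-contained sketch. It is not the code in this PR and does not use LangChain: a simple character-overlap score stands in for the real retrieval backend. It assumes that `retrieve_keys` names the dataset fields whose values form the retrieval query and that `knowledge_docs` is available as a list of text snippets. The helper names (`build_query`, `overlap_score`, `retrieve`) and the example field names are hypothetical.

```python
from typing import Dict, List


def build_query(example: Dict[str, object], retrieve_keys: List[str]) -> str:
    """Concatenate the values of the chosen fields into one query string."""
    parts: List[str] = []
    for key in retrieve_keys:
        value = example[key]
        # A field may hold one string or a list of strings (e.g. candidates).
        if isinstance(value, (list, tuple)):
            parts.extend(str(v) for v in value)
        else:
            parts.append(str(value))
    return " ".join(parts)


def overlap_score(query: str, doc: str) -> int:
    """Count distinct non-space characters shared by query and document.

    Toy stand-in for a real similarity search; an assumption, not the PR's method.
    """
    return len((set(query) & set(doc)) - {" "})


def retrieve(query: str, knowledge_docs: List[str], top_k: int = 2) -> List[str]:
    """Return the top_k documents with the highest overlap score."""
    ranked = sorted(
        knowledge_docs, key=lambda doc: overlap_score(query, doc), reverse=True
    )
    return ranked[:top_k]


# Hypothetical in-memory knowledge base and dataset example.
knowledge_docs = [
    "【物归原主】把东西归还原来的主人。",
    "【改朝换代】旧的朝代为新的朝代所代替。",
    "【天南海北】①形容距离遥远的不同地区。",
]
example = {"content": "...", "candidates": ["改朝换代"]}

query = build_query(example, retrieve_keys=["candidates"])
top_docs = retrieve(query, knowledge_docs, top_k=1)
```

Building the query from the candidate idioms rather than the whole passage keeps the lookup focused on the entries that define each option, which matches the kind of reference text shown in the sample prompt.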
In the `configs/eval_demo_knowledge.py` file, I provide an example configuration for the `KnowledgeRetriever` using the `FewCLUE_chid` dataset (choosing the correct idiom for a given context). The knowledge base can be found here: Knowledge Base Link (Extraction Code: 0g25; please put it in `./data/`). An example of the resulting prompt:
"origin_prompt": "
以下是参考内容:【严陈以待】见“严阵以待”。 【严阵以待】亦作“严陈以待”。 谓以严整的阵势,等待着敌人进犯,予以打击。; 借指改朝换代。 多指改朝换代。 【改朝换代】旧的朝代为新的朝代所代替。; 【不同戴天】同“不共戴天”。 【不共戴天】谓不共存于人世间。; 【海北天南】形容距离很远。 【天南海北】①形容距离遥远的不同地区。; 【物归原主】把东西归还原来的主人。; 后用“波谲云诡”以喻文章如波云变化多致。 【云谲波诡】谓像云气和水波那样千态万状,变化无穷。 【波谲云诡】①汉扬雄《甘泉赋》:“於是大厦云谲波诡,摧摧而成观。”; 【视远步高】高视阔步。 【高步阔视】同“高视阔步”。 【高视阔步】形容气宇轩昂或态度傲慢。,结合上述参考内容,考虑接下来的问题:
这意味着,在不久的将来,HJT异质结电池或将迎来爆发,光伏电池或也将迎来从PERC到HJT______的历史性投资机遇期。 自3月18日以来,HJT龙头迈为股份已经大涨58.46%,捷佳纬创已经大涨22.53%。 01 什么是HJT电池? HJT,中文名称异质结电...
请选择______处所填的词
A. 严阵以待
B. 改朝换代
C. 不共戴天
D. 天南海北
E. 物归原主
F. 波谲云诡
G. 高视阔步
请从“A”,“B”,“C”,“D”,“E”,“F”,“G”中进行选择。答:
"
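where the text between “以下是参考内容:” ("The following is reference content:") and “结合上述参考内容,考虑接下来的问题:” ("Based on the reference content above, consider the following question:") is the content retrieved from the knowledge base.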
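The prompt layout described above can be sketched as follows. This is a hedged reconstruction from the sample prompt only, not the code in this PR: it assumes retrieved snippets are joined with `"; "`, followed by a comma and the closing marker, and then the question on a new line. The names `PREFIX`, `SUFFIX`, `build_prompt`, and `extract_reference` are hypothetical.

```python
from typing import List

# Delimiters copied from the sample prompt.
PREFIX = "以下是参考内容:"
SUFFIX = "结合上述参考内容,考虑接下来的问题:"


def build_prompt(snippets: List[str], question: str) -> str:
    """Wrap retrieved snippets in the two markers, then append the question."""
    reference = "; ".join(snippets)
    return f"{PREFIX}{reference},{SUFFIX}\n{question}"


def extract_reference(prompt: str) -> str:
    """Recover the retrieved text that sits between the two markers."""
    start = prompt.index(PREFIX) + len(PREFIX)
    end = prompt.index(SUFFIX, start)
    # Drop the comma that separates the reference text from the closing marker.
    return prompt[start:end].rstrip(",")


prompt = build_prompt(
    ["【物归原主】把东西归还原来的主人。", "【改朝换代】旧的朝代为新的朝代所代替。"],
    "请选择______处所填的词",
)
reference = extract_reference(prompt)
```

A helper like `extract_reference` can be handy when inspecting saved predictions, since it shows exactly which knowledge-base entries were injected for a given question.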
I'm looking forward to your feedback, and if there are any issues with the code, I'm committed to making further improvements. Thank you!