Development of a model that detects whether Korean text was generated by an LLM or written by a human.
- Contribution to Korean LLM-detection research
- Contribution to Korean dataset creation
- Models Used:
- Mistral 7B-ko with Lora Fine Tuning
- Ghost Buster (ensemble of three classifiers; values in parentheses are the ensemble weights):
- LightGBM (0.45)
- Multinomial Naive Bayes (0.45)
- SGD Classifier (0.1)
- Weak Labeling Approach:
- Texts scoring above 0.8 are weakly labeled human-generated; texts scoring below 0.2 are weakly labeled AI-generated
- Models Utilized:
- GPT 3.5, GPT 4, GPT 4o
- Google Gemini 1.0 Pro
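The weighted ensemble and the weak-labeling thresholds above can be sketched together. This is a minimal illustration, not the project's actual implementation; the function names and the interpretation of the parenthesized values as soft-vote weights are assumptions.

```python
# Hypothetical sketch: combine per-model scores with the stated ensemble
# weights (LightGBM 0.45, Multinomial NB 0.45, SGD 0.1), then apply the
# weak-labeling thresholds (>0.8 -> human, <0.2 -> AI, else unlabeled).
import numpy as np

WEIGHTS = {"lgbm": 0.45, "mnb": 0.45, "sgd": 0.10}

def ensemble_scores(probs: dict) -> np.ndarray:
    """Weighted soft vote over each model's P(human) scores."""
    return sum(WEIGHTS[name] * np.asarray(p) for name, p in probs.items())

def weak_label(score: float) -> str:
    """Assign a weak label only when the ensemble score is confident."""
    if score > 0.8:
        return "human"
    if score < 0.2:
        return "ai"
    return "unlabeled"  # excluded from the weakly labeled training set
```

Texts falling between the two thresholds are simply dropped, which trades dataset size for label quality.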
The project uses the Google Translate API to build a bilingual version of the DAIGT dataset:
- Source: English dataset
- Target: translated Korean dataset
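The translation step can be sketched as below. This is a hypothetical helper, not the project's code: the `translate` callable stands in for the Google Translate API (e.g. the `google-cloud-translate` client), and the row/field names are assumptions.

```python
# Hypothetical sketch: map each English DAIGT row through a translation
# function to produce parallel (English, Korean) pairs with the same label.
from typing import Callable, Iterable

def build_bilingual_dataset(rows: Iterable[dict],
                            translate: Callable[[str], str]) -> list:
    """Return dicts holding the English text, its Korean translation,
    and the original human/AI label."""
    return [
        {"text_en": r["text"], "text_ko": translate(r["text"]), "label": r["label"]}
        for r in rows
    ]
```

Keeping the English original alongside the translation makes it easy to audit translation quality later.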
We also built our own dataset by collecting essay answers from Korean university entrance exams, specifically in the humanities and social sciences:
- Human-generated: model answers to humanities essay questions (인문논술 모범답안)
- LLM-generated: GPT-generated answers
Performance metrics for the models are as follows:
- Supervised learning: best ROC-AUC of 0.98 and F1 score of 0.92, achieved by both Ghost Buster and Mistral 7B.
- Unsupervised learning: best ROC-AUC of 0.96 and F1 score of 0.91 with Mistral 7B.
- Zero-shot learning: F1 scores stayed stable across GPT versions (0.78–0.85), although ROC-AUC remained near chance.
Our results demonstrate the feasibility of using advanced machine learning techniques for distinguishing between human and AI-generated texts in Korean. Ongoing improvements and expansions of the dataset will further enhance the model's accuracy and reliability.
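The reported numbers can be reproduced from model outputs with standard scikit-learn metrics. A minimal sketch, assuming `y_score` holds each document's predicted probability of being AI-generated and a 0.5 decision threshold for F1:

```python
# Minimal sketch: compute the two reported metrics (ROC-AUC and F1)
# from ground-truth labels and per-document scores using scikit-learn.
from sklearn.metrics import f1_score, roc_auc_score

def evaluate(y_true, y_score, threshold: float = 0.5) -> dict:
    """ROC-AUC uses the raw scores; F1 uses thresholded hard predictions."""
    y_pred = [int(s >= threshold) for s in y_score]
    return {
        "roc_auc": roc_auc_score(y_true, y_score),
        "f1": f1_score(y_true, y_pred),
    }
```

Note that ROC-AUC is threshold-free while F1 depends on the chosen cutoff, which is why the two metrics can diverge in the table below.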
| Category | Metric | Test | Ghost Buster | Mistral 7B | Gemini | GPT 3.5 | GPT 4 | GPT 4o |
|---|---|---|---|---|---|---|---|---|
| Supervised | ROC-AUC | 0.91 | 0.98 | 0.98 | - | - | - | - |
| | F1 | 0.89 | 0.92 | 0.92 | - | - | - | - |
| Unsupervised | ROC-AUC | 0.95 | 0.96 | 0.96 | 0.38 | 0.54 | 0.45 | 0.40 |
| | F1 | 0.90 | 0.91 | 0.91 | 0.80 | 0.82 | 0.85 | 0.85 |
| Zero-shot | ROC-AUC | - | - | - | 0.46 | 0.53 | 0.58 | 0.45 |
| | F1 | - | - | - | 0.78 | 0.85 | 0.85 | 0.85 |