Development of a model that detects whether Korean text was generated by an LLM or written by a human.
- Contribution to Korean LLM-detection research
- Contribution to Korean dataset creation
- Models Used:
- Mistral 7B-ko with Lora Fine Tuning
- Ghost Buster (ensemble of three classifiers; values in parentheses are the ensemble weights):
- LightGBM (0.45)
- Multinomial Naive Bayes (0.45)
- SGD Classifier (0.1)
- Weak Labeling Approach:
- Texts scoring above 0.8 are weakly labeled human-generated; texts scoring below 0.2 are weakly labeled AI-generated
- Models Utilized:
- GPT 3.5, GPT 4, GPT 4o
- Google Gemini 1.0 Pro
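The weighted ensemble and the weak-labeling thresholds above can be sketched together. This is a minimal illustration, not the project's actual implementation; the function names and the interpretation of the parenthesized values as soft-vote weights are assumptions.

```python
# Hypothetical sketch: combine per-model scores with the stated ensemble
# weights (LightGBM 0.45, Multinomial NB 0.45, SGD 0.1), then apply the
# weak-labeling thresholds (>0.8 -> human, <0.2 -> AI, else unlabeled).
import numpy as np

WEIGHTS = {"lgbm": 0.45, "mnb": 0.45, "sgd": 0.10}

def ensemble_scores(probs: dict) -> np.ndarray:
    """Weighted soft vote over each model's P(human) scores."""
    return sum(WEIGHTS[name] * np.asarray(p) for name, p in probs.items())

def weak_label(score: float) -> str:
    """Assign a weak label only when the ensemble score is confident."""
    if score > 0.8:
        return "human"
    if score < 0.2:
        return "ai"
    return "unlabeled"  # excluded from the weakly labeled training set
```

Texts falling between the two thresholds are simply dropped, which trades dataset size for label quality.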
The project uses the Google Translate API to build a bilingual version of the DAIGT dataset:
- Source: English dataset
- Target: translated Korean dataset
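The translation step can be sketched as below. This is a hypothetical helper, not the project's code: the `translate` callable stands in for the Google Translate API (e.g. the `google-cloud-translate` client), and the row/field names are assumptions.

```python
# Hypothetical sketch: map each English DAIGT row through a translation
# function to produce parallel (English, Korean) pairs with the same label.
from typing import Callable, Iterable

def build_bilingual_dataset(rows: Iterable[dict],
                            translate: Callable[[str], str]) -> list:
    """Return dicts holding the English text, its Korean translation,
    and the original human/AI label."""
    return [
        {"text_en": r["text"], "text_ko": translate(r["text"]), "label": r["label"]}
        for r in rows
    ]
```

Keeping the English original alongside the translation makes it easy to audit translation quality later.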
We also built our own dataset by collecting essay answers from Korean university entrance exams, specifically in the humanities and social sciences:
- Human-generated: model answers to humanities essay questions (인문논술 모범답안)
- LLM-generated: GPT-generated answers
Performance metrics for the models are as follows:
- Supervised learning: best ROC-AUC of 0.98 and F1 score of 0.92, achieved by both Ghost Buster and Mistral 7B.
- Unsupervised learning: best ROC-AUC of 0.96 and F1 score of 0.91 with Mistral 7B.
- Zero-shot learning: F1 scores stayed stable across GPT versions (0.78–0.85), although ROC-AUC remained near chance.
Our results demonstrate the feasibility of using advanced machine learning techniques for distinguishing between human and AI-generated texts in Korean. Ongoing improvements and expansions of the dataset will further enhance the model's accuracy and reliability.
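The reported numbers can be reproduced from model outputs with standard scikit-learn metrics. A minimal sketch, assuming `y_score` holds each document's predicted probability of being AI-generated and a 0.5 decision threshold for F1:

```python
# Minimal sketch: compute the two reported metrics (ROC-AUC and F1)
# from ground-truth labels and per-document scores using scikit-learn.
from sklearn.metrics import f1_score, roc_auc_score

def evaluate(y_true, y_score, threshold: float = 0.5) -> dict:
    """ROC-AUC uses the raw scores; F1 uses thresholded hard predictions."""
    y_pred = [int(s >= threshold) for s in y_score]
    return {
        "roc_auc": roc_auc_score(y_true, y_score),
        "f1": f1_score(y_true, y_pred),
    }
```

Note that ROC-AUC is threshold-free while F1 depends on the chosen cutoff, which is why the two metrics can diverge in the table below.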
| Category | Metric | Test | Ghost Buster | Mistral 7B | Gemini | GPT 3.5 | GPT 4 | GPT 4o |
|---|---|---|---|---|---|---|---|---|
| Supervised | ROC-AUC | 0.91 | 0.98 | 0.98 | - | - | - | - |
| | F1 | 0.89 | 0.92 | 0.92 | - | - | - | - |
| Unsupervised | ROC-AUC | 0.95 | 0.96 | 0.96 | 0.38 | 0.54 | 0.45 | 0.40 |
| | F1 | 0.90 | 0.91 | 0.91 | 0.80 | 0.82 | 0.85 | 0.85 |
| Zero-shot | ROC-AUC | - | - | - | 0.46 | 0.53 | 0.58 | 0.45 |
| | F1 | - | - | - | 0.78 | 0.85 | 0.85 | 0.85 |