ZhihuRec Data-mining

A Flask app for analyzing the ZhihuRec dataset.

Requirements

 pip install -r requirements.txt

1. First, run this command to generate the answers' CSV files:

 python tools/io.py

Or download them from here:

Baidu NetDisk
Link: https://pan.baidu.com/s/1Ey-R9yo6_HNuoZuhEJivjg
Code: 8rc7

Unzip the archive and put the answer_csv folder into source/
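
Once the folder is in place, a small helper can confirm that the CSV files are readable and locate the file that should hold a given answer. This is only a sketch: it relies on the naming rule described in the layout notes of this README (each file is named after the smallest answer index it contains) and assumes the file stems are plain integers. The function names `list_csv_files` and `find_csv_for_answer` are hypothetical and are not part of tools/io.py.

```python
from bisect import bisect_right
from pathlib import Path


def list_csv_files(folder):
    """Return (start_index, path) pairs for files named `<start>.csv`, sorted by start index."""
    pairs = []
    for path in Path(folder).glob("*.csv"):
        # Assumption: the file stem is a plain integer; anything else is skipped.
        if path.stem.isdigit():
            pairs.append((int(path.stem), path))
    return sorted(pairs)


def find_csv_for_answer(folder, answer_index):
    """Return the path of the CSV file whose index range should contain `answer_index`."""
    pairs = list_csv_files(folder)
    starts = [start for start, _ in pairs]
    # The wanted file is the one with the largest start index <= answer_index.
    pos = bisect_right(starts, answer_index) - 1
    if pos < 0:
        raise ValueError(f"no CSV file in {folder} covers answer index {answer_index}")
    return pairs[pos][1]
```

For example, `find_csv_for_answer("source/answer_csv", 12345)` returns the path of the file whose start index is the largest one not exceeding 12345, or raises `ValueError` if the folder holds no such file.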

2. Then use this command to run the Flask app:

 python app.py

The Flask app will run on 127.0.0.1:5000

[model] The TF-IDF model is saved here.
[source] Processed files.
- [answer_csv] Answers' CSV files. All files are sorted.
  - [xxxx.csv] The xxxx is the smallest (starting) answer index in this file.
[tools] Tools that help you analyze the dataset.
- [io.py] Used to read, write, and convert the dataset.
- [tfidf.py] TF-IDF algorithm. Its main functions are
  - train()
  - load_tfidf()
  - save_tfidf()
  - compare_similarity()
[zhihuRec] The dataset. Put the txt files here.
[app.py] The entry point of the Flask app.
[preprocess.py] Uses the code in tools to create the TF-IDF matrix and saves the result into model.
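
To illustrate the kind of computation that train() and compare_similarity() refer to, here is a minimal pure-Python TF-IDF sketch. It is not the code in tools/tfidf.py: the signatures below are assumptions, the helper `vectorize` is hypothetical, the IDF uses a common smoothed variant, and each document is assumed to be already split into a list of tokens.

```python
import math
from collections import Counter


def vectorize(tokens, idf):
    """Turn one tokenized document into a sparse {term: tf * idf} vector."""
    counts = Counter(tokens)
    total = sum(counts.values())
    # Terms never seen during training have no IDF weight and are ignored.
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def train(documents):
    """Compute IDF weights and TF-IDF vectors for a list of tokenized documents."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    # Smoothed IDF: a term present in every document still gets a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = [vectorize(tokens, idf) for tokens in documents]
    return idf, vectors


def compare_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse TF-IDF vectors."""
    dot = sum(w * vec_b.get(t, 0.0) for t, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

With this sketch, two answers that share no tokens score 0.0, and two answers with identical token lists score 1.0 up to floating-point rounding. Persisting the result, which is the role of save_tfidf() and load_tfidf() in the list above, is left out here.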
