Kaggle Competition (Predict Fire Peril Loss Cost).
Hosted by: Liberty Mutual Group.
Link: https://www.kaggle.com/c/liberty-mutual-fire-peril
###1. Observation
Training Data: ~ 450,000 rows; ~360 features.
Non-zero targets: ~1,200.
Most target values in the original data are 0; only about 0.27% (roughly 1,200 of 450,000) are non-zero.
###2. Solution:
- Treat the non-zero rows as anomalies and apply anomaly detection methods.
- Use logistic regression to predict whether a row's target is non-zero, then use linear regression to predict the value for rows flagged as non-zero.
- Cosine similarity: for each test row, find the most similar entry in the training set.
- Neural network (not implemented).
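
The cosine-similarity idea from the list above can be sketched as follows. This is a minimal illustration, not the code used for the competition: the function names, the toy feature rows, and the choice to reuse the nearest training row's target as the prediction are all assumptions made for the example. It uses plain Python so it runs without extra libraries; on the full data set a vectorized implementation would be needed.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Convention for this sketch: an all-zero vector is similar to nothing.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_by_nearest(train_X, train_y, test_row):
    """Return the target of the training row most similar to test_row.

    Assumption: the prediction is simply the nearest row's target.
    """
    best_index = max(
        range(len(train_X)),
        key=lambda i: cosine_similarity(train_X[i], test_row),
    )
    return train_y[best_index]


# Hypothetical toy data: three training rows, mostly zero targets.
train_X = [[1.0, 0.0, 2.0], [0.0, 3.0, 1.0], [4.0, 4.0, 0.0]]
train_y = [0.0, 0.0, 1.5]
test_X = [[2.0, 2.1, 0.1], [0.5, 0.0, 1.0]]

predictions = [predict_by_nearest(train_X, train_y, row) for row in test_X]
print(predictions)
```

Because cosine similarity ignores vector length, features on very different scales would likely need to be normalized first, and categorical features would need a numeric encoding before this comparison makes sense.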