This repository contains my solution for the Kaggle competition on multi-class prediction of cirrhosis outcomes. The task is to predict each patient's status after a given number of days, where a patient may be alive, alive following a liver transplant, or deceased. The dataset includes demographic and clinical features such as age and sex, and the objective is to predict the probability of each outcome while minimizing multi-class log loss.
Data Cleanup:
- Performed data cleaning to handle missing values and ensure data quality.
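The exact preprocessing lives in the notebook; as a rough illustration, missing values can be imputed and categorical columns encoded along these lines. The file and column names below (train.csv, Status, id) follow the Kaggle competition data and are assumptions here, not a copy of the notebook's code.
```python
# Rough sketch of the cleanup step (assumes the competition's train.csv
# with a "Status" target column; the notebook is the source of truth).
import pandas as pd

df = pd.read_csv("train.csv")

# Impute numeric columns with the median, categorical columns with the mode.
for col in df.columns:
    if df[col].dtype == "object":
        df[col] = df[col].fillna(df[col].mode()[0])
    else:
        df[col] = df[col].fillna(df[col].median())

# Separate the target and one-hot encode the remaining categorical features.
y = df.pop("Status")
X = pd.get_dummies(df.drop(columns=["id"], errors="ignore"), drop_first=True)
```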
Dimensionality Reduction using Principal Component Analysis (PCA):
- Utilized PCA to reduce the dimensionality of the dataset while retaining essential information.
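A minimal sketch of this step with scikit-learn, using the feature matrix X from the cleanup sketch above. The 95% explained-variance threshold is an illustrative choice and not necessarily the one used in the notebook.
```python
# Standardize features first (PCA is scale-sensitive), then keep enough
# principal components to explain ~95% of the variance (illustrative value).
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X_scaled)
print("Components kept:", pca.n_components_)
```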
Classification using XGBoost:
- Employed XGBoost, a powerful gradient boosting algorithm, for multi-class classification and probability prediction.
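In outline, the classifier can be set up as below. The hyperparameter values shown are placeholders rather than the tuned ones from the notebook, and X_pca and y come from the earlier sketches.
```python
# Multi-class XGBoost with probability outputs (multi:softprob), evaluated
# on a held-out split; parameter values here are placeholders.
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from xgboost import XGBClassifier

y_enc = LabelEncoder().fit_transform(y)  # e.g. C / CL / D -> 0 / 1 / 2

X_train, X_val, y_train, y_val = train_test_split(
    X_pca, y_enc, test_size=0.2, random_state=42, stratify=y_enc
)

model = XGBClassifier(
    objective="multi:softprob",   # predict a probability per class
    eval_metric="mlogloss",
    n_estimators=500,
    learning_rate=0.05,
    max_depth=4,
)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)
val_probs = model.predict_proba(X_val)  # shape: (n_samples, 3)
```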
Hyperparameter Tuning:
- Fine-tuned hyperparameters to optimize the performance of the XGBoost model.
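One way to run such a search is shown below with scikit-learn's RandomizedSearchCV; the parameter grid and search strategy are assumptions for illustration and may differ from what the notebook actually does. X_pca and y_enc are the arrays from the sketches above.
```python
# Randomized search over a small XGBoost grid, scored by negative log loss.
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

param_dist = {
    "max_depth": [3, 4, 5, 6],
    "learning_rate": [0.01, 0.05, 0.1],
    "n_estimators": [300, 500, 800],
    "subsample": [0.7, 0.85, 1.0],
    "colsample_bytree": [0.7, 0.85, 1.0],
}

search = RandomizedSearchCV(
    XGBClassifier(objective="multi:softprob", eval_metric="mlogloss"),
    param_distributions=param_dist,
    n_iter=20,
    scoring="neg_log_loss",
    cv=5,
    random_state=42,
)
search.fit(X_pca, y_enc)
print("Best params:", search.best_params_)
print("CV log loss:", -search.best_score_)
```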
Performance:
- Achieved a final log loss of 0.47 on the training dataset.
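The reported figure comes from the notebook's own evaluation; for reference, the training log loss can be computed as follows with the fitted model and split from the XGBoost sketch above.
```python
# Multi-class log loss on the training split (the competition metric).
from sklearn.metrics import log_loss

train_log_loss = log_loss(y_train, model.predict_proba(X_train))
print(f"Training log loss: {train_log_loss:.3f}")
```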
To reproduce the results or experiment with the code, follow these steps:
Clone the Repository:
git clone https://github.com/shubhamgupta1017/Multi-Class-Prediction-of-Cirrhosis-Outcomes.git
cd Multi-Class-Prediction-of-Cirrhosis-Outcomes
Install Dependencies:
pip install -r requirements.txt
Run the Jupyter Notebook:
- Open and run the provided Jupyter Notebook to execute the entire workflow.
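Assuming Jupyter is installed (for example via requirements.txt), it can be launched from the repository root with:
jupyter notebook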
Experiment and Fine-Tune:
- Feel free to experiment with different parameters or modify the code to further fine-tune the model for your specific requirements.
The final model achieves a log loss of 0.47 on the training dataset when predicting cirrhosis outcome probabilities.
Special thanks to Kaggle for hosting the competition and providing the dataset.
Shubham Gupta