Predicts possible rainfall for today, tomorrow, and the day after tomorrow at the location requested by the client
Wooyong Jeong (Woo) | AI Developer | GitHub | [email protected]
- Development
  - Model Implementation
    - Google Colab
    - Google Drive
    - Scikit Learn
    - Pandas
    - Numpy
  - Model Server Implementation
    - FastAPI
    - Joblib
    - Pydantic
- Deployment
  - EC2
  - Nginx: Proxy and HTTPS (Let's Encrypt and Certbot)
  - Docker
.
├── .venv
├── app/
│ ├── core/
│ │ └── config.py
│ ├── crud/
│ │ ├── classification_crud.py
│ │ └── prediction_crud.py
│ ├── project_enum/
│ │ └── obs_enum.py
│ ├── router/
│ │ └── prediction_router.py
│ ├── schema/
│ │ ├── prediction_request_schema.py
│ │ └── prediction_response_schema.py
│ ├── trained_models/
│ │ └── ..._esm_model.pkl
│ └── main.py
├── static
├── .env
├── .gitignore
├── build_and_push.sh
├── Dockerfile
└── requirements.txt
- Hypothesis
  - The weather of the previous three days may share characteristics with today's weather.
  - Thus, features from the previous three days that correlate strongly with rainfall can be used to classify whether it will rain in the coming days.
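The hypothesis above amounts to building lagged features. The following is a minimal sketch of that idea with pandas, not the notebook's actual code: the helper name `add_lag_features`, the column names, and the toy values are all assumptions made for illustration.

```python
import pandas as pd


def add_lag_features(df: pd.DataFrame, columns: list[str], n_days: int = 3) -> pd.DataFrame:
    """Append <column>_lag<k> columns holding each feature's value k rows earlier.

    Assumes one row per day with no gaps, so "k rows earlier" means "k days earlier".
    """
    out = df.sort_index().copy()
    for col in columns:
        for k in range(1, n_days + 1):
            out[f"{col}_lag{k}"] = out[col].shift(k)
    # The first n_days rows have no complete history, so drop them.
    return out.dropna()


# Toy data: column names and values are invented for illustration.
weather = pd.DataFrame(
    {"humidity": [60, 72, 85, 90, 65], "rained": [0, 0, 1, 1, 0]},
    index=pd.date_range("2024-07-21", periods=5, freq="D"),
)
lagged = add_lag_features(weather, ["humidity", "rained"])
```

Each remaining row of `lagged` carries its own day's values plus the values from the three days before it, which is the shape a classifier would need under this hypothesis.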
- Training Process
  - Data acquisition and preprocessing: data_preprocessing_assign.ipynb
    - Request weather data from the JejuDataHub API
    - Remove datasets that meet either of these conditions
      - The dataset has not been updated for a month
      - The dataset has fewer than 8 years of recorded data
  - ML model training: ensemble_model_jeju.ipynb
    - Use `corr()` to find features in the dataset that correlate with rainfall
    - Train an ensemble model with voting
      - SVM
      - KNN
      - Logistic Regression
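The training steps above (correlation-based feature selection, then a voting ensemble of SVM, KNN, and Logistic Regression) can be sketched with scikit-learn roughly as follows. This is an illustration under stated assumptions rather than the contents of ensemble_model_jeju.ipynb: the synthetic data, the column names, the `THRESHOLD` cut-off, the scaling step, and the choice of hard voting are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the preprocessed weather table (all values invented).
rng = np.random.default_rng(0)
n = 200
humidity = rng.uniform(30, 100, n)
df = pd.DataFrame(
    {
        "humidity": humidity,
        "pressure": rng.uniform(990, 1030, n),
        "noise": rng.normal(size=n),
        "rained": (humidity + rng.normal(0, 5, n) > 70).astype(int),
    }
)

# Keep features whose absolute Pearson correlation with the label passes a threshold.
THRESHOLD = 0.3  # hypothetical cut-off
corr_with_label = df.corr()["rained"].drop("rained").abs()
selected = corr_with_label[corr_with_label >= THRESHOLD].index.tolist()

# Hard (majority) voting over the three classifiers; each is scaled first
# because SVM and KNN are sensitive to feature magnitudes.
ensemble = VotingClassifier(
    estimators=[
        ("svm", make_pipeline(StandardScaler(), SVC())),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
        ("logreg", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ],
    voting="hard",
)
ensemble.fit(df[selected], df["rained"])

# Predicting on the training rows only demonstrates the call; a real
# evaluation would use a held-out split.
predictions = ensemble.predict(df[selected])
```

A fitted estimator like `ensemble` can be persisted with `joblib.dump` and restored with `joblib.load`, which fits the `.pkl` files under `trained_models/`.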
- Returns four days of rain classification for the requested location
- Request Structure (Request DTO)

```python
# schema/prediction_request_schema.py
from pydantic import BaseModel


class PredictionRequestDto(BaseModel):
    obs_code: int
```
- Response Structure (Response DTO)

```python
# schema/prediction_response_schema.py
from typing import List

from pydantic import BaseModel


class PredictionResponseDto(BaseModel):
    obs_name: str
    predicted_date: str
    raining_status: bool


class PredictionsResponseDto(BaseModel):
    data: List[PredictionResponseDto]
```
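- Method

```python
# router/prediction_router.py
from fastapi import APIRouter

# Import paths assume the app/ package layout shown in the project tree.
from app.crud.classification_crud import get_classification
from app.schema.prediction_request_schema import PredictionRequestDto
from app.schema.prediction_response_schema import PredictionsResponseDto

router = APIRouter()


@router.post("/predict/classify", response_model=PredictionsResponseDto)
def read_classifications(request: PredictionRequestDto):
    return get_classification(request=request)


# crud/classification_crud.py
def get_classification(request):
    # find obs
    ...
    # update with latest data
    ...
    # get model
    ...
    # generate prediction (predictions_list is built in the elided steps above)
    return {"data": predictions_list}
```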
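As a rough illustration of the final "generate prediction" step, the sketch below turns a list of per-day rain flags into the `{"data": [...]}` shape the response DTO expects. The helper `build_predictions` and its parameters are hypothetical and do not come from the repository; in the real service the flags would come from the loaded model.

```python
from datetime import date, timedelta


def build_predictions(obs_name: str, start: date, rain_flags: list[bool]) -> dict:
    """Pair each per-day flag with consecutive dates formatted as YYYYMMDD."""
    predictions_list = [
        {
            "obs_name": obs_name,
            "predicted_date": (start + timedelta(days=offset)).strftime("%Y%m%d"),
            "raining_status": bool(flag),
        }
        for offset, flag in enumerate(rain_flags)
    ]
    return {"data": predictions_list}


# Example call with invented flags for four consecutive days.
payload = build_predictions("마라도", date(2024, 7, 25), [True, True, True, True])
```

Casting each flag with `bool()` keeps NumPy boolean values, which model predictions often are, from leaking into the serialized response.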
- Response

```json
{
  "data": [
    { "obs_name": "마라도", "predicted_date": "20240725", "raining_status": true, "raining_amount": 0 },
    { "obs_name": "마라도", "predicted_date": "20240726", "raining_status": true, "raining_amount": 0 },
    { "obs_name": "마라도", "predicted_date": "20240727", "raining_status": true, "raining_amount": 0 },
    { "obs_name": "마라도", "predicted_date": "20240728", "raining_status": true, "raining_amount": 0 }
  ]
}
```
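On the client side, a response like this could be consumed as in the sketch below, which parses the body and collects the dates flagged as rainy. The function name `rainy_dates` is an assumption, and `raw` is just the first two entries of the sample response, shortened for readability.

```python
import json
from datetime import datetime

# First two entries of the sample response, inlined as a JSON string.
raw = (
    '{"data": ['
    '{"obs_name": "마라도", "predicted_date": "20240725", "raining_status": true, "raining_amount": 0},'
    '{"obs_name": "마라도", "predicted_date": "20240726", "raining_status": true, "raining_amount": 0}'
    "]}"
)


def rainy_dates(body: str) -> list:
    """Return the predicted dates (as date objects) whose raining_status is true."""
    payload = json.loads(body)
    return [
        datetime.strptime(item["predicted_date"], "%Y%m%d").date()
        for item in payload["data"]
        if item["raining_status"]
    ]


dates = rainy_dates(raw)
```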