
Predicting Red Wine Quality with MLOPS: A complete system for wine quality prediction, data management, and data drift monitoring. This project uses Docker containers, Streamlit for the user interfaces, PostgreSQL for data storage, and Kafka for data communication, providing a robust pipeline for wine quality predictions.


Mousteph/MLOPS_Wine_Quality


MLOPS: Red Wine Quality Prediction


This MLOPS project is a complete system for predicting red wine quality and managing the model behind those predictions. The goal is not only accurate predictions but also a robust system that covers data management, API access, and data drift monitoring with the Eurybia library.

Project Overview

Architecture

Our project employs a modular architecture with Docker containers. Each component communicates through sub-networks.

  • Frontend User: We have a user-friendly front end created with Streamlit. This allows users to input wine information and receive quality predictions. The front end communicates with the prediction backend.

  • Prediction Backend: This server hosts the machine learning model used for wine quality prediction. After each prediction, the backend returns the results to the user and also sends a copy to the "Filter/Save Data" block via a Kafka queue.

  • Filter/Save Data: This block reads the Kafka queue, filters the data if necessary, and stores it in a PostgreSQL database.

  • PostgreSQL Database: All user data is stored in the PostgreSQL database.

  • Data Drift Monitoring: We have another Streamlit front end responsible for displaying and alerting if any data drift occurs. This front end communicates with a backend that checks data drift.

  • Data Drift Backend: This server is responsible for checking data drift. It periodically queries the PostgreSQL database and analyzes the distribution of new data sent by users. If data drift is detected, it sends an alert to the "Data Drift Monitoring" front end.
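
As an illustration of what the "Filter/Save Data" block might do with each message it reads from the Kafka queue, here is a minimal sketch. It assumes messages are JSON objects keyed by the feature names of the public red wine quality dataset, plus an optional `prediction` field. The message schema, the `filter_message` function, and the field names are assumptions for illustration, not the project's actual code. The Kafka consumer and the PostgreSQL insert are left out on purpose.

```python
import json
from typing import Optional

# Feature names from the public red wine quality dataset. The project's real
# message schema is not shown in this README, so treat these as assumptions.
EXPECTED_FEATURES = (
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
)


def filter_message(raw: bytes) -> Optional[dict]:
    """Return a cleaned record, or None if the message should be dropped."""
    try:
        payload = json.loads(raw)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return None  # not valid JSON
    if not isinstance(payload, dict):
        return None
    record = {}
    for name in EXPECTED_FEATURES:
        value = payload.get(name)
        # Reject missing or non-numeric features (bool is excluded on purpose).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return None
        record[name] = float(value)
    # Keep the prediction alongside the features if the backend sent one.
    if "prediction" in payload:
        record["prediction"] = payload["prediction"]
    return record
```

A record that passes the filter could then be written to the database as one row. A message that returns `None` would simply be skipped.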

Getting Started

  1. Start the project with the following command:

    docker-compose up -d
    
  2. Access the following URLs in your favorite browser:

  3. To stop the project, use the following command:

    docker-compose down
    
  4. To delete all data in the containers, use:

    docker-compose rm -svf
    

Testing Data Monitoring

To test the monitoring of data drift, you can use the following command:

python send_batch_data.py [-h] {corrupted,normal,noisy}

Command Usage:

Positional Arguments:

  • {corrupted,normal,noisy}: Specifies the type of data to process. You have three options:
    • corrupted: Data that does not respect the initial distribution.
    • normal: Data that respects the initial distribution.
    • noisy: Data that respects the initial distribution but with added noise.

Options:

  • -h, --help: Shows the help message and exits.

Use this command to simulate different data scenarios and test the data monitoring capabilities of the system.
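
For reference, a command-line interface with this usage line can be built with Python's standard `argparse` module. The sketch below is an assumption about how such a parser could look, not the contents of `send_batch_data.py`. The `build_parser` function and the `data_type` attribute name are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose usage matches the command shown above."""
    parser = argparse.ArgumentParser(
        prog="send_batch_data.py",
        description="Send a batch of data to test data drift monitoring.",
    )
    # A single positional argument restricted to the three data scenarios.
    parser.add_argument(
        "data_type",
        choices=["corrupted", "normal", "noisy"],
        help="type of data to send",
    )
    return parser
```

Calling `build_parser().parse_args(["noisy"])` yields a namespace whose `data_type` is `"noisy"`. Any value outside the three choices makes `argparse` print an error and exit.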

Example Usage:

To send a batch of normal data for monitoring, use the following command:

python send_batch_data.py normal

For corrupted or noisy data, replace normal with corrupted or noisy accordingly.
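
To make the three scenarios concrete, the sketch below generates synthetic values for a single feature and compares each batch with a baseline using a two-sample Kolmogorov-Smirnov statistic. This is a stand-in chosen because it needs only the standard library. It is not the Eurybia API and not necessarily how the project measures drift. The distributions, sample sizes, and the `ks_statistic` and `make_batches` helpers are assumptions for illustration.

```python
import bisect
import random


def ks_statistic(baseline, current):
    """Largest gap between the empirical CDFs of two samples (0 = identical)."""
    a, b = sorted(baseline), sorted(current)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def make_batches(seed=0, size=500):
    """Synthetic baseline plus one batch per scenario (hypothetical values)."""
    rng = random.Random(seed)
    baseline = [rng.gauss(10.4, 1.0) for _ in range(size)]
    normal = [rng.gauss(10.4, 1.0) for _ in range(size)]
    # Noisy: same distribution as the baseline, with small added noise.
    noisy = [rng.gauss(10.4, 1.0) + rng.gauss(0.0, 0.1) for _ in range(size)]
    # Corrupted: drawn from a clearly shifted distribution.
    corrupted = [rng.gauss(14.0, 1.0) for _ in range(size)]
    return baseline, {"normal": normal, "noisy": noisy, "corrupted": corrupted}


baseline, batches = make_batches()
scores = {name: ks_statistic(baseline, batch) for name, batch in batches.items()}
```

In this sketch the `corrupted` batch scores far higher than the `normal` and `noisy` batches, which is the kind of separation a drift monitor relies on to raise an alert.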

Retraining the Model

To retrain the model, you can use the following command:

python train_model.py

Resources

AUTHORS: Moustapha DIOP, Mathieu RIVIER
