Tasks performed:
- Imports
- Reading Data
  - Reading Folders
  - Reading Files
- Preprocessing
  - Remove duplicate data
  - Add type_id
- Data Cleaning
  - Stopword removal
  - Lemmatization
  - Lower-casing
  - Removing all numeric values
- Feature Engineering
  - Adding Features
  - Selecting Features
  - Adding Custom Transformer
- Pipelines
- Training model
- Prediction
- Performance
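
To make the Data Cleaning tasks listed above more concrete, here is a minimal sketch of what such a step might look like. It is not taken from `bbc_classification.py`: the function name `clean_text`, the tiny `STOPWORDS` set, and the `TOY_LEMMAS` lookup table are all hypothetical stand-ins, and the order of operations is an assumption. A real run would more likely rely on a full stopword list and a proper lemmatizer from an NLP library.

```python
import re

# Hypothetical, deliberately tiny stopword set (assumption: the project
# probably uses a much larger list from an NLP library).
STOPWORDS = {"the", "a", "an", "is", "are", "in", "of", "and", "to"}

# Hypothetical lookup table standing in for a real lemmatizer.
TOY_LEMMAS = {"shares": "share", "rose": "rise", "markets": "market"}


def clean_text(text, stopwords=STOPWORDS, lemmas=TOY_LEMMAS):
    """Lower-case, drop numeric values, remove stopwords, then lemmatize."""
    # Lower case
    text = text.lower()
    # Remove all numeric values
    text = re.sub(r"\d+", " ", text)
    # Keep alphabetic tokens only (this also drops punctuation)
    tokens = re.findall(r"[a-z]+", text)
    # Stopword removal
    tokens = [t for t in tokens if t not in stopwords]
    # Lemmatization via the toy lookup; unknown words pass through unchanged
    tokens = [lemmas.get(t, t) for t in tokens]
    return " ".join(tokens)


if __name__ == "__main__":
    print(clean_text("The shares rose 12% in 2004 markets"))
```

Passing the stopword set and lemma table as parameters keeps the function easy to test and lets a library-backed list or lemmatizer be swapped in without changing the cleaning logic.
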
Steps to run the code:
- Open a terminal
- Clone the repo
git clone https://github.com/kundanmail55/bbc-classification
- Navigate to the repo
cd bbc-classification
- Install all the packages
pip install -r requirements.txt
- Run the script
python3 bbc_classification.py