BBC Text Classification

Tasks performed:

  1. Imports
  2. Reading Data
    • Reading Folders
    • Reading Files
  3. Preprocessing
    • Remove duplicate data
    • Add type_id
  4. Data Cleaning
    • Stopword removal
    • Lemmatization
    • Lower Case
    • Removing all numeric values
  5. Feature Engineering
    • Adding Features
    • Selecting Features
    • Adding Custom Transformer
    • Pipelines
    • Training model
    • Prediction
    • Performance
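
As a rough illustration of the reading, preprocessing, and cleaning steps listed above, here is a minimal sketch using only the Python standard library. It is not the code in bbc_classification.py. The folder layout (one sub-folder per category, each holding .txt files), the function names `read_dataset`, `preprocess`, and `clean_text`, and the tiny stopword list are all assumptions made for the example. Lemmatization is left out because it needs a third-party library.

```python
import re
from pathlib import Path

# Tiny illustrative stopword list (an assumption; a real run would use a fuller list).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on", "for"}


def read_dataset(root):
    """Read every .txt file under root/<category>/ into a list of row dicts.

    Assumes one sub-folder per category; the folder name becomes the label.
    """
    rows = []
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for file in sorted(folder.glob("*.txt")):
            # errors="replace" keeps reading even if a file is not valid UTF-8.
            text = file.read_text(encoding="utf-8", errors="replace")
            rows.append({"type": folder.name, "text": text})
    return rows


def preprocess(rows):
    """Drop rows whose text is a duplicate and add an integer type_id per category."""
    seen, unique = set(), []
    for row in rows:
        if row["text"] not in seen:
            seen.add(row["text"])
            unique.append(row)
    # Map each category name to a stable integer id (alphabetical order).
    type_ids = {name: i for i, name in enumerate(sorted({r["type"] for r in unique}))}
    return [{**r, "type_id": type_ids[r["type"]]} for r in unique]


def clean_text(text):
    """Lowercase, remove numeric values, and drop stopwords.

    Only alphabetic tokens are kept, so punctuation is removed as well.
    Lemmatization is intentionally omitted from this sketch.
    """
    text = text.lower()
    text = re.sub(r"\d+", " ", text)      # remove all numeric values
    tokens = re.findall(r"[a-z]+", text)  # keep alphabetic tokens only
    return " ".join(t for t in tokens if t not in STOPWORDS)
```

With these definitions, `clean_text("The team won 3 matches in 2005")` returns `team won matches`, and `preprocess(read_dataset("path/to/data"))` yields de-duplicated rows that each carry `type`, `text`, and `type_id` keys (the path is a placeholder).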

Steps to run the Code:

  1. Open a terminal

  2. Clone the repo

    git clone https://github.com/kundanmail55/bbc-classification

  3. Navigate to the repo

    cd bbc-classification

  4. Install all the packages

    pip install -r requirements.txt

  5. Run the script

    python3 bbc_classification.py