This project implements a Recurrent Neural Network (RNN) that classifies websites into predefined categories. The workflow fetches the title of a webpage from its URL, preprocesses the title text, and predicts the category with a pre-trained model.
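The title-fetching step of this workflow can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names `extract_title` and `fetch_title`, the `User-Agent` header, and the use of the standard-library `urllib` for the download are all assumptions. Only the use of BeautifulSoup comes from this README.

```python
from typing import Optional
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup


def extract_title(html: str) -> Optional[str]:
    """Return the whitespace-normalized <title> text, or None if there is none.

    Hypothetical helper; the project's real preprocessing may differ.
    """
    soup = BeautifulSoup(html, "html.parser")
    if soup.title is None:
        return None
    # Collapse newlines and repeated spaces into single spaces.
    title = " ".join(soup.title.get_text().split())
    return title or None


def fetch_title(url: str, timeout: float = 10.0) -> Optional[str]:
    """Download a page and return its title (hypothetical helper)."""
    # Some sites reject requests without a browser-like User-Agent (assumed value).
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_title(html)
```

Splitting parsing from downloading keeps `extract_title` testable on plain HTML strings without network access. The returned title would then be tokenized and passed to the model.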
- Uses an RNN with pre-trained word embeddings for text categorization.
- Automatically fetches and parses webpage titles with the BeautifulSoup library.
- Balances the training data across the top categories to improve model generalization.
- Uses GloVe embeddings as the pre-trained word representation.
- Fine-tuned on the provided dataset for better performance.
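One possible way to balance training data across the top categories is to downsample each of them to the size of the smallest one. The sketch below assumes that strategy. The README does not say which balancing method the project uses, and the function name `balance_top_categories`, the `top_k` parameter, and the `(title, category)` sample format are hypothetical.

```python
import random
from collections import Counter, defaultdict


def balance_top_categories(samples, top_k=5, seed=0):
    """Keep the top_k most frequent categories and downsample each of them
    to the size of the smallest one.

    samples: list of (title, category) pairs (assumed format).
    """
    counts = Counter(category for _, category in samples)
    top = [category for category, _ in counts.most_common(top_k)]
    if not top:
        return []
    # Every kept category ends up with this many examples.
    target = min(counts[category] for category in top)

    by_category = defaultdict(list)
    for title, category in samples:
        if category in top:
            by_category[category].append((title, category))

    rng = random.Random(seed)  # fixed seed for reproducible sampling
    balanced = []
    for category in top:
        balanced.extend(rng.sample(by_category[category], target))
    rng.shuffle(balanced)
    return balanced
```

Downsampling discards data from the larger categories. Oversampling or class weighting are alternatives when the smallest category is very small.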
For a detailed walkthrough, see the full article on Medium: "Website Classifier using RNN".