A mini project of text classification and sentiment analysis on IMDB movie reviews

AnitaSoroush/IMDBReviewsSentimentAnalysis

IMDBReviewsSentimentAnalysis

This project performs sentiment analysis on 50k movie reviews from the Internet Movie Database (IMDB). The dataset is available at: IMDB dataset of 50k movie reviews

SentimentAnalysis_IMDBReviews.ipynb consists of the following steps:

Preprocessing:

  • Lowercasing
  • Removing URLs
  • Removing punctuation
  • Removing stopwords
  • Handling emojis
  • Stemming
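
The preprocessing steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the notebook's actual code: the stopword set, the emoji mapping, and the `naive_stem` helper are hypothetical stand-ins (a real run would more likely use a full stopword list and an established stemmer such as Porter's). Emojis are handled before punctuation is stripped so that text emoticons like `:)` survive long enough to be mapped.

```python
import re
import string

# Hypothetical, deliberately tiny stopword list (assumption, not from the notebook).
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "to", "in", "at"}

# Hypothetical emoji/emoticon mapping: each symbol becomes a sentiment-bearing word.
EMOJI_WORDS = {":)": "happy", ":(": "sad", "😀": "happy", "😢": "sad"}

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def naive_stem(word: str) -> str:
    """Crude suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "ed", "ly", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text: str) -> str:
    text = text.lower()                          # lowercasing
    text = URL_PATTERN.sub(" ", text)            # removing URLs
    for symbol, word in EMOJI_WORDS.items():     # handling emojis
        text = text.replace(symbol, f" {word} ")
    text = text.translate(PUNCT_TABLE)           # removing punctuation
    tokens = [t for t in text.split() if t not in STOPWORDS]  # removing stopwords
    return " ".join(naive_stem(t) for t in tokens)            # stemming


print(preprocess("This movie was AMAZING :) see https://example.com/review !!!"))
```

The order of the steps matters: stripping punctuation first would delete `:)` before it could be mapped to a word.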

Vectorizing using TF-IDF
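
As a rough illustration of this step, the sketch below uses scikit-learn's `TfidfVectorizer` with its default settings. Whether the notebook uses this class or these parameters is an assumption, and the three short documents are made up for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up, already-preprocessed documents (assumption, not from the dataset).
docs = ["great movie great cast", "boring movie", "great fun"]

vectorizer = TfidfVectorizer()        # default settings: L2-normalized rows, smoothed IDF
X = vectorizer.fit_transform(docs)    # sparse matrix: one row per document, one column per term

vocab = vectorizer.vocabulary_        # maps each term to its column index
row = X[1].toarray().ravel()          # TF-IDF weights for "boring movie"

# "boring" occurs in one document, "movie" in two, so "boring" gets the larger weight.
print(row[vocab["boring"]], row[vocab["movie"]])
```

Terms that appear in fewer documents receive a higher weight, which is why TF-IDF tends to emphasize distinctive words over common ones.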

Building, training, and testing the model using 3 methods:

  • Logistic Regression
  • Random Forest Classifier
  • Decision Tree Classifier
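
A comparison of the three classifiers could look like the sketch below. It assumes scikit-learn and uses a tiny made-up corpus in place of the 50k reviews; the split ratio, random seeds, and default hyperparameters are all assumptions, so the scores it prints say nothing about the results reported for the real dataset.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy corpus standing in for the IMDB reviews (1 = positive, 0 = negative).
reviews = [
    "great film loved it", "wonderful acting great story", "loved the cast",
    "brilliant and fun", "great fun movie", "wonderful film", "loved every scene",
    "brilliant story", "boring film hated it", "terrible acting bad story",
    "hated the cast", "awful and dull", "bad boring movie", "terrible film",
    "hated every scene", "awful story",
]
labels = [1] * 8 + [0] * 8

# Hold out a stratified test set (the 75/25 ratio is an assumption).
train_text, test_text, y_train, y_test = train_test_split(
    reviews, labels, test_size=0.25, stratify=labels, random_state=0
)

# Fit TF-IDF on the training text only, then reuse it for the test text.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_text)
X_test = vectorizer.transform(test_text)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest Classifier": RandomForestClassifier(random_state=0),
    "Decision Tree Classifier": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.4f}")
```

Fitting the vectorizer on the training split alone keeps vocabulary and IDF statistics from leaking out of the test reviews.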

In the end, Logistic Regression turned out to be the best model, with an accuracy score of 0.8941.