This repository contains four projects that explore classical Natural Language Processing (NLP) methods. Each project demonstrates a different traditional technique and its application: Naive Bayes Sentiment Analysis, Logistic Regression Sentiment Analysis, Structured Perceptron for POS Tagging, and Distributional Semantics and Word Embeddings.
Description: This project implements a Naive Bayes classifier for sentiment analysis on movie reviews. The goal is to classify movie reviews as either positive or negative based on their content.
Features:
- Data preparation and tokenization
- Naive Bayes classifier implementation
- Model training and evaluation
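
For the Naive Bayes project, the steps listed above (tokenization, training, prediction) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `tokenize`, `train_nb`, and `predict_nb`, the whitespace tokenizer, the add-one smoothing (`alpha=1.0`), and the four toy reviews are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def train_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log likelihoods from labeled documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    log_prior = {c: math.log(n / len(docs)) for c, n in class_counts.items()}
    log_likelihood = {}
    for c in class_counts:
        # Additive (Laplace) smoothing over the training vocabulary.
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_likelihood, vocab

def predict_nb(model, text):
    """Return the class with the highest posterior log score."""
    log_prior, log_likelihood, vocab = model
    scores = {}
    for c in log_prior:
        # Words never seen in training are skipped.
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][w] for w in tokenize(text) if w in vocab
        )
    return max(scores, key=scores.get)

# Toy data, invented for illustration only.
train_docs = [
    "a great and moving film",
    "great acting great story",
    "a dull and boring film",
    "boring plot and dull acting",
]
train_labels = ["pos", "pos", "neg", "neg"]
model = train_nb(train_docs, train_labels)

print(predict_nb(model, "a great story"))    # pos
print(predict_nb(model, "dull and boring"))  # neg
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together, which matters for long reviews.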
Description: This project implements a logistic regression classifier for sentiment analysis on movie reviews. The aim is to classify each review as positive or negative from extracted features, with hyperparameter tuning used to improve the classifier's performance.
Features:
- Data preparation and feature extraction
- Logistic regression classifier implementation
- Model training, evaluation, and hyperparameter tuning
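
For the logistic regression project, feature extraction, training, and hyperparameter tuning might look something like the sketch below. It assumes binary bag-of-words features, batch gradient descent, and a small grid search over an L2 penalty evaluated on a held-out set. The helper names (`featurize`, `train_logreg`, `accuracy`), the grid values, and the toy data are hypothetical and chosen only for illustration; the project may use different features and a different tuning procedure.

```python
import math

def featurize(text, vocab):
    # Binary bag-of-words vector with a leading bias term.
    tokens = set(text.lower().split())
    return [1.0] + [1.0 if w in tokens else 0.0 for w in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_logreg(X, y, lr=0.5, l2=0.0, epochs=200):
    """Batch gradient descent on the mean log loss with an L2 penalty."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(dot(w, xi)) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        for j in range(len(w)):
            penalty = l2 * w[j] if j > 0 else 0.0  # bias is not regularized
            w[j] -= lr * (grad[j] / len(X) + penalty)
    return w

def accuracy(w, X, y):
    preds = [1 if sigmoid(dot(w, xi)) >= 0.5 else 0 for xi in X]
    return sum(p == yi for p, yi in zip(preds, y)) / len(y)

# Toy data, invented for illustration only (1 = positive, 0 = negative).
train = [
    ("a great and moving film", 1),
    ("great acting great story", 1),
    ("a dull and boring film", 0),
    ("boring plot and dull acting", 0),
]
dev = [("a great film", 1), ("a boring plot", 0)]

vocab = sorted({w for text, _ in train for w in text.lower().split()})
X_train = [featurize(t, vocab) for t, _ in train]
y_train = [label for _, label in train]
X_dev = [featurize(t, vocab) for t, _ in dev]
y_dev = [label for _, label in dev]

# Grid search: keep the L2 strength with the best held-out accuracy.
best_l2, best_acc, best_w = None, -1.0, None
for l2 in (0.0, 0.1, 1.0):
    w = train_logreg(X_train, y_train, l2=l2)
    acc = accuracy(w, X_dev, y_dev)
    if acc > best_acc:
        best_l2, best_acc, best_w = l2, acc, w

print(best_l2, best_acc)
```

Selecting hyperparameters on a held-out set rather than on the training data keeps the reported accuracy from being inflated by overfitting.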
Description: This project implements a structured perceptron to perform part-of-speech (POS) tagging. The goal is to accurately tag words in sentences with their corresponding parts of speech using a structured learning approach.
Features:
- Data preparation and dictionary creation
- Structured perceptron implementation with Viterbi decoding
- Model training and evaluation
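
For the POS tagging project, the interaction between the structured perceptron and the Viterbi algorithm can be sketched as below. The sketch assumes a first-order model with only two feature types (word/tag emissions and tag/tag transitions), unaveraged weights, and non-empty sentences. The feature templates, the `START` symbol, the function names, and the three toy sentences are illustrative assumptions, not the project's actual design.

```python
from collections import defaultdict

START = "<s>"  # assumed start-of-sentence symbol

def features(words, tags):
    """Emission and transition features for a tagged sentence."""
    feats, prev = [], START
    for w, t in zip(words, tags):
        feats.append(("emit", t, w))
        feats.append(("trans", prev, t))
        prev = t
    return feats

def viterbi(words, tagset, weights):
    """Highest-scoring tag sequence under the current weights."""
    score = lambda f: weights.get(f, 0.0)
    # best[i][t] = (score of best path ending in tag t at position i, backpointer)
    best = [{} for _ in words]
    for t in tagset:
        best[0][t] = (score(("emit", t, words[0])) + score(("trans", START, t)), None)
    for i in range(1, len(words)):
        for t in tagset:
            s, prev = max(
                (best[i - 1][p][0] + score(("trans", p, t)), p) for p in tagset
            )
            best[i][t] = (s + score(("emit", t, words[i])), prev)
    # Follow backpointers from the best final tag.
    tag = max(tagset, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

def train_perceptron(data, tagset, epochs=5):
    """On each mistake, reward gold features and penalize predicted ones."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tagset, weights)
            if pred != gold:
                for f in features(words, gold):
                    weights[f] += 1.0
                for f in features(words, pred):
                    weights[f] -= 1.0
    return weights

# Toy data, invented for illustration only.
data = [
    ("the dog barks".split(), ["DET", "NOUN", "VERB"]),
    ("a cat sleeps".split(), ["DET", "NOUN", "VERB"]),
    ("the cat sees a dog".split(), ["DET", "NOUN", "VERB", "DET", "NOUN"]),
]
TAGS = sorted({t for _, tags in data for t in tags})
weights = train_perceptron(data, TAGS)

print(viterbi("a dog sleeps".split(), TAGS, weights))
```

Because the update compares whole tag sequences, the model is trained on exactly the decoding problem it has to solve at test time. Averaging the weights over updates is a common refinement that this sketch omits.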
Description: This project explores the creation and application of distributional semantic word vectors. The goal is to build semantic representations of words from a corpus and use them in computational lexical semantics tasks such as synonym detection and analogy resolution.
Features:
- Co-occurrence matrix computation and PPMI transformation
- Dimensionality reduction using SVD
- Similarity computation and synonym detection
- Solving SAT analogy questions using word vectors
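
For the distributional semantics project, the pipeline listed above (co-occurrence counts, PPMI, SVD, similarity, analogies) could be sketched with NumPy as follows. The window size, the number of retained dimensions, the helper names, and the three-sentence corpus are assumptions made for the example. In particular, `solve_analogy` uses a vector-offset heuristic (comparing difference vectors by cosine similarity), which is one common approach and not necessarily the one the project uses.

```python
import numpy as np

def cooccurrence(sentences, window=2):
    """Symmetric word-by-word co-occurrence counts within a fixed window."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for i, w in enumerate(s):
            for j in range(max(0, i - window), min(len(s), i + window + 1)):
                if j != i:
                    counts[index[w], index[s[j]]] += 1
    return counts, vocab, index

def ppmi(counts):
    """Positive pointwise mutual information: max(0, log2 P(w,c) / (P(w) P(c)))."""
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log2(counts * total / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0  # zero counts give -inf or nan
    return np.maximum(pmi, 0.0)

def reduce_svd(matrix, k):
    """Keep the top-k singular directions, scaled by their singular values."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def most_similar(word, vectors, vocab, index):
    """Nearest other word by cosine similarity (a simple synonym detector)."""
    sims = [
        (cosine(vectors[index[word]], vectors[i]), w)
        for i, w in enumerate(vocab) if w != word
    ]
    return max(sims)[1]

def solve_analogy(stem, choices, vectors, index):
    """Pick the choice pair whose offset vector best matches the stem pair's."""
    offset = lambda pair: vectors[index[pair[1]]] - vectors[index[pair[0]]]
    sims = [cosine(offset(stem), offset(c)) for c in choices]
    return int(np.argmax(sims))

# Toy corpus, invented for illustration only.
sentences = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat chased a dog".split(),
]
counts, vocab, index = cooccurrence(sentences)
ppmi_matrix = ppmi(counts)
vectors = reduce_svd(ppmi_matrix, k=2)

print(most_similar("cat", vectors, vocab, index))
print(solve_analogy(("cat", "mat"), [("dog", "rug"), ("on", "a")], vectors, index))
```

PPMI downweights co-occurrences with very frequent words, and the SVD step compresses the sparse matrix into dense vectors. On a corpus this small the outputs are not meaningful, so the example only shows how the pieces fit together.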