Tag-Recommendation-System

This project set out to understand, build, and test tag recommendation algorithms tailored to the NFCA dataset, a private dataset of the National Fairground and Circus Archive, Sheffield, made available specifically for research purposes. The research began with a thorough review of the literature on the relevance of archives, the impact of ML and NLP on the archive sector, and the particular requirements of tag recommendation systems. Several models (Word2Vec, Doc2Vec, TF-IDF, LDA, and BERT) were examined in depth to understand how they work and how they are applied across different industries. After careful consideration, Word2Vec and LDA were chosen as the main models for this research because of their distinct characteristics and potential applicability to the NFCA dataset. Data pre-processing, model development, and comparative evaluation gave insight into how these two models perform. Word2Vec proved to be the stronger model, aligning with both the quantitative measurements and the qualitative expectations of an assumed NFCA expert.
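As a rough illustration of how an embedding-based recommender of this kind can rank tags, the sketch below averages word vectors for a record description and scores candidate tags by cosine similarity. It is not the project's actual code: the vectors in `TOY_VECTORS` are hand-made stand-ins for a trained Word2Vec model, and the names `mean_vector`, `cosine`, and `recommend_tags` are hypothetical. No NFCA data is used.

```python
import math

# Hypothetical toy embeddings standing in for vectors from a trained
# Word2Vec model. The values are invented for illustration only.
TOY_VECTORS = {
    "circus": [0.9, 0.1, 0.0],
    "clown": [0.8, 0.2, 0.1],
    "acrobat": [0.7, 0.3, 0.0],
    "fairground": [0.1, 0.9, 0.1],
    "carousel": [0.2, 0.8, 0.0],
    "poster": [0.0, 0.1, 0.9],
}


def mean_vector(tokens, vectors):
    """Average the vectors of the tokens that have an embedding."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_tags(text, candidate_tags, vectors, top_n=2):
    """Rank candidate tags by similarity to the averaged text vector."""
    tokens = text.lower().split()  # assumed minimal pre-processing
    doc_vec = mean_vector(tokens, vectors)
    if doc_vec is None:
        return []
    scored = [
        (tag, cosine(doc_vec, vectors[tag]))
        for tag in candidate_tags
        if tag in vectors
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [tag for tag, _ in scored[:top_n]]


if __name__ == "__main__":
    tags = ["circus", "fairground", "poster"]
    print(recommend_tags("Clown and acrobat", tags, TOY_VECTORS))
```

In a real setting the toy table would be replaced by vectors learned from the pre-processed corpus, and the candidate tags would come from the archive's own vocabulary.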

Workflow

[Image: workflow diagram]