This project compares different approaches to information retrieval. It builds a postings list and an inverted index, and compares retrieval methods such as TF-IDF, word embeddings, and Boolean retrieval against off-the-shelf search engines such as Elasticsearch and Apache Solr. The comparison covers precision, recall, F1 score, accuracy, and running time.
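
To make the terms concrete, here is a minimal sketch of an inverted index with Boolean AND retrieval. It is an illustration only and is not taken from this repository: the function names (`tokenize`, `build_inverted_index`, `intersect`, `boolean_and`), the whitespace tokenizer, and the sample documents are all assumptions.

```python
from collections import defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real system would also
    # strip punctuation, remove stop words, and stem.
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to its postings list (sorted list of document IDs)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def intersect(p1, p2):
    """Merge-intersect two sorted postings lists in linear time."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result


def boolean_and(index, query):
    """Return IDs of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    # Process the shortest postings lists first to keep intermediates small.
    postings = sorted((index.get(t, []) for t in terms), key=len)
    result = postings[0]
    for p in postings[1:]:
        result = intersect(result, p)
    return result


# Hypothetical sample collection, for illustration only.
docs = {
    1: "the quick brown fox",
    2: "the lazy brown dog",
    3: "quick thinking wins",
}
index = build_inverted_index(docs)
matches = boolean_and(index, "quick brown")
```

Here `matches` contains only document 1, the single document that has both query terms. Keeping postings lists sorted is what allows the linear-time merge in `intersect`.
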
- Open a terminal and run `git clone https://github.com/Abilityguy/Postings-List-and-Inverted-Index`
- Run `cd Postings-List-and-Inverted-Index`
- Run `python3 search_engine_and_performance_metrics.py`
- Go to http://127.0.0.1:5000 in your browser. Enter a query, choose a model, and click search to view the results.
- Go to http://127.0.0.1:5000/api/v1/performance_metrics in your browser to view a comparison of performance metrics across the different models.
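
For reference, the metrics named above can be computed for a single query roughly as follows. This is a sketch under stated assumptions, not the code behind the `performance_metrics` endpoint: the `evaluate` function, its parameters, and the example document IDs are hypothetical, and relevance is assumed to be binary.

```python
def evaluate(retrieved, relevant, collection_size):
    """Compute precision, recall, F1, and accuracy for one query.

    retrieved: document IDs returned by the model
    relevant: document IDs judged relevant (ground truth)
    collection_size: total number of documents in the collection
    """
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)        # relevant and retrieved
    fp = len(retrieved - relevant)        # retrieved but not relevant
    fn = len(relevant - retrieved)        # relevant but missed
    tn = collection_size - tp - fp - fn   # correctly left out

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / collection_size if collection_size else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical example: 10 documents, 4 retrieved, 3 relevant, 2 in common.
scores = evaluate(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5],
                  collection_size=10)
```

In this example precision is 0.5 (2 of 4 retrieved documents are relevant), recall is 2/3 (2 of 3 relevant documents were found), and accuracy is 0.7. Accuracy tends to look high in retrieval tasks because most documents are irrelevant to any given query, which is why precision, recall, and F1 are usually the more informative numbers.
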