- Description
- Installation
- Usage
- Support
- Contributing
- Authors and Acknowledgment
- License
TOMMY is a tool for interactive topic modeling. It is designed to be used by researchers and data scientists who want to explore their data and extract topics from it without requiring extensive programming experience. TOMMY is built on top of the LDA, NMF and BERTopic algorithms. It provides a user-friendly interface for exploring the topics extracted from the data.
To run TOMMY from source, first create and activate a Python virtual environment. Then, inside the virtual environment, run the following command to install all required packages.
pip install -r ./requirements.txt
The following command starts TOMMY. Note that startup may take a while (around 30 seconds).
python -m tommy.main
Alternatively, the executables can be downloaded from tommy.fyor.nl. Instructions for the installation process can be found in the installation guide. Note that this website is written in Dutch.
The software can be used to explore topics in a dataset. To import a dataset, click the 'Import' button in the top-left corner. This opens a window where you can select the folder containing the dataset. TOMMY then tries to read all the files in that folder and displays them in the file overview. Once the files are loaded, you can set the parameters on the left and run the topic modeling algorithm by clicking the 'Toepassen' button (Dutch for 'Apply') just below them.
To exclude words from the analysis, select the 'Blacklist' tab and enter the words you want to exclude in the text box, one word per line.
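To make the expected input format concrete, here is a minimal sketch of how newline-separated blacklist input could be parsed and applied to a list of words. It is an illustration only and not TOMMY's actual implementation: the function names `parse_blacklist` and `remove_blacklisted` are hypothetical, and the case-insensitive matching and the skipping of blank lines are assumptions.

```python
def parse_blacklist(text: str) -> set[str]:
    """Turn newline-separated text-box input into a set of lowercase words.

    Assumption: blank lines and surrounding whitespace are ignored.
    """
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def remove_blacklisted(tokens: list[str], blacklist: set[str]) -> list[str]:
    """Drop every token whose lowercase form is in the blacklist.

    Assumption: matching is case-insensitive.
    """
    return [token for token in tokens if token.lower() not in blacklist]


# Hypothetical input, as it might be typed into the 'Blacklist' text box.
blacklist = parse_blacklist("the\nand\n\n  Of  \n")

tokens = ["The", "history", "of", "Utrecht", "and", "Amsterdam"]
filtered = remove_blacklisted(tokens, blacklist)
print(filtered)
```

In this sketch each line of the text box becomes one blacklist entry, so a multi-word phrase on a single line would be treated as one entry rather than as separate words.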
If you have any questions or run into problems, feel free to open an issue on this repository. We will try to respond as soon as possible.
If you want to contribute to this project, feel free to open a pull request. We ask you to follow the style guide provided in the repository. Please also include a brief description of your changes and the reason for them, which helps us understand the changes and merge them more quickly. Where possible, include tests for your changes to reduce the likelihood of breaking the software for everyone.
This software was developed by students of Utrecht University as part of their graduation project. The project was commissioned by the Dutch company EMMA.
The students who have contributed to this project are:
- Jasper Hofman
- Nick Jordan
- Erben Klaver
- Fyor Lambermont
- Thomas Loef
- Mees Notermans
- Wessel van der Ven
- Rens Versnel
- Isabelle de Wolf
Our supervisors are:
Our client is:
This project is licensed under the GNU AGPLv3 License - see the LICENSE.md file for more details.