Multiverso is a parameter-server-based framework for training machine learning models on big data across many machines. It is currently a standard C++ library and provides a set of easy-to-use programming interfaces. With these APIs, machine learning researchers and practitioners do not need to worry about routine system issues such as distributed model storage and operation, inter-process and inter-thread communication, and multi-threading management. Instead, they can focus on the core machine learning logic: data, model, and training.
For more details, please view our website http://www.dmtk.io.
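To make the parameter server idea above concrete, here is a minimal in-process sketch. It is not the Multiverso API: the names `ToyParameterServer`, `Pull`, `Push`, and `RunToyTraining` are hypothetical and exist only for this illustration, and worker threads stand in for what would normally be separate machines. Each worker repeatedly pulls the shared model and pushes back an update, while the server serializes access.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical toy "server": owns the shared model and serializes access.
// Illustration only; this is not how Multiverso names or structures its API.
class ToyParameterServer {
 public:
  explicit ToyParameterServer(std::size_t size) : params_(size, 0.0f) {}

  // Pull: a worker fetches a copy of the current parameters.
  std::vector<float> Pull() {
    std::lock_guard<std::mutex> lock(mutex_);
    return params_;
  }

  // Push: a worker sends a delta, which the server adds to the model.
  void Push(const std::vector<float>& delta) {
    std::lock_guard<std::mutex> lock(mutex_);
    for (std::size_t i = 0; i < params_.size() && i < delta.size(); ++i) {
      params_[i] += delta[i];
    }
  }

 private:
  std::mutex mutex_;
  std::vector<float> params_;
};

// Starts `num_workers` threads; each performs `steps` pull/push rounds.
// Returns the final model held by the server.
std::vector<float> RunToyTraining(std::size_t num_workers, std::size_t steps,
                                  std::size_t model_size) {
  ToyParameterServer server(model_size);
  std::vector<std::thread> workers;
  for (std::size_t w = 0; w < num_workers; ++w) {
    workers.emplace_back([&server, steps] {
      for (std::size_t s = 0; s < steps; ++s) {
        std::vector<float> local = server.Pull();  // fetch current model
        // A real worker would compute an update from `local` and its own
        // data shard; this sketch pushes a constant +1 per element instead.
        std::vector<float> delta(local.size(), 1.0f);
        server.Push(delta);
      }
    });
  }
  for (auto& t : workers) t.join();
  return server.Pull();
}
```

Calling `RunToyTraining(4, 100, 3)` from a `main` function returns a three-element model in which every entry has received one increment per worker per step. Depending on the toolchain, linking may require the `-pthread` flag.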
Linux (Tested on Ubuntu 12.04)
- Run
cd third_party
./install.sh
to install the dependencies.
- Run
make all -j4
to build Multiverso.
Windows
Windows users should refer to the README in the windows folder.
Current distributed systems based on multiverso:
- lightlda: Scalable, fast, lightweight system for large-scale topic modeling
- distributed_word_embedding: Distributed system for word embedding
- distributed_skipgram_mixture: Distributed skipgram mixture for multi-sense word embedding