The goal of this project is to design, implement, and evaluate a synchronous Spark version of distributed stochastic gradient descent (SGD) for training Support Vector Machines (SVMs), and to compare it with earlier synchronous and asynchronous implementations in Python.
The main reference for this project is the Hogwild! paper, an influential result in the machine learning and parallel computing communities showing that SGD can be implemented without any locking when the associated optimization problem is sparse. hogwild-python is a synchronous and asynchronous Python implementation of the Hogwild! algorithm by EPFL students. This project is part of the CS-449 Systems for Data Science course taught at EPFL in the Spring semester of 2019.
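To make the lock-free idea concrete, here is a minimal sketch (not code from this project or from hogwild-python) of Hogwild!-style SGD for a linear SVM: several threads update a shared weight vector with no synchronization, and because each sparse example touches only a few coordinates, concurrent updates rarely collide. All names and hyperparameters below are illustrative assumptions.

```python
import threading
import random

DIM = 1000   # feature dimensionality (illustrative)
LR = 0.01    # learning rate (illustrative)
REG = 1e-4   # L2 regularization strength (illustrative)

w = [0.0] * DIM  # shared weight vector, updated without locks

def hinge_sgd_worker(samples):
    """Run SGD steps for the linear-SVM hinge loss on sparse samples.

    Each sample is (label, {index: value}) with label in {-1, +1}.
    """
    for y, x in samples:
        # margin = y * <w, x>, reading only the nonzero coordinates
        margin = y * sum(w[i] * v for i, v in x.items())
        for i, v in x.items():
            grad = REG * w[i]        # regularization term
            if margin < 1.0:         # hinge loss is active
                grad -= y * v
            w[i] -= LR * grad        # unsynchronized write (the Hogwild! idea)

def random_sparse_sample(nnz=5):
    # Synthetic sparse example: a few random nonzero features and a label.
    x = {random.randrange(DIM): random.uniform(-1, 1) for _ in range(nnz)}
    return random.choice([-1, 1]), x

if __name__ == "__main__":
    shards = [[random_sparse_sample() for _ in range(10_000)] for _ in range(4)]
    threads = [threading.Thread(target=hinge_sgd_worker, args=(shard,))
               for shard in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The synchronous variants studied in this project differ from this sketch in that workers exchange gradients or weights at coordinated barriers (in Spark, per stage) rather than racing on shared memory.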