edstenson/LearnDataScience

 
 


Who

What

  • A collection of Data Science Learning materials in the form of IPython Notebooks.
  • Associated data sets.

The initial beta release consists of four major topics:

  • Linear Regression
  • Logistic Regression
  • Random Forests
  • K-Means Clustering

Each of the above has at least three IPython Notebooks covering:

  • Overview (an exposition of the technique for the math-wary)
  • Data Exploration (the nuts and bolts of real world data wrangling)
  • Analysis (using the technique to get results)

One or more of these topics may have supplementary material. Each topic also has worksheets that contain mostly the code sections, so you can explore the code iteratively.
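As a rough illustration of the explore-then-analyze pattern that the notebooks follow, here is a minimal sketch for the Linear Regression topic. It is not taken from the notebooks: the toy data and the helper names `summarize` and `fit_line` are assumptions made for this example, and it uses plain Python rather than any particular library.

```python
# Illustrative only: toy data, not one of the project's data sets.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]


def summarize(values):
    """Data exploration step: basic summary statistics (hypothetical helper)."""
    n = len(values)
    return {"n": n, "min": min(values), "max": max(values), "mean": sum(values) / n}


def fit_line(xs, ys):
    """Analysis step: ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


print(summarize(xs))
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Changing the values in `xs` and `ys` and re-running the cell is the kind of iterative exploration the worksheets are meant to support.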

Three openly available data sets are used.

Why

There is a need for open content that raises awareness of, and provides training in, the basics of the Data Science field (circa early 2013).

IPython Notebook provides an appropriate platform for rapid iterative exploration and learning.

When

Starting in 2013 and intended to extend for a long while.

Where

Today GitHub, tomorrow the world.

How

Learn Data Science is based on content developed by me (Nitin Borwankar) for the Open Data Science Training project (http://opendst.org). Most of the content (circa July 2013) is copyright (c) Alpine Data Labs, as per the license at opendst.org, and is freely available. Extensions to that content included in this project are released under the same license; see the LICENSE.txt file.

The IPython Notebooks are at beta.

About

Open Content for self-directed learning in data science
