Course materials for General Assembly's Data Science course in San Francisco (10/27/15 - 1/19/16).
- Dates: 10/27/15 - 1/19/16, Tuesdays and Thursdays, 6:30-9:30
- Holidays (no class): 11/26 (Thanksgiving), 12/21 - 1/1 (winter break)
- Location: 225 Bush Street, Classroom 4
- Instructor: Francesco Mosconi
- Experts-in-Residence: Dylan Hercher, Otto Stegmaier
A foundational course in data science covering machine learning theory, case studies and real-world examples, an introduction to various modeling techniques, and other tools for making predictions and decisions from data. Students gain practical computational experience by running machine learning algorithms and learning how to choose the most representative models for prediction. Python is used throughout the course.
- Own laptop
- Anaconda Python distribution (Continuum Analytics)
- Git & GitHub
In order to receive a General Assembly Certificate in Data Science, upon completion of the course, students must:
- Complete and submit 80% of all course assignments (homework, homework reviews, labs, quizzes). Students who miss more than 20% of assignments will not be eligible for the course certificate.
- Complete and submit the course midterm test.
- Complete and submit the course final project, completing all functional and technical requirements on the project rubric, including delivering a presentation.
Assignments, milestones and feedback throughout the course are designed to prepare students to deliver a quality course project.
The weekly schedules for lecture content, lab content, and homework assignments are subject to change according to the needs & preferences of the class.
Week | Tuesday | Thursday |
---|---|---|
1 | 10/27: Introduction to Data Science, Git setup | 10/29: Python & Linear Algebra review |
2 | 11/03: Cleaning and imputing Data | 11/05: Data Sources |
3 | 11/10: Introduction to Machine Learning, Regression | 11/12: Cross Validation and Naïve Bayes |
4 | 11/17: Regression and Regularization | 11/19: Logistic Regression |
5 | 11/24: Imbalanced Classes and Evaluation Metrics | 11/26: Thanksgiving -- No Class |
6 | 11/31: Decision Trees | 12/01: Support Vector Machines |
7 | 12/01: Ensemble Techniques | 12/03: Review of Supervised Learning |
8 | 12/08: K-Means Clustering and Unsupervised Learning | 12/10: Dimensionality Reduction |
9 | 12/15: Recommendation Systems | 12/17: Natural Language Processing and Text Mining |
10 | 01/05: Database Technologies | 01/07: Map Reduce |
11 | 01/12: Data Products | 01/14: Final Project Work Session |
12 | 01/19: Final Project Presentations | |
HW | Topics | Dataset | Assigned | Due | Review Due |
---|---|---|---|---|---|
1 | GitHub setup | | 10/29 | 11/3 | 11/5 |
Instructor | Times | Contact method |
---|---|---|
Dylan | | |
Otto | Tuesday 5:30pm-6:30pm | Classroom 4 or Slack |
Francesco | Tuesday & Thursday | Slack (quickest response) or Hangouts by appointment |
You've all been invited to use Slack for chat during class and throughout the day. Please consider this the primary way to contact other students. Dylan will be in Slack during class to handle questions. All instructors will be available on Slack during office hours (listed above).
- Data Science at the Command Line
- Learning Bash Scripting For Beginners
- Advanced Bash Scripting Guide
- Data Science from Scratch (use the discount code "ASSEMBLY" for 40% off the print version and 50% off the ebook)