A more detailed description of the course, policies, and available materials can be found on our Canvas page (requires enrollment in the course).
This course provides a comprehensive introduction to computation and programming for Statistics and Data Science majors. Students will learn how to load and manipulate data, explore and visualize data, and design and implement algorithms for statistical analysis. Students will learn to use the R programming language, along with useful packages for data manipulation and visualization. We will also emphasize the skills and techniques required for team-based collaboration in research and industry.
- Lead Instructor: Dr. Mark Fredrickson, [email protected]
- GSIs: Yijia Wang ([email protected])
We will use the R programming language and the RStudio development environment. For additional details on installing R and RStudio or using the Great Lakes cluster environment, please see our Canvas page.
We will use git to distribute lecture notes, lab assignments, and homework assignments. Clone this repository to get started.
To install git:
- Download git from https://git-scm.com/ or install it using a package manager for your operating system.
- You will also need Git Large File Storage (Git LFS).
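Once both tools are installed, a quick check from a terminal can confirm that they are available. The following is a minimal sketch, assuming a POSIX shell (Terminal on macOS or Linux, Git Bash on Windows). The helper name `have_tool` is introduced here for illustration only, and the clone line uses a placeholder because the repository address depends on where you are viewing this page.

```shell
#!/bin/sh
# have_tool is a hypothetical helper (not part of git):
# it succeeds when the named command is found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool git; then
    git --version
    # Git LFS shows up as the `git lfs` subcommand once it is installed.
    if git lfs version >/dev/null 2>&1; then
        # One-time, per-user setup of the Git LFS hooks.
        git lfs install || echo "git lfs install failed; check your git configuration"
    else
        echo "Git LFS not found: install it before cloning"
    fi
else
    echo "git not found: install it from https://git-scm.com/"
fi

# Placeholder: replace REPO_URL with the address of the course repository.
# git clone "$REPO_URL"
```

If both checks pass, uncomment the last line and supply the repository address to clone the course materials.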
- Our main textbook is R for Data Science by Hadley Wickham and Garrett Grolemund.
- Time permitting, we will pull additional material from
- For learning git, we will use Beginning Git and GitHub by Mariot Tsitoara.
- For statistical background, we suggest Practical Statistics for Data Scientists.
- Homework Assignments (50%)
- Quizzes (20%)
- Labs (10%)
- Project 1 (10%)
- Project 2 (10%)
Late homework will not be accepted. Please email the professor ([email protected]) if you have circumstances requiring an extension. Please contact the professor as soon as possible to discuss any accommodations.
Additional policies are available on our course Canvas page.