CS109 Data Science
Predicting Hubway Stations Status by Lauren Alexander, Gabriel Goulet-Langlois, Joshua Wolff
This course is about learning from data in order to gain useful predictions and insights. It introduces methods for five key facets of an investigation: data wrangling, cleaning, and sampling to get a suitable data set; data management to be able to access big data quickly and reliably; exploratory data analysis to generate hypotheses and intuition; prediction based on statistical methods such as regression and classification; and communication of results through visualization, stories, and interpretable summaries.
We will be using Python for all programming assignments and projects.
The course is also listed as AC209, STAT121, and E-109.
Instructors
- Pavlos Protopapas, SEAS
- Kevin Rader, Statistics
- Mark Glickman, Statistics
- Chris Tanner, SEAS
- Joe Blitzstein, Statistics
- Hanspeter Pfister, Computer Science
- Verena Kaynig-Fittkau, Computer Science
Material from CS 109 as taught from 2013 to the present, most recent first
- 2020: Protopapas, Rader, Tanner and Glickman
- 2019: Protopapas, Rader, Tanner and Glickman
- 2018: Protopapas, Rader, Tanner and Glickman
- 2017: Protopapas, Rader
- 2016
- 2015: Blitzstein, Pfister, Kaynig-Fittkau
- 2014
- 2013: Blitzstein, Pfister