Data Science

Learning from data in order to make useful predictions and gain insights. This course introduces methods for five key facets of an investigation: data wrangling, cleaning, and sampling to obtain a suitable data set; data management to access big data quickly and reliably; exploratory data analysis to generate hypotheses and intuition; prediction based on statistical methods such as regression and classification; and communication of results through visualization, stories, and interpretable summaries.

We will use Python for all programming assignments and projects. All lectures will be posted here and should be available 24 hours after each class meeting.

The course is also listed as AC209, STAT121, and E-109.


  • Rafael Irizarry, Biostatistics
  • Verena Kaynig-Fittkau, Computer Science


  • Stephanie Hicks


2:30-4pm on Tuesdays & Thursdays


10am-12pm on Fridays

Resources from fall semester 2013