CS 345
Probabilistic Foundations of Machine Learning

In recent years, machine learning (ML) has enabled applications once thought impossible, from systems that propose novel drugs or generate new art and music to systems that accurately and reliably predict the outcomes of medical interventions in real time. What has enabled these developments? Faster computing hardware, large amounts of data, and the probabilistic paradigm of ML, which casts recent advances such as neural networks into a statistical learning framework. In this course, we introduce the foundational concepts behind this paradigm (statistical model specification, and statistical learning and inference), focusing on connecting theory with real-world applications and hands-on practice. While expanding our methodological toolkit, we will also introduce critical perspectives for examining the ethics of ML within sociotechnical systems. This course lays the foundation for advanced study and research in ML. Topics include directed graphical models, deep Bayesian regression and classification, and generative (latent variable) models for clustering, dimensionality reduction, and time-series forecasting. Students will get hands-on experience building models for specific tasks, most drawn from healthcare contexts, using NumPyro, a Python-based probabilistic programming language.

Units: 1

Max Enrollment: 18

Prerequisites: One of CS 244, CS 344, STAT 260, STAT 318, MIT 6.3900, or the QAI Summer Program; one of MATH 205, MATH 206, MATH 220, or MATH 225; comfort with Python; and permission of the instructor.

Distribution Requirements: MM - Mathematical Modeling and Problem Solving

Degree Requirements: DL - Data Literacy (formerly QRF)

Typical Periods Offered: Fall

Semesters Offered this Academic Year: Fall
