STAT 288: Deep Statistics: AI and Earth Observations for Sustainable Development

Graduate Course, Teaching Assistant, Harvard University, Department of Statistics, 2025

Instructors: Xiao-Li Meng and Adel Daoud

Offered: Spring 2024, Spring 2025

Course Abstract

With the aim of concomitantly enhancing the rigor and efficiency of data science for scientific inquiries, deep statistics emphasizes principled systems thinking throughout the entire data science ecosystem, from data conception to their postmortem examination for scientific reproducibility and replicability. This course introduces the trinity of multi-source, multi-phase, and multi-resolution statistical learning, and invites participants to think through their implications and implementations in the context of AI and Earth Observations (EO) for sustainable human development. Theoretically, the course contemplates many trade-offs for ‘data science for science’, such as data quality vs. quantity, data privacy vs. utility, statistical vs. computational efficiency, and inferential robustness vs. relevance. Practically, it scrutinizes issues such as conceptualizing and collecting complex socioeconomic data, handling messy survey and satellite data, assessing uncertainties with black-box learning, and contemplating causal implications from AI-EO data. High-level overviews of topics such as data collection, messy data, data privacy, causality, uncertainty analysis, and deep learning will be provided on an as-needed basis.