About me

I am an Assistant Professor of Statistics at the University of Chicago. My research develops statistical methods for high-dimensional and multimodal data integration, with applications in biology, environmental science, and public health. I work on structured estimation, graph-constrained models, and graph neural networks, with recent projects including uncertainty quantification for sparse canonical correlation analysis, denoising on graphs, and statistical foundations for GNNs.
📢 NEWS: The group is recruiting!
We welcome applications on a rolling basis for postdocs, rotation PhD students (already accepted to UChicago), and a limited number of MS or advanced undergraduate students. Learn more about our research and available opportunities.
  • Postdoctoral fellows: Email cdonnat [at] uchicago.edu with your interests and CV.
  • UChicago PhD/Master's/Undergrad students: Email your interests, CV, and transcript.
  • Note: For those outside UChicago, we may not have the bandwidth to respond.
💼 NEWS: Consulting Availability
I am available for research consulting in areas including statistical methodology, high-dimensional data analysis, graph-based modeling, and data integration for the life sciences. For inquiries, please contact me at cdonnat [at] uchicago.edu.

Research Interests

My research develops statistical methodology for the analysis, integration, and uncertainty quantification of high-dimensional and structured data, with a particular emphasis on graphs, networks, and multimodal biological datasets. Graphs provide a unifying framework for representing complex relationships—whether between microbial taxa, cell populations in tissues, or interacting variables in a multivariate system. They differ from traditional Euclidean data in two key ways:
(1) Dependence structure: Nodes are connected, breaking the independence assumptions underlying many classical methods. Inference must explicitly incorporate graph topology, edge weights, and neighborhood structure.
(2) Irregularity: Graphs lack a canonical ordering, reference frame, or homogeneity in node roles (degree, centrality, etc.), requiring statistical and machine learning methods that adapt to structural heterogeneity.

Methodological Directions

1 — Theory and Methods for Graph Neural Networks
Developing a principled statistical foundation for GNNs, including bias–variance tradeoffs, topology-aware generalization, interpretable architectures, and reliable model selection.
2 — Structured Estimation and Graph-Constrained Models
Designing algorithms for dimension reduction and matrix factorization that incorporate known network structure or sparsity, with provable guarantees and interpretable outputs.
3 — Multimodal Data Integration and Uncertainty Quantification
Integrating heterogeneous datasets (e.g., genomic, transcriptomic, spatial) using sparse CCA, regularized regression, and probabilistic graphical models, with an emphasis on uncertainty assessment.

Applications

  • Thermotolerance in photosynthetic microbes — Linking genomic, transcriptomic, and metabolomic data of cyanobacteria and Chlamydomonas to phenotypes under temperature or light stress.
  • Family networks and child welfare — Modeling kinship structures to study their impact on outcomes in child protective services.
  • Spatial transcriptomics — Detecting spatial gene expression patterns and cell–cell interactions.
  • Microbial communities — Modeling environmental and host-associated microbiomes using graph-based approaches.