Research

Papers

AISTATS 2026

Empirically Calibrated Conditional Independence Tests

Milleno Pan, Antoine de Mathelin, Wesley Tansey

Conditional independence tests (CITs) are widely used for causal discovery and feature selection. Even with false discovery rate (FDR) control procedures, they often fail to provide frequentist guarantees in practice. We highlight two common failure modes: (i) in small samples, asymptotic guarantees for many CITs can be inaccurate, and even correctly specified models fail to estimate the noise levels and control the error; and (ii) when sample sizes are large but models are misspecified, unaccounted-for dependencies skew the test's behavior, so the p-values are not uniform under the null. We propose Empirically Calibrated Conditional Independence Tests (ECCIT), a method that measures and corrects for miscalibration. For a chosen base CIT (e.g., GCM, HRT), ECCIT optimizes an adversary that selects features and response functions to maximize a miscalibration metric. ECCIT then fits a monotone calibration map that adjusts the base-test p-values in proportion to the observed miscalibration. Across empirical benchmarks on synthetic and real data, ECCIT achieves valid FDR control with higher power than existing calibration strategies while remaining test-agnostic.
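To make the calibration idea concrete, here is a minimal sketch of a monotone calibration map followed by an FDR procedure. It is not the ECCIT algorithm: the adversarial search is omitted entirely, the map is simply the empirical CDF of p-values observed under a known null, and the miscalibrated base test is simulated. The names `fit_calibration_map` and `benjamini_hochberg`, the squared-uniform null distribution, and the sample sizes are all assumptions made for illustration.

```python
import numpy as np

def fit_calibration_map(null_pvalues):
    """Build a monotone map from raw to calibrated p-values.

    The map is the (smoothed) empirical CDF of p-values that the base test
    produced on data known to satisfy the null.
    """
    sorted_null = np.sort(np.asarray(null_pvalues, dtype=float))
    n = sorted_null.size

    def calibrate(p):
        p = np.asarray(p, dtype=float)
        # Count null p-values <= p; the +1 terms keep the result above 0.
        rank = np.searchsorted(sorted_null, p, side="right")
        return (rank + 1) / (n + 1)

    return calibrate

def benjamini_hochberg(pvalues, alpha=0.1):
    """Standard Benjamini-Hochberg step-up procedure; returns a boolean mask."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest index meeting its threshold
        rejected[order[: k + 1]] = True
    return rejected

rng = np.random.default_rng(0)

# Hypothetical anti-conservative base test: null p-values pile up near 0.
null_raw = rng.uniform(size=5000) ** 2
calibrate = fit_calibration_map(null_raw)

# Fresh null p-values from the same (simulated) miscalibrated test.
fresh_null = rng.uniform(size=5000) ** 2
calibrated = calibrate(fresh_null)

print("raw false-positive rate at 0.05:       ", np.mean(fresh_null <= 0.05))
print("calibrated false-positive rate at 0.05:", np.mean(calibrated <= 0.05))
print("BH rejections among calibrated nulls:  ", benjamini_hochberg(calibrated).sum())
```

In this toy setting the raw p-values reject far too often at the 0.05 level, while the calibrated ones reject at roughly the nominal rate. A real calibration step would also need a way to generate trustworthy null data, which is the part the abstract assigns to the adversary.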


Research Experience

Causal Inference on Gene Expression Analysis

Memorial Sloan Kettering Cancer Center

Apr 2024 - Feb 2025

Master's thesis research on causal inference methods for gene expression analysis and treatment-response modeling across cancer datasets.

Implemented conditional independence testing methods, including permutation and residual-based approaches, to balance Type-I error and statistical power.
Built hypothesis testing frameworks with graph-based subset approaches to assess causal effects across genes and samples.
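As an illustration of what a residual-based conditional independence test looks like, the sketch below follows the general shape of a generalised covariance measure (GCM) test: regress each variable on the conditioning set, multiply the residuals, and compare the normalised mean of the products to a standard normal. It is not the thesis code. The linear least-squares regressions, the function names `residuals` and `gcm_test`, and the simulated data are assumptions chosen to keep the example short.

```python
import math
import numpy as np

def residuals(target, Z):
    """Residuals of an ordinary least-squares fit of target on Z (with intercept)."""
    design = np.column_stack([np.ones(len(target)), Z])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

def gcm_test(x, y, Z):
    """Residual-product test of X independent of Y given Z.

    Returns the test statistic and a two-sided p-value from the
    asymptotic standard normal reference distribution.
    """
    prod = residuals(x, Z) * residuals(y, Z)
    n = prod.size
    stat = math.sqrt(n) * prod.mean() / prod.std()
    pvalue = math.erfc(abs(stat) / math.sqrt(2))  # equals 2 * (1 - Phi(|stat|))
    return stat, pvalue

rng = np.random.default_rng(1)
n = 2000
Z = rng.normal(size=(n, 1))
x = Z[:, 0] + rng.normal(size=n)

# Null case: y depends on Z only, so X and Y are independent given Z.
y_null = Z[:, 0] + rng.normal(size=n)
# Alternative case: y also depends on x directly.
y_dep = Z[:, 0] + x + rng.normal(size=n)

_, p_null = gcm_test(x, y_null, Z)
_, p_dep = gcm_test(x, y_dep, Z)
print("p-value under the null:       ", p_null)
print("p-value under the alternative:", p_dep)
```

The linear regressions are the main simplification here. With nonlinear relationships they would be replaced by a more flexible regressor, and the reliability of the p-value then depends on how well those regressions fit, which is where the trade-off between Type-I error and power shows up.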


Tracking the Opioid Epidemic with Social Media

Stanford University

Jan 2023 - Apr 2023

Research in Stanford Bioengineering on social-media-based prediction of emerging drug epidemics; funded by an NIH R21 grant.

Used NLP tooling (DLATK) for topic modeling and sentiment analysis of language use on Reddit and Twitter.
Generated reporting workflows to support granular risk-tracking analyses.

Reinforcement Learning for Optimal Semantic Grouping

Berkeley School of Information

Aug 2021 - May 2022

Research on reinforcement-learning methods for improved element grouping in mobile UI layouts.

Applied Monte Carlo Tree Search methods and HCI-oriented layout-tree techniques for better hierarchical subgrouping.
Improved performance over prior LSTM/RNN implementations on the project samples.