Aleksandr (Sasha) Podkopaev

Senior Data Scientist

Walmart Global Tech

About me

Hi there! I am a Senior Data Scientist at Walmart Global Tech (AdTech team). I recently graduated from the joint PhD program between the Statistics & Data Science and Machine Learning departments at Carnegie Mellon University, where I was fortunate to be advised by Professor Aaditya Ramdas.

My most recent research interests include predictive uncertainty quantification (conformal prediction, post-hoc recalibration) and sequential testing (sequential nonparametric two-sample and independence testing). I am also highly interested in developing statistical methods that are robust to distribution shifts, which enhances their applicability in practice. Before joining CMU, I obtained BSc and MSc degrees at the Moscow Institute of Physics and Technology and Skoltech.

Interests
  • Sequential testing and safe, anytime-valid inference
  • Distribution-free uncertainty quantification (conformal prediction, calibration)
  • Distribution shifts
Education
  • PhD in Statistics & Machine Learning, 2023

    Carnegie Mellon University

  • MSc in Applied Mathematics & Computer Science, 2018

    Skolkovo Institute of Science and Technology, Moscow Institute of Physics and Technology

  • BSc in Applied Mathematics & Physics, 2016

    Moscow Institute of Physics and Technology

Publications

Sequential predictive two-sample and independence testing

We study the problems of sequential nonparametric two-sample and independence testing. Sequential tests process data online and allow using observed data to decide whether to stop and reject the null hypothesis or to collect more data while maintaining type I error control. We build upon the principle of (nonparametric) testing by betting, where a gambler places bets on future observations and their wealth measures evidence against the null hypothesis. While recently developed kernel-based betting strategies often work well on simple distributions, selecting a suitable kernel for high-dimensional or structured data, such as text and images, is often nontrivial. To address this drawback, we design prediction-based betting strategies that rely on the following fact: if a sequentially updated predictor starts to consistently determine (a) which distribution an instance is drawn from, or (b) whether an instance is drawn from the joint distribution or the product of the marginal distributions (the latter produced by external randomization), it provides evidence against the two-sample or independence null, respectively. We empirically demonstrate the superiority of our tests over kernel-based approaches in structured settings. Our tests can be applied beyond the case of independent and identically distributed data, remaining valid and powerful even when the data distribution drifts over time.
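
To give a flavor of the method, below is a minimal Python sketch of a prediction-based betting test for the two-sample null (an illustrative simplification, not the exact construction from the paper): a sequentially updated classifier tries to tell the two samples apart, its predictions define the payoffs of a wealth process, and the test rejects once the wealth crosses 1/alpha, which controls the type I error by Ville's inequality. The data streams, the fixed betting fraction, and the classifier choice are assumptions made for illustration.

```python
# Sketch of a sequential two-sample test by betting (illustrative only).
# At each round we observe one point from P and one from Q, bet on the
# difference of the classifier's predicted "came from Q" probabilities,
# and update the classifier only afterwards so that each bet is predictable.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
alpha = 0.05                                  # target type I error
wealth, lam = 1.0, 0.5                        # initial wealth, fixed betting fraction
clf = SGDClassifier(loss="log_loss")
fitted = False

for t in range(1, 2001):
    x = rng.normal(0.0, 1.0, size=5)          # fresh observation from P
    y = rng.normal(0.5, 1.0, size=5)          # fresh observation from Q (shifted mean)

    if fitted:
        p_x = clf.predict_proba(x.reshape(1, -1))[0, 1]
        p_y = clf.predict_proba(y.reshape(1, -1))[0, 1]
        # Under H0: P = Q, this payoff has zero conditional mean, so the
        # wealth process is a nonnegative martingale.
        wealth *= 1.0 + lam * (p_y - p_x)
        if wealth >= 1.0 / alpha:             # Ville's inequality => level-alpha test
            print(f"Reject H0 at round {t}, wealth = {wealth:.1f}")
            break

    # Train on the current pair only after it has been used for betting.
    clf.partial_fit(np.vstack([x, y]), np.array([0, 1]), classes=[0, 1])
    fitted = True
```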

Sequential kernelized independence testing

Independence testing is a fundamental and classical statistical problem that has been extensively studied in the batch setting when one fixes the sample size before collecting data. However, practitioners often prefer procedures that adapt to the complexity of the problem at hand instead of setting the sample size in advance. Ideally, such procedures should (a) allow stopping earlier on easy tasks (and later on harder tasks), hence making better use of available resources, and (b) continuously monitor the data and efficiently incorporate statistical evidence after collecting new data, while controlling the false alarm rate. It is well known that classical batch tests are not tailored for streaming data settings, since valid inference after data peeking requires correcting for multiple testing, but such corrections generally result in low power. In this paper, we design sequential kernelized independence tests (SKITs) that overcome such shortcomings based on the principle of testing by betting. We exemplify our broad framework using bets inspired by kernelized dependence measures such as the Hilbert-Schmidt independence criterion (HSIC) and the constrained-covariance criterion (COCO). Importantly, we also generalize the framework to non-i.i.d. time-varying settings, for which there exist no batch tests. We demonstrate the power of our approaches on both simulated and real data.
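
The betting mechanics can be illustrated with the following toy Python sketch (a simplification; the exact HSIC- and COCO-inspired bets in the paper are more refined). It bets on a kernel witness, fit on past data only, evaluated on a block of two fresh pairs versus its Y-swapped version; under independence this payoff has zero conditional mean, so the wealth is a nonnegative martingale and rejecting when it exceeds 1/alpha controls the false alarm rate. The data-generating process, kernel bandwidth, and betting fraction below are illustrative assumptions.

```python
# Sketch of a sequential independence test by betting (illustrative only).
import numpy as np

rng = np.random.default_rng(1)
alpha, wealth, lam = 0.05, 1.0, 0.5
past_x, past_y = [], []

def k(a, b, bw=1.0):
    """Gaussian kernel on scalars (broadcasts over arrays)."""
    return np.exp(-0.5 * (a - b) ** 2 / bw ** 2)

def witness(x, y):
    """HSIC-style witness fit on past data: joint embedding minus product of marginals."""
    X, Y = np.array(past_x), np.array(past_y)
    joint = np.mean(k(x, X) * k(y, Y))
    prod = np.mean(k(x, X)) * np.mean(k(y, Y))
    return joint - prod

for t in range(1, 2001):
    # A dependent stream: Y is a noisy function of X.
    x1, x2 = rng.normal(size=2)
    y1, y2 = 0.6 * x1 + 0.8 * rng.normal(), 0.6 * x2 + 0.8 * rng.normal()

    if len(past_x) >= 20:
        # Antisymmetric under swapping y1 and y2, hence zero conditional mean under H0.
        g = witness(x1, y1) + witness(x2, y2) - witness(x1, y2) - witness(x2, y1)
        wealth *= 1.0 + lam * np.tanh(g)      # tanh keeps the payoff in (-1, 1)
        if wealth >= 1.0 / alpha:             # Ville's inequality => level-alpha test
            print(f"Reject independence at block {t}, wealth = {wealth:.1f}")
            break

    # Add the block to the past only after betting, keeping the witness predictable.
    past_x += [x1, x2]
    past_y += [y1, y2]
```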

Tracking the risk of a deployed model and detecting harmful distribution shifts

When deployed in the real world, machine learning models inevitably encounter changes in the data distribution, and certain – but not all – distribution shifts could result in significant performance degradation. In practice, it may make sense to ignore benign shifts, under which the performance of a deployed model does not degrade substantially, making interventions by a human expert (or model retraining) unnecessary. While several works have developed tests for distribution shifts, these typically either use non-sequential methods, or detect arbitrary shifts (benign or harmful), or both. We argue that a sensible method for firing off a warning has to both (a) detect harmful shifts while ignoring benign ones, and (b) allow continuous monitoring of model performance without increasing the false alarm rate. In this work, we design simple sequential tools for testing if the difference between source (training) and target (test) distributions leads to a significant drop in a risk function of interest, like accuracy or calibration. Recent advances in constructing time-uniform confidence sequences allow efficient aggregation of statistical evidence accumulated during the tracking process. The designed framework is applicable in settings where (some) true labels are revealed after the prediction is performed, or when batches of labels become available in a delayed fashion. We demonstrate the efficacy of the proposed framework through an extensive empirical study on a collection of simulated and real datasets.
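
Below is a bare-bones Python sketch of the monitoring idea, using a simple fixed-parameter Hoeffding-style time-uniform boundary rather than the tighter confidence sequences used in the paper: track the running loss of the deployed model on the target stream and raise an alarm only when a time-uniform lower confidence bound on the target risk exceeds the source risk plus a tolerance. The source risk, tolerance, shift time, and error rates below are made up for illustration.

```python
# Sketch of sequentially flagging a harmful distribution shift (illustrative).
# Under the "no harmful shift" null (risk never exceeds source_risk + epsilon),
# a fixed-lambda Hoeffding supermartingale plus Ville's inequality bounds the
# probability of ever alarming by alpha.
import numpy as np

rng = np.random.default_rng(2)
alpha = 0.05            # false-alarm probability over the whole monitoring horizon
source_risk = 0.10      # risk estimated on source (training) data
epsilon = 0.05          # tolerated degradation before a shift counts as harmful
lam = 0.2               # fixed tuning parameter of the time-uniform boundary

loss_sum = 0.0
for t in range(1, 10001):
    # Each revealed label yields a bounded loss in [0, 1]; here the error
    # rate degrades from 0.10 to 0.30 after time 2000 (a harmful shift).
    p_err = 0.10 if t <= 2000 else 0.30
    loss_sum += rng.binomial(1, p_err)

    # Time-uniform margin: P(exists t: running mean - true risk >= margin_t) <= alpha.
    margin = np.log(1.0 / alpha) / (lam * t) + lam / 8.0
    risk_lcb = loss_sum / t - margin          # lower confidence bound on target risk

    if risk_lcb > source_risk + epsilon:
        print(f"Harmful shift flagged at t = {t} (risk LCB = {risk_lcb:.3f})")
        break
```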

Contact

  • alexpodkopaev94 AT gmail DOT com