[Data Science Distinguished Lecture Series] A Framework for Statistical Inference via Randomized Algorithms & Testing Significant Dependencies in High Dimensions via Maxima of U-statistics & On Eigenvalue Distributions of Large Autocovariance Matrices
Topics: 1. A Framework for Statistical Inference via Randomized Algorithms
2. Testing Significant Dependencies in High Dimensions via Maxima of U-statistics
3. On Eigenvalue Distributions of Large Autocovariance Matrices
Speakers: 1. Edgar Dobriban, Associate Professor, University of Pennsylvania
2. Johannes Heiny, Associate Professor, Stockholm University
3. Wangjun Yuan, Postdoc Researcher, University of Luxembourg
Host: Jeff J YAO, Presidential Chair Professor & Statistics Area Coordinator, School of Data Science, CUHK-Shenzhen
Date: 3 July (Monday), 2023
Time: 9:00 AM to 12:00 PM, Beijing Time
Format: Hybrid
Venue: 103 Meeting Room, Daoyuan Building
Zoom link: https://cuhk-edu-cn.zoom.us/j/5304767369?pwd=aFErUGFSSDlLNWJld0VNNmpTL0k0UT09
Zoom meeting ID: 5304767369
Passcode: 852648
Language: English
Abstracts:
1. Large datasets are now common in many areas, and their analysis poses major challenges. Randomized algorithms, such as randomized sketching or projections, and subsampling, are a promising approach to ease the computational burden. However, randomized algorithms also produce non-deterministic outputs, leading to the problem of quantifying their accuracy. In this paper, we develop a statistical inference framework for uncertainty quantification via randomized methods. Within this framework we develop methods for inference either by using multiple runs of the same randomized algorithm (similar to the use of subsampling), or by estimating the unknown parameters of the limiting distribution (akin to pivotal inference). As an example, we develop methods for statistical inference in the fundamental problem of least squares regression. These methods rest on characterizing the asymptotic distribution of estimators obtained via random projections. Our analysis is inspired by, and further develops, techniques from random matrix theory. The results are supported by a broad range of simulations. This is joint work with Zhixiang Zhang and Sokbae Lee.
2. We consider the problem of statistical testing for high-dimensional data. Independence tests, for example, often rely on limit theorems for empirical versions of certain dependence coefficients such as sample covariances, sample correlations, Spearman’s rho or Kendall’s tau. These four dependence coefficients are zero in the case of independence. In this talk, we take a different point of view and work under the null hypothesis that the dependence coefficients do not exceed a small fixed threshold. We propose a max-type test that is asymptotically consistent and analyze its behavior under local alternatives. The talk is based on joint work with Patrick Bastian and Holger Dette. Reference for this talk: https://arxiv.org/pdf/2210.17439
3. In this talk, we establish a limiting distribution for eigenvalues of a class of autocovariance matrices. The same distribution has been found in the literature for a regularized version of these autocovariance matrices. The original nonregularized autocovariance matrices are noninvertible, which introduces additional difficulties for the study of their eigenvalues through Girko’s Hermitization scheme. The key result in this paper is a new polynomial lower bound for a specific family of least singular values associated with a rank-defective quadratic function of a random matrix with independent and identically distributed entries. Another innovation of the paper is that the lag of the autocovariance matrices can grow to infinity with the matrix dimension.
Speaker biographies:
1. Edgar Dobriban is an associate professor in the Department of Statistics and Data Science at the University of Pennsylvania, with a secondary appointment in Computer and Information Science. He obtained a PhD in statistics from Stanford University in 2017. His research interests include the statistical analysis of large datasets, and the foundations of machine learning. He has received a Committee of Presidents of Statistical Societies (COPSS) Emerging Leader Award, a Sloan Research Fellowship in Mathematics, a Bernoulli Society New Researcher Award, a Junior Research Award from the International Chinese Statistical Association, and a U.S. National Science Foundation CAREER award.
2. Johannes Heiny is an Associate Professor in Mathematical Statistics at the Department of Mathematics, Stockholm University. His research interests lie in random matrix theory, high-dimensional statistics, extreme value theory and data science. In particular, his research aims at describing the dependence structure of large data sets, which is fundamental for statistical inference and prediction in high dimensions, where the sample size is of the order of the data dimension. Before joining Stockholm University, Johannes Heiny held postdoc positions at Ruhr University Bochum and Aarhus University, and he obtained a PhD degree from the University of Copenhagen.
3. Dr. Wangjun Yuan is a postdoc researcher at the University of Luxembourg working with Mark Podolskij. Before joining the University of Luxembourg, he was a postdoctoral fellow at the University of Ottawa, supported by Raluca Balan, from 2021 to 2022. He received a bachelor's degree in mathematics from the University of Science and Technology of China in 2017, and then a Ph.D. degree in mathematics from The University of Hong Kong in 2021 under the supervision of Jian Song, Guangyue Han and Jianfeng Yao. His research interests are mainly in random matrix theory, stochastic differential equations, stochastic partial differential equations and Malliavin calculus. He has published papers in leading international journals such as Transactions of the AMS, AAP, EJP, and SPA.


