CHENG, Guang

Presidential Chair Professor

Education Background

Ph.D. Statistics, University of Wisconsin-Madison, 2006

B.S. Economics, Tsinghua University, 2002

Research Field
Deep Learning Theory: Theory-Inspired Algorithms and Theoretical Understanding of Deep Learning Phenomena; Trustworthy AI: Interpretability, Privacy, Causality and Robustness; Statistical Machine Learning for Big Data or High Dimensional Data

Professor Guang Cheng has been a Presidential Chair Professor at The Chinese University of Hong Kong, Shenzhen since June 2020. He received his bachelor’s degree in Economics from Tsinghua University and earned his Ph.D. in Statistics at the University of Wisconsin-Madison in 2006. He is also a tenured full professor of Statistics at Purdue University. During his tenure there, he received the University Faculty Scholar award and the College of Science Professional Achievement Award.

Professor Cheng's research focuses on big data, machine learning, deep/reinforcement learning, semi-nonparametric inference and high dimensional statistical inference. His work has been published extensively in prestigious conferences and world-leading journals such as the International Conference on Machine Learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, and Biogeochemistry. Professor Cheng serves as editor or associate editor of several renowned journals, including Statistical Analysis and Data Mining, and as associate editor for JASA – T&M, Electronic Journal of Statistics, Statistica Sinica, Canadian Journal of Statistics, and Journal of Blockchain Research.

Prof. Cheng has received research grants from multiple organizations, including the National Science Foundation, the Office of Naval Research, Adobe, and the Simons Foundation. He is a member of the Institute for Advanced Study and an elected member of the International Statistical Institute. He received the Noether Young Scholar Award from the American Statistical Association and the CAREER Award from NSF – DMS.

Academic Publications

*: former/current PhD Student; **: former/current Postdoc

  1. Liu*, Yang, Shang** and Cheng (2021) Nonparametric Testing under Random Projection, IEEE Transactions on Pattern Analysis and Machine Intelligence
  2. Li*, Wang* and Cheng (2021) Online Forgetting Process for Linear Regression Models, AISTATS
  3. Xing*, Song and Cheng (2021) On the Generalization Properties of Adversarial Training, AISTATS
  4. Hu*, Wang, Lin and Cheng (2021) Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network, AISTATS
  5. Xing*, Zhang and Cheng (2021) Adversarially Robust Estimate and Risk Analysis in Linear Regression, AISTATS
  6. Xing*, Song and Cheng (2021) Predictive Power of Nearest Neighbors Algorithm under Random Perturbation, AISTATS
  7. Chen, Wan, Cai and Cheng (2020) Machine Learning in/for Blockchain: Future and Challenges, Canadian Journal of Statistics
  8. Chao**, Wang*, Xing* and Cheng (2020) Directional Pruning of Deep Neural Networks, NeurIPS
  9. Bai*, Song and Cheng (2020) Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee, NeurIPS
  10. Duan*, Qiao and Cheng (2020) Statistical Guarantees of Distributed Nearest Neighbor Classification, NeurIPS
  11. Li*, Wang*, Zhang and Cheng (2020) Variance Reduction on Adaptive Stochastic Mirror Descent, NeurIPS OPT Workshop
  12. Guo and Cheng (2020) Moderate-Dimensional Inferences on Quadratic Functionals in Ordinary Least Squares, JASA-T&M
  13. Yu*, Chao** and Cheng (2020) Simultaneous Inference for Massive Data: Distributed Bootstrap, ICML
  14. Cheng*, Qiao** and Cheng (2020) Mutual Transfer Learning for Massive Data, ICML
  15. Yang, Shang** and Cheng (2020) Non-asymptotic Theory for Nonparametric Testing, COLT
  16. Zheng** and Cheng (2020) Finite Time Analysis of Vector Autoregressive Models under Linear Restrictions, Biometrika
  17. Hao*, Zhang and Cheng (2020) Sparse and Low-rank Tensor Estimation via Cubic Sketchings, IEEE-Information Theory, a short version published in AISTATS.
  18. Wang* and Cheng (2020) Online Batch Decision-Making with High-Dimensional Covariates, AISTATS
  19. Y.  Zheng∗∗ and G. Cheng (2019). Finite Time Analysis of Vector Autoregressive Models under Linear Restrictions.  Biometrika. Invited Revision.
  20. X. Guo and G. Cheng (2019). Moderate-Dimensional Inferences on Quadratic Functionals in Ordinary Least Squares. Journal of American Statistical Association – T&M. Invited Revision.
  21. B. Hao∗, A. Zhang and G. Cheng (2019). Sparse and Low-rank Tensor Estimation via Cubic Sketchings. IEEE – Information Theory. Invited Revision.
  22. B. Hao∗, Y. Abbasi-Yadkori, Z. Wen and G. Cheng (2019). Bootstrapping Upper Confidence Bound. NeurIPS. To Appear.
  23. X. Qiao, J. Duan∗, and G. Cheng (2019). Rates of Convergence for Large-scale Nearest Neighbor Classification. NeurIPS. To Appear.
  24. Q. Song and G. Cheng (2019). Bayesian Fusion Estimation via t-Shrinkage. Sankhya A. Invited Article. To Appear.
  25. X. Lyu, W.W. Sun∗, Z. Wang, H. Liu, J. Yang and G. Cheng (2019). Sparse Tensor Graphical Model: Non-convex Optimization and Statistical Inference.  IEEE Transactions on Pattern Analysis and Machine Intelligence. To Appear.
  26. G. Xu, Z. Shang∗∗ and G. Cheng (2019). Distributed Generalized Cross-Validation for Divide-and-Conquer Kernel Ridge Regression and its Asymptotic Optimality. Journal of Computational and Graphical Statistics. To Appear.
  27. Z. Shang∗∗, B. Hao∗, and G. Cheng (2019). Nonparametric Bayesian Aggregation for Massive Data. Journal of Machine Learning Research. 20 (140), 1-81.
  28. M. Liu∗, Z. Shang∗∗ and G. Cheng (2019). Sharp Theoretical Analysis for Nonparametric Testing under Random Projection. COLT. 99, 2175-2209.
  29. Y. Zhu, Z. Yu∗ and G. Cheng (2019). High Dimensional Inference in Partially Linear Models. AISTATS. 89, 2760-2769.
  30. S. Volgushev, S.-K. Chao∗∗ and G. Cheng (2019). Distributed Inference for Quantile Regression Processes. Annals of Statistics. 47, 1634-1662.
  31. Z. Yu∗, M. Levine, and G.  Cheng (2019). Minimax Optimal Estimation in High Dimensional Partially Linear Additive Models. Bernoulli. 25, 1289-1325.
  32. M. Liu∗ and G. Cheng (2018). Early Stopping for Nonparametric Testing. NeurIPS. 32, 3989-3998.
  33. G.  Xu, Z. Shang∗∗ and G. Cheng (2018). Optimal Tuning for Divide-and-Conquer Kernel Ridge Regression with Massive Data. ICML (ORAL). 80, 5483-5491.
  34. B. Hao∗, W. Sun∗, Y. Liu and G. Cheng (2018). Simultaneous Clustering and Estimation of Heterogeneous Graphical Models. Journal of Machine Learning Research. 18 (217), 1-58.
  35. M. Liu∗, J. Honorio and G. Cheng (2018). Statistically and Computationally Efficient Variance Estimator for Kernel Ridge Regression. 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton). 56, 1005-1011.
  36. Z. Shang∗∗ and G. Cheng (2018). Gaussian Approximation of General Nonparametric Posterior Distribution. Information and Inference. 7, 509-529.
  37. X. Zhang and G. Cheng (2018). Gaussian Approximation for High Dimensional Vector under Physical Dependence. Bernoulli. 24, 2640-2675.
  38. Q. Li, G.  Cheng, J. Fan and Y. Wang (2018). Embracing the Blessing of Dimensionality in Factor Models.  Journal of American Statistical Association – T&M. 113, 380-389.
  39. W. Sun∗, G.  Cheng and Y. Liu (2018). Stability Enhanced Large-Margin Classifier Selection. Statistica Sinica. 28, 1-25.
  40. Z. Shang∗∗ and G. Cheng (2017). Computational Limits of a Distributed Algorithm for Smoothing Spline. Journal of Machine Learning Research. 18, 1-37.
  41. S.-K. Chao∗∗, S. Volgushev and G. Cheng (2017). Quantile Processes for Semi and Nonparametric Regression. Electronic Journal of Statistics. 11, 3272-3331.
  42. T. Zhao, G. Cheng and H. Liu (2016). A Partially Linear Framework for Massive Heterogeneous Data. Annals of Statistics. 44, 1400-1437.
  43. S. Minsker, Y. Zhao and G. Cheng (2016). Active Clinical Trials for Personalized Medicine. Journal of American Statistical Association – T&M. 111, 875-887.
  44. W. Sun∗, X. Qiao and G. Cheng (2016). Stabilized Nearest Neighbor Classifier and Its Statistical Properties.  Journal of American Statistical Association – T&M. 111, 1254-1265.
  45. X. Zhang and G. Cheng (2016). Simultaneous Inference for High-dimensional Linear Models. Journal of American Statistical Association – T&M. 112, 757-768.
  46. W. Sun∗, J. Lu, H. Liu and G. Cheng (2016). Provable Sparse Tensor Decomposition. Journal of Royal Statistical Society – B. 79, 899-916.
  47. G. Cheng and Z. Shang∗∗ (2015). Joint Asymptotics for Semi-Nonparametric Regression Models with Partially Linear Structure. Annals of Statistics. 43, 1351-1390.
  48. W. Sun∗, Z. Wang, H. Liu and G. Cheng (2015). Non-convex Statistical Optimization for Sparse Tensor Graphical Model. NIPS. 28, 1081-1089.
  49. D. Pati, A. Bhattacharya and G. Cheng (2015). Optimal Bayesian Estimation in Random Covariate Design with a Rescaled Gaussian Process Prior. Journal of Machine Learning Research. 16, 2837-2851.
  50. G. Cheng (2015). Moment Consistency of the Exchangeably Weighted Bootstrap for Semiparametric M-Estimation. Scandinavian Journal of Statistics. 42, 665-684.
  51. Z. Shang∗∗ and G. Cheng (2015). Nonparametric Inference in Generalized Functional Linear Models. Annals of Statistics. 43, 1742-1773.
  52. G. Cheng, H. H. Zhang and Z. Shang∗∗ (2015). Sparse and Efficient Estimation for Partial Spline Models with Increasing Dimension. Annals of Institute of Statistical Mathematics. 67, 93-127.
  53. G. Cheng, L. Zhou and J. Z. Huang (2014). Efficient Semiparametric Estimation in Generalized Partially Linear Additive Models for Longitudinal/Clustered Data. Bernoulli. 20, 141-163.
  54. G. Cheng, L. Zhou, X. Chen and J. Z. Huang (2014). Efficient Estimation of Semiparametric Copula Models for Bivariate Survival Data. Journal of Multivariate Analysis. 123, 330-344.
  55. Z. Shang∗∗ and G. Cheng (2013). Local and Global Asymptotic Inference in Smoothing Spline Models. Annals of Statistics. 41, 2608-2638.
  56. G. Cheng, Z. Yu∗ and J. Z. Huang (2013). The Cluster Bootstrap Consistency in Generalized Estimating Equations. Journal of Multivariate Analysis. 115, 33-47.
  57. G. Cheng (2013). How Many Iterations Are Sufficient for Efficient Semiparametric Estimation? Scandinavian Journal of Statistics. 40, 592-618.
  58. G.  Cheng, Y. Zhao and B. Li (2012). Empirical Likelihood Inferences for Semiparametric Additive Isotonic Regression. Journal of Multivariate Analysis. 112, 172-182.
  59. P. Du, G. Cheng and H. Liang (2012). Semiparametric Regression Models with Additive Nonparametric Components and High Dimensional Parametric Components. Computational Statistics and Data Analysis. 56, 2006-2017.
  60. H. H. Zhang, G. Cheng and Y. Liu (2011). Linear or Nonlinear? Automatic Discovery for Partially Linear Models.  Journal of American Statistical Association – Theory & Methods. 106, 1099-1112.
  61. C. Liang, G. Cheng, D. Wixon and T. Balser (2011). An Absorbing Markov Chain Approach to Understanding the Microbial Role in Soil Carbon Stabilization. Biogeochemistry. 106, 303-309.
  62. G. Cheng and X. Wang (2011). Semiparametric Additive Transformation Models under Current Status Data. Electronic Journal of Statistics. 5, 1735-1764.
  63. G. Cheng and J. Z. Huang (2010). Bootstrap Consistency for General Semiparametric M-estimate. Annals of Statistics. 38, 2884-2915.
  64. G. Cheng (2009). Semiparametric Additive Isotonic Regression. Journal of Statistical Planning and Inference. 139, 1980-1991.
  65. G. Cheng and M. R. Kosorok (2009). The Penalized Profile Sampler.  Journal of Multivariate Analysis. 100, 345-362.
  66. G. Cheng and M. R. Kosorok (2008b). General Frequentist Properties of the Posterior Profile Distribution. Annals of Statistics. 36, 1819-1853.
  67. G. Cheng and M. R. Kosorok (2008a).  Higher Order Semiparametric Frequentist Inference with the Profile Sampler. Annals of Statistics. 36, 1786-1818.