Guang Cheng (程光)

Presidential Chair Professor

Education

Ph.D., University of Wisconsin-Madison

Bachelor's degree, Tsinghua University

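Research Areas
Deep learning theory: theoretically inspired algorithms and theoretical understanding of deep learning phenomena; trustworthy AI: interpretability, privacy, causality, and robustness; statistical machine learning for big or high-dimensional data
Email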
chengguang@cuhk.edu.cn
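Biography

Professor Cheng joined the School of Data Science at The Chinese University of Hong Kong, Shenzhen, in June 2020 as Presidential Chair Professor. He received a bachelor's degree in economics from Tsinghua University in 2002 and later earned a Ph.D. from the University of Wisconsin-Madison. Before joining the School of Data Science, he had been on the faculty of the Department of Statistics at Purdue University since 2008, where he was named a University Faculty Scholar and received the College of Science Professional Achievement Award.

Professor Cheng's research interests include big data, machine learning, deep and reinforcement learning, semi-nonparametric inference, and high-dimensional statistical inference. His work has appeared in leading conferences and journals, including the International Conference on Machine Learning (ICML), IEEE Transactions on Pattern Analysis and Machine Intelligence, and Biogeochemistry. He also serves as an editor for several well-known academic journals, such as Statistical Analysis and Data Mining, the Journal of the American Statistical Association, the Journal of the Korean Statistical Society, and the Canadian Journal of Statistics.

Professor Cheng is a member of the Institute for Advanced Study in Princeton and of the International Statistical Institute. He has received multiple research grants from institutions and companies including the U.S. National Science Foundation, the U.S. Naval Laboratory, Adobe, and the Simons Foundation. His honors include the National Science Foundation CAREER Award and the Noether Young Scholar Award.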

Publications

*: former/current PhD Student; **: former/current Postdoc

  1. Liu*, Yang, Shang** and Cheng (2021) Nonparametric Testing under Random Projection, IEEE Transactions on Pattern Analysis and Machine Intelligence
  2. Li*, Wang* and Cheng (2021) Online Forgetting Process for Linear Regression Models, AISTATS
  3. Xing*, Song and Cheng (2021) On the Generalization Properties of Adversarial Training, AISTATS
  4. Hu*, Wang, Lin and Cheng (2021) Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network, AISTATS
  5. Xing*, Zhang and Cheng (2021) Adversarially Robust Estimate and Risk Analysis in Linear Regression, AISTATS
  6. Xing*, Song and Cheng (2021) Predictive Power of Nearest Neighbors Algorithm under Random Perturbation, AISTATS
  7. Chen, Wan, Cai and Cheng (2020) Machine Learning in/for Blockchain: Future and Challenges, Canadian Journal of Statistics
  8. Chao**, Wang*, Xing* and Cheng (2020) Directional Pruning of Deep Neural Networks, NeurIPS
  9. Bai*, Song and Cheng (2020) Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee, NeurIPS
  10. Duan*, Qiao and Cheng (2020) Statistical Guarantees of Distributed Nearest Neighbor Classification, NeurIPS
  11. Li*, Wang*, Zhang and Cheng (2020) Variance Reduction on Adaptive Stochastic Mirror Descent, NeurIPS OPT Workshop
  12. Guo and Cheng (2020) Moderate-Dimensional Inferences on Quadratic Functionals in Ordinary Least Squares, JASA-T&M
  13. Yu*, Chao** and Cheng (2020) Simultaneous Inference for Massive Data: Distributed Bootstrap, ICML
  14. Cheng*, Qiao** and Cheng (2020) Mutual Transfer Learning for Massive Data, ICML
  15. Yang, Shang** and Cheng (2020) Non-asymptotic Theory for Nonparametric Testing, COLT
  16. Zheng** and Cheng (2020) Finite Time Analysis of Vector Autoregressive Models under Linear Restrictions, Biometrika
  17. Hao*, Zhang and Cheng (2020) Sparse and Low-rank Tensor Estimation via Cubic Sketchings, IEEE-Information Theory, a short version published in AISTATS.
  18. Wang* and Cheng (2020) Online Batch Decision-Making with High-Dimensional Covariates, AISTATS
  19. Y. Zheng∗∗ and G. Cheng (2019). Finite Time Analysis of Vector Autoregressive Models under Linear Restrictions. Biometrika. Invited Revision.
  20. X. Guo and G. Cheng (2019). Moderate-Dimensional Inferences on Quadratic Functionals in Ordinary Least Squares. Journal of American Statistical Association – T&M. Invited Revision.
  21. B. Hao∗, A. Zhang and G. Cheng (2019). Sparse and Low-rank Tensor Estimation via Cubic Sketchings. IEEE – Information Theory. Invited Revision.
  22. B. Hao∗, Y. Abbasi-Yadkori, Z. Wen and G. Cheng (2019). Bootstrapping Upper Confidence Bound. NeurIPS. To Appear.
  23. X. Qiao, J. Duan∗, and G. Cheng (2019). Rates of Convergence for Large-scale Nearest Neighbor Classification. NeurIPS. To Appear.
  24. Q. Song and G. Cheng (2019). Bayesian Fusion Estimation via t-Shrinkage. Sankhya A. Invited Article. To Appear.
  25. X. Lyu, W.W. Sun∗, Z. Wang, H. Liu, J. Yang and G. Cheng (2019). Sparse Tensor Graphical Model: Non-convex Optimization and Statistical Inference. IEEE Transactions on Pattern Analysis and Machine Intelligence. To Appear.
  26. G. Xu, Z. Shang∗∗ and G. Cheng (2019). Distributed Generalized Cross-Validation for Divide-and-Conquer Kernel Ridge Regression and its Asymptotic Optimality. Journal of Computational and Graphical Statistics. To Appear.
  27. Z. Shang∗∗, B. Hao∗, and G. Cheng (2019). Nonparametric Bayesian Aggregation for Massive Data. Journal of Machine Learning Research. 20 (140), 1-81.
  28. M. Liu∗, Z. Shang∗∗ and G. Cheng (2019). Sharp Theoretical Analysis for Nonparametric Testing under Random Projection. COLT. 99, 2175-2209.
  29. Y. Zhu, Z. Yu∗ and G. Cheng (2019). High Dimensional Inference in Partially Linear Models. AISTATS. 89, 2760-2769.
  30. S. Volgushev, S.-K. Chao∗∗ and G. Cheng (2019). Distributed Inference for Quantile Regression Processes. Annals of Statistics. 47, 1634-1662.
  31. Z. Yu∗, M. Levine, and G. Cheng (2019). Minimax Optimal Estimation in High Dimensional Partially Linear Additive Models. Bernoulli. 25, 1289-1325.
  32. M. Liu∗ and G. Cheng (2018). Early Stopping for Nonparametric Testing. NeurIPS. 32, 3989-3998.
  33. G. Xu, Z. Shang∗∗ and G. Cheng (2018). Optimal Tuning for Divide-and-Conquer Kernel Ridge Regression with Massive Data. ICML (ORAL). 80, 5483-5491.
  34. B. Hao∗, W. Sun∗, Y. Liu and G. Cheng (2018). Simultaneous Clustering and Estimation of Heterogeneous Graphical Models. Journal of Machine Learning Research. 18 (217), 1-58.
  35. M. Liu∗, J. Honorio and G. Cheng (2018). Statistically and Computationally Efficient Variance Estimator for Kernel Ridge Regression. 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton). 56, 1005-1011.
  36. Z. Shang∗∗ and G. Cheng (2018). Gaussian Approximation of General Nonparametric Posterior Distribution. Information and Inference. 7, 509-529.
  37. X. Zhang and G. Cheng (2018). Gaussian Approximation for High Dimensional Vector under Physical Dependence. Bernoulli. 24, 2640-2675.
  38. Q. Li, G. Cheng, J. Fan and Y. Wang (2018). Embracing the Blessing of Dimensionality in Factor Models. Journal of American Statistical Association – T&M. 113, 380-389.
  39. W. Sun∗, G. Cheng and Y. Liu (2018). Stability Enhanced Large-Margin Classifier Selection. Statistica Sinica. 28, 1-25.
  40. Z. Shang∗∗ and G. Cheng (2017). Computational Limits of a Distributed Algorithm for Smoothing Spline. Journal of Machine Learning Research. 18, 1-37.
  41. S.-K. Chao∗∗, S. Volgushev and G. Cheng (2017). Quantile Processes for Semi and Nonparametric Regression. Electronic Journal of Statistics. 11, 3272-3331.
  42. T. Zhao, G. Cheng and H. Liu (2016). A Partially Linear Framework for Massive Heterogeneous Data. Annals of Statistics. 44, 1400-1437.
  43. S. Minsker, Y. Zhao and G. Cheng (2016). Active Clinical Trials for Personalized Medicine. Journal of American Statistical Association – T&M. 111, 875-887.
  44. W. Sun∗, X. Qiao and G. Cheng (2016). Stabilized Nearest Neighbor Classifier and Its Statistical Properties. Journal of American Statistical Association – T&M. 111, 1254-1265.
  45. X. Zhang and G. Cheng (2016). Simultaneous Inference for High-dimensional Linear Models. Journal of American Statistical Association – T&M. 112, 757-768.
  46. W. Sun∗, J. Lu, H. Liu and G. Cheng (2016). Provable Sparse Tensor Decomposition. Journal of Royal Statistical Society – B. 79, 899-916.
  47. G. Cheng and Z. Shang∗∗ (2015). Joint Asymptotics for Semi-Nonparametric Regression Models with Partially Linear Structure. Annals of Statistics. 43, 1351-1390.
  48. W. Sun∗, Z. Wang, H. Liu and G. Cheng (2015). Non-convex Statistical Optimization for Sparse Tensor Graphical Model. NIPS. 28, 1081-1089.
  49. D. Pati, A. Bhattacharya and G. Cheng (2015). Optimal Bayesian Estimation in Random Covariate Design with a Rescaled Gaussian Process Prior. Journal of Machine Learning Research. 16, 2837-2851.
  50. G. Cheng (2015). Moment Consistency of the Exchangeably Weighted Bootstrap for Semiparametric M-Estimation. Scandinavian Journal of Statistics. 42, 665-684.
  51. Z. Shang∗∗ and G. Cheng (2015). Nonparametric Inference in Generalized Functional Linear Models. Annals of Statistics. 43, 1742-1773.
  52. G. Cheng, H. H. Zhang and Z. Shang∗∗ (2015). Sparse and Efficient Estimation for Partial Spline Models with Increasing Dimension. Annals of Institute of Statistical Mathematics. 67, 93-127.
  53. G. Cheng, L. Zhou and J. Z. Huang (2014). Efficient Semiparametric Estimation in Generalized Partially Linear Additive Models for Longitudinal/Clustered Data. Bernoulli. 20, 141-163.
  54. G. Cheng, L. Zhou, X. Chen and J. Z. Huang (2014). Efficient Estimation of Semiparametric Copula Models for Bivariate Survival Data. Journal of Multivariate Analysis. 123, 330-344.
  55. Z. Shang∗∗ and G. Cheng (2013). Local and Global Asymptotic Inference in Smoothing Spline Models. Annals of Statistics. 41, 2608-2638.
  56. G. Cheng, Z. Yu∗ and J. Z. Huang (2013). The Cluster Bootstrap Consistency in Generalized Estimating Equations. Journal of Multivariate Analysis. 115, 33-47.
  57. G. Cheng (2013). How Many Iterations Are Sufficient for Efficient Semiparametric Estimation? Scandinavian Journal of Statistics. 40, 592-618.
  58. G. Cheng, Y. Zhao and B. Li (2012). Empirical Likelihood Inferences for Semiparametric Additive Isotonic Regression. Journal of Multivariate Analysis. 112, 172-182.
  59. P. Du, G. Cheng and H. Liang (2012). Semiparametric Regression Models with Additive Nonparametric Components and High Dimensional Parametric Components. Computational Statistics and Data Analysis. 56, 2006-2017.
  60. H. H. Zhang, G. Cheng and Y. Liu (2011). Linear or Nonlinear? Automatic Discovery for Partially Linear Models. Journal of American Statistical Association – Theory & Methods. 106, 1099-1112.
  61. C. Liang, G. Cheng, D. Wixon and T. Balser (2011). An Absorbing Markov Chain Approach to Understanding the Microbial Role in Soil Carbon Stabilization. Biogeochemistry. 106, 303-309.
  62. G. Cheng and X. Wang (2011). Semiparametric Additive Transformation Models under Current Status Data. Electronic Journal of Statistics. 5, 1735-1764.
  63. G. Cheng and J. Z. Huang (2010). Bootstrap Consistency for General Semiparametric M-estimate. Annals of Statistics. 38, 2884-2915.
  64. G. Cheng (2009). Semiparametric Additive Isotonic Regression. Journal of Statistical Planning and Inference. 139, 1980-1991.
  65. G. Cheng and M. R. Kosorok (2009). The Penalized Profile Sampler. Journal of Multivariate Analysis. 100, 345-362.
  66. G. Cheng and M. R. Kosorok (2008b). General Frequentist Properties of the Posterior Profile Distribution. Annals of Statistics. 36, 1819-1853.
  67. G. Cheng and M. R. Kosorok (2008a). Higher Order Semiparametric Frequentist Inference with the Profile Sampler. Annals of Statistics. 36, 1786-1818.