Model Selection for Optimal Regression Learning

Source: 09-22

Time: Fri., 4:00-5:00 pm, Sept. 23, 2022

Venue: Lecture Hall, Floor 3, Jin Chun Yuan West Bldg. (近春园西楼三层报告厅); Zoom ID: 271 534 5558; PW: YMSC

Speaker: Prof. Yuhong Yang (University of Minnesota)

In statistical learning, various mathematical optimalities are used to characterize the performance of different learning methods. They include minimax optimality from a worst-case standpoint and asymptotic efficiency from a rosy view that the regression function to be learned sits there to be discovered. When multiple models, e.g., trees, neural networks, and support vector machines, are considered as possible candidates to describe the unknown regression function behind the data at hand, one hopes to develop a model selection method that automatically achieves the optimal performance offered by the candidate models, as if one knew the best model to begin with. Fundamental questions include: 1. How should one conduct model selection to achieve such adaptive optimality? 2. Can different optimalities be attained simultaneously by a powerful learning procedure?


In this talk, I will give a glimpse of some foundational theories on model selection for optimal regression learning. First, we will understand why AIC-type model selection criteria lead to adaptive minimax optimal estimation. Second, we will provide insights into whether hallmark theoretical properties of different model selection methods guided by different principles can or cannot be integrated in any new “super” selection criterion. Third, we will examine arguably the most widely used model selection method in statistical and machine learning applications, namely, cross-validation (CV). In particular, we will illustrate the puzzling cross-validation paradox, address a couple of widespread deceptive misconceptions, and present a new electoral college cross-validation approach for more reliable and trustworthy learning.


