Model Selection for Optimal Regression Learning

Time: Fri., Sept. 23, 2022, 4:00-5:00 pm

Venue: Lecture Hall, Floor 3, Jin Chun Yuan West Bldg. (近春园西楼三层报告厅); Zoom ID: 271 534 5558; PW: YMSC

Speaker: Prof. Yuhong Yang (University of Minnesota)

In statistical learning, various mathematical optimalities are used to characterize the performance of different learning methods. They include minimax optimality from a worst-case standpoint and asymptotic efficiency from a rosy view that the regression function to be learned sits there to be discovered. When multiple models, e.g., trees, neural networks, and support vector machines, are considered as possible candidates to describe the unknown regression function behind the data at hand, one hopes to develop a model selection method that automatically achieves the optimal performance offered by the candidate models, as if one knew the best model to begin with. Fundamental questions include: 1. How should one conduct model selection to achieve such adaptive optimality? 2. Can different optimalities be attained simultaneously by a powerful learning procedure?

In this talk, I will give a glimpse of some foundational theories on model selection for optimal regression learning. First, we will understand why AIC-type model selection criteria lead to adaptive minimax optimal estimation. Second, we will provide insight into whether hallmark theoretical properties of different model selection methods, guided by different principles, can or cannot be integrated in any new “super” selection criterion. Third, we will examine arguably the most widely used model selection method in statistical and machine learning applications, namely, cross-validation (CV). In particular, we will illustrate the puzzling cross-validation paradox, address a couple of widespread and deceptive misconceptions, and present a new electoral college cross-validation approach for more reliable and trustworthy learning.

Date: September 22, 2022
