
Theory and Algorithms in Deep Learning: From a Numerical Analysis Perspective

Time: Tues. 9:50-11:25 am / Wed. 13:30-15:05, April 15-June 4, 2025

Venue: C654, Shuangqing Complex Building A

Speaker: Juncai He 何俊材


Course Description

This course will systematically explore and analyze key theories and algorithms in deep learning from a numerical analysis perspective. Traditionally, the foundational theory of deep learning is largely concerned with approximation and generalization error estimates for different types of neural networks. We will interpret and study these aspects through a series of more fundamental results, particularly results on the expressivity of neural networks.

We will begin by presenting and proving a series of results on the connections between classical function classes in numerical analysis and deep neural networks with ReLU and ReLU$^k$ activation functions. Building on these foundational results, we will decompose various error estimates into more fundamental components related to the representational and interpolation capabilities of neural networks.
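As a simple illustration of this kind of connection (a standard one-dimensional observation, added here for concreteness rather than taken from the course materials), the nodal basis ("hat") function of linear finite elements on the nodes $0$, $\tfrac{1}{2}$, $1$ is exactly a one-hidden-layer ReLU network with three neurons:

$$\varphi(x) = 2\,\mathrm{ReLU}(x) - 4\,\mathrm{ReLU}\!\left(x - \tfrac{1}{2}\right) + 2\,\mathrm{ReLU}(x - 1).$$

Since every continuous piecewise linear function on a one-dimensional grid is a linear combination of such hat functions (plus boundary terms), shallow ReLU networks reproduce one-dimensional linear finite element spaces exactly.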

We will analyze models, algorithms, and network architectures in deep learning through the lens of multilevel algorithms, a well-established framework in numerical partial differential equations, most notably multigrid methods. If time permits, we will also discuss some fundamental ideas, properties, and theoretical aspects of the Transformer architectures used in large language models.
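To make the multigrid analogy concrete, here is a minimal NumPy sketch (an illustration under our own assumptions, not course material) of a two-grid cycle for the 1D Poisson equation. Its three building blocks, smoothing, restriction, and prolongation, are the numerical-analysis counterparts of the feature extraction, pooling, and upsampling operations that appear in multilevel interpretations of convolutional networks such as MgNet.

```python
import numpy as np

def poisson_matrix(n):
    # Finite difference matrix for -u'' = f on (0, 1) with zero
    # Dirichlet boundary values and n interior grid points.
    h = 1.0 / (n + 1)
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    return A, h

def weighted_jacobi(A, u, f, sweeps=3, omega=2.0 / 3.0):
    # Smoother: a few cheap iterations that damp high-frequency error.
    d = np.diag(A)
    for _ in range(sweeps):
        u = u + omega * (f - A @ u) / d
    return u

def restrict(r):
    # Full-weighting restriction: n fine points -> (n - 1) // 2 coarse points.
    return 0.25 * (r[0:-2:2] + 2.0 * r[1:-1:2] + r[2::2])

def prolong(e):
    # Linear interpolation: coarse grid -> fine grid with 2 * len(e) + 1 points.
    u = np.zeros(2 * len(e) + 1)
    u[1::2] = e                              # coarse values carry over
    u[2:-1:2] = 0.5 * (e[:-1] + e[1:])       # interior midpoints
    u[0], u[-1] = 0.5 * e[0], 0.5 * e[-1]    # midpoints next to the boundary
    return u

def two_grid(A, f, u, n):
    # One cycle: pre-smooth, coarse-grid correction, post-smooth.
    u = weighted_jacobi(A, u, f)
    r_c = restrict(f - A @ u)                # smoothed residual fits on the coarse grid
    A_c, _ = poisson_matrix((n - 1) // 2)
    e_c = np.linalg.solve(A_c, r_c)          # exact coarse solve; recursing here gives a V-cycle
    u = u + prolong(e_c)
    return weighted_jacobi(A, u, f)

n = 63
A, h = poisson_matrix(n)
x = np.linspace(h, 1.0 - h, n)
f = np.pi**2 * np.sin(np.pi * x)             # exact solution: u(x) = sin(pi * x)
u = np.zeros(n)
for k in range(8):
    u = two_grid(A, f, u, n)
    print(f"cycle {k}: residual norm = {np.linalg.norm(f - A @ u):.3e}")
```

Each cycle reduces the error by a roughly constant factor independent of the grid size, which is the hallmark of multigrid efficiency and the property that motivates multilevel readings of deep network architectures.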

Prerequisites:

Calculus, Linear Algebra, Basics of Numerical Analysis and Differential Equations

References:

1. Anthony, Martin, and Peter L. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 2009.

2. Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.

3. Papers by the instructor on this topic.

Target Audience:

Undergraduate and Graduate Students

About the Speaker

I am currently an Assistant Professor at the Yau Mathematical Sciences Center at Tsinghua University. Before that, I was a research scientist in the Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division at King Abdullah University of Science and Technology (KAUST).

I received a B.S. degree in Mathematics and Applied Mathematics from Sichuan University in 2014. In the summer of 2019, I received my Ph.D. degree in Computational Mathematics under the supervision of Prof. Jinchao Xu and Prof. Jun Hu at Peking University in Beijing, China. From 2019 to 2020, I worked as a postdoctoral scholar supervised by Prof. Jinchao Xu in the Department of Mathematics at The Pennsylvania State University, University Park. From 2020 to 2022, I was an R.H. Bing Postdoctoral Fellow working with Prof. Richard Tsai and Prof. Rachel Ward in the Department of Mathematics at The University of Texas at Austin.

Research:

· Deep Learning, Stochastic Optimization.

· Numerical Analysis, Finite Element Methods, Multigrid Methods.

Date: April 9, 2025