Speaker
Juncai He 何俊材
Time
Tues. 9:50-11:25
Wed. 13:30-15:05
April 15-June 4, 2025
Venue
C654
Shuangqing Complex Building A
Course description
This course will systematically explore and analyze key theories and algorithms in deep learning from a numerical analysis perspective. Traditionally, the foundational theory of deep learning has largely concerned approximation and generalization error estimates for different types of neural networks. We will interpret and study these aspects through a series of more fundamental results, particularly on the expressivity of neural networks.
We will begin by presenting and proving a series of results on the connections between classical function classes in numerical analysis and deep neural networks with ReLU and ReLU$^k$ activation functions. Building on these foundational results, we will decompose various error estimates into more fundamental components related to the representational and interpolation capabilities of neural networks.
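As a first taste of this connection (a standard one-dimensional illustration; the notation here is chosen for this sketch rather than taken from the lectures), the hat basis function of linear finite elements on $[0,1]$ is represented exactly by a one-hidden-layer ReLU network:
\[
\phi(x) \;=\; 2\,\mathrm{ReLU}(x) \;-\; 4\,\mathrm{ReLU}\!\left(x - \tfrac{1}{2}\right) \;+\; 2\,\mathrm{ReLU}(x - 1),
\qquad \mathrm{ReLU}(t) = \max(0, t),
\]
which is the piecewise linear function with $\phi(0) = \phi(1) = 0$ and $\phi(\tfrac{1}{2}) = 1$. In one dimension, every continuous piecewise linear function is a linear combination of such shifted and scaled ReLU terms, which is the kind of representation result the course builds on.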
We will analyze models, algorithms, and network architectures in deep learning through the lens of multilevel algorithms, a well-established framework in numerical partial differential equations, most notably multigrid methods; a structural sketch is given below. If time permits, we will also discuss some fundamental ideas, properties, and theoretical aspects of Transformers in large language models.
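To make the multigrid structure concrete, here is a minimal two-grid sketch for the 1D Poisson problem (a model problem assumed here for illustration; the function names and parameters below are hypothetical, not the course's own code):

import numpy as np

def poisson_matrix(n):
    """Tridiagonal stiffness matrix of -u'' on a uniform grid, h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    return (np.diag(2 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
            - np.diag(np.ones(n - 1), -1)) / h**2

def restrict(r):
    """Full-weighting restriction from the fine grid (n interior points, n odd)."""
    return 0.25 * (r[0:-2:2] + 2 * r[1:-1:2] + r[2::2])

def prolong(e_c, n):
    """Linear interpolation from the coarse grid back to the fine grid."""
    e = np.zeros(n)
    e[1:-1:2] = e_c          # coarse-grid points carry over
    e[0:-2:2] += 0.5 * e_c   # interpolate to left neighbors
    e[2::2] += 0.5 * e_c     # interpolate to right neighbors
    return e

def two_grid(A, f, u, n, nu=3, omega=2.0 / 3.0):
    """One two-grid cycle: pre-smooth, coarse correction, post-smooth."""
    D = np.diag(A)
    for _ in range(nu):                  # pre-smoothing (damped Jacobi)
        u = u + omega * (f - A @ u) / D
    r_c = restrict(f - A @ u)            # restrict the residual
    n_c = (n - 1) // 2
    e_c = np.linalg.solve(poisson_matrix(n_c), r_c)  # coarse-grid solve
    u = u + prolong(e_c, n)              # prolongate and correct
    for _ in range(nu):                  # post-smoothing
        u = u + omega * (f - A @ u) / D
    return u

n = 63                                   # n odd, so the coarse grid has (n-1)/2 points
A = poisson_matrix(n)
x = np.linspace(0, 1, n + 2)[1:-1]
f = np.pi**2 * np.sin(np.pi * x)         # exact solution: sin(pi x)
u = np.zeros(n)
for k in range(10):
    u = two_grid(A, f, u, n)
    print(k, np.max(np.abs(u - np.sin(np.pi * x))))

The smoothing, restriction, coarse correction, and prolongation steps here are the structural components that the course relates to operations in deep network architectures, such as downsampling and upsampling in encoder-decoder networks.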
Prerequisites:
Calculus, Linear Algebra, Basics of Numerical Analysis and Differential Equations
References:
1. Anthony, Martin, and Peter L. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 2009.
2. Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. Cambridge, MA: MIT Press, 2016.
3. Papers by the instructor on this topic.
Target Audience:
Undergraduate and Graduate Students
About the Speaker
I am currently an Assistant Professor at the Yau Mathematical Sciences Center (YMSC), Tsinghua University. Before that, I was a research scientist in the Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division at King Abdullah University of Science and Technology (KAUST).
I received a B.S. degree in Mathematics and Applied Mathematics from Sichuan University in 2014. In the summer of 2019, I received my Ph.D. degree in Computational Mathematics from Peking University in Beijing, China, under the supervision of Prof. Jinchao Xu and Prof. Jun Hu. From 2019 to 2020, I worked as a Postdoctoral Scholar supervised by Prof. Jinchao Xu in the Department of Mathematics at The Pennsylvania State University, University Park. From 2020 to 2022, I was an R.H. Bing Postdoctoral Fellow working with Prof. Richard Tsai and Prof. Rachel Ward in the Department of Mathematics at UT Austin.
Research:
· Deep Learning, Stochastic Optimization.
· Numerical Analysis, Finite Element Methods, Multigrid Methods.