
Optimization Methods for Machine Learning

Source: 04-15

Time: 2024-04-16 ~ 2024-06-16, Tue & Thu 19:10-21:35

Venue: A3-2-303; Zoom: 435 529 7909; Password: BIMSA

Lecturer: Yi-Shuai Niu (Associate Professor)

Introduction

Stochastic Gradient Descent (SGD), in one form or another, serves as the workhorse method for training modern machine learning models. The family of SGD variants is extensive and rapidly growing, making it challenging for practitioners and even experts to survey its landscape. This course offers a mathematically rigorous and comprehensive introduction to the field, drawing on the most recent advances and insights. It carefully constructs a theory of convergence and complexity for SGD's serial, parallel, and distributed variants in strongly convex, convex, and nonconvex settings, incorporating randomness from subsampling, compression, and other sources.
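For orientation (an illustrative sketch, not part of the course materials), the basic SGD update for a finite-sum objective f(x) = (1/n) * sum_i f_i(x) can be written in a few lines of Python; the interface grad_i(x, i), returning the gradient of the i-th component at x, is an assumption made here for concreteness:

    import numpy as np

    def sgd(grad_i, x0, n, lr=0.01, epochs=10, seed=0):
        """Minimal SGD sketch for f(x) = (1/n) * sum_i f_i(x).

        grad_i(x, i) is assumed to return the gradient of f_i at x.
        """
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float).copy()
        for _ in range(epochs):
            for i in rng.permutation(n):   # one pass over shuffled data
                x -= lr * grad_i(x, i)     # step along a stochastic gradient
        return x

    # Usage sketch: least squares with f_i(x) = 0.5 * (A[i] @ x - b[i]) ** 2
    rng = np.random.default_rng(1)
    A, b = rng.normal(size=(100, 5)), rng.normal(size=100)
    x_hat = sgd(lambda x, i: (A[i] @ x - b[i]) * A[i], np.zeros(5), n=100)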

The curriculum also covers advanced techniques such as acceleration through Polyak momentum or Nesterov extrapolation. A notable portion of the course is dedicated to a unified analysis of a large family of SGD variants that have historically demanded distinct intuitions, convergence analyses, and applications, evolving separately across various communities. This framework covers, among other techniques: variance reduction, data sampling, coordinate sampling, arbitrary sampling, importance sampling, mini-batching, quantization, sketching, dithering, and sparsification, as well as their combinations. This comprehensive exploration aims to equip learners with a deep understanding of SGD's intricate landscape and the ability to apply and build upon these methods in their own work.
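To make one of these techniques concrete, here is a hedged Python sketch of SVRG-style variance reduction, one representative of the variance-reduction family named above (the interface grad_i(x, i) and all default parameters are assumptions for illustration, not the course's notation):

    import numpy as np

    def svrg(grad_i, x0, n, lr=0.05, outer=20, inner=None, seed=0):
        """SVRG-style variance reduction sketch for f(x) = (1/n) * sum_i f_i(x)."""
        rng = np.random.default_rng(seed)
        inner = inner if inner is not None else 2 * n
        x = np.asarray(x0, dtype=float).copy()
        for _ in range(outer):
            x_snap = x.copy()
            # Full gradient at the snapshot; reused to correct stochastic steps.
            mu = np.mean([grad_i(x_snap, i) for i in range(n)], axis=0)
            for _ in range(inner):
                i = rng.integers(n)
                # Unbiased gradient estimate whose variance shrinks near x_snap.
                g = grad_i(x, i) - grad_i(x_snap, i) + mu
                x -= lr * g
        return x

Since the expectation of grad_i(x_snap, i) over a uniformly random i equals the full gradient mu, the estimate g remains unbiased while its variance shrinks as x approaches x_snap; this is what permits constant step sizes and, for strongly convex objectives, linear convergence.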


Lecturer Intro

Yi-Shuai Niu is a tenured Associate Professor of Mathematics at the Beijing Institute of Mathematical Sciences and Applications (BIMSA), specializing in optimization, scientific computing, machine learning, and computer science. Before joining BIMSA in October 2023, he was a research fellow at the Hong Kong Polytechnic University (2021-2022) and an associate professor at Shanghai Jiao Tong University (2014-2021), where he led the “Optimization and Interdisciplinary Research Group” and held a joint appointment at the ParisTech Elite Institute of Technology and the School of Mathematical Sciences. His earlier roles include a postdoc at the University of Paris 6 (2013-2014) and junior researcher positions at the French National Center for Scientific Research (CNRS) and Stanford University (2010-2012). He was also a lecturer at the National Institute of Applied Sciences (INSA) of Rouen, France (2007-2010), where he earned a Ph.D. in Mathematics-Optimization in 2010 and a double Master's degree in Pure and Applied Mathematics and in Mathematical Engineering (Génie Mathématique) in 2006.

His research covers a wide range of applied mathematics, with a spotlight on optimization theory, machine learning, high-performance computing, and software development. His work spans interdisciplinary applications including machine learning, natural language processing, self-driving cars, finance, image processing, turbulent combustion, polymer science, quantum chemistry and computing, and plasma physics. His contributions encompass fundamental research, emphasizing novel algorithms for large-scale nonconvex and nonsmooth problems, and practical implementations, focusing on efficient optimization solvers and scientific computing packages built with high-performance computing techniques.

He has developed more than 33 pieces of software and published about 30 articles in prestigious journals and conference proceedings (including the Journal of Scientific Computing, Combustion and Flame, and Applied Mathematics and Computation). He has been PI of 5 research grants and a member of 5 joint international research projects. His honors include the Shanghai Teaching Achievement Award (First Prize, 2017), two outstanding teaching awards (First Prize) at Shanghai Jiao Tong University in 2016 and 2017, and 17 awards in the international mathematics contests MCM/ICM (including the INFORMS best paper award in 2017).

