
Ten minutes for the transformer


Time: Oct. 26, 15:00-16:30

Venue: A3-4-312; Zoom: 787 662 9899 (PW: BIMSA)

Organizer: Xiaopei Jiao

Speaker: Congwei Song (BIMSA)

Abstract

The Transformer is a powerful architecture that achieves superior performance on various sequence learning tasks, including neural machine translation and language understanding. At the core of the architecture, the self-attention mechanism is a kind of kernel smoothing method, or a "local model" in the speaker's words. The whole architecture can also be seen as a sequential form of the mean-shift algorithm, a classic clustering method. This talk aims to give a brief introduction to the Transformer so that researchers can start benefiting from it as soon as possible.
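For readers who want to see the kernel-smoothing view concretely, here is a minimal NumPy sketch (not from the talk; all names and shapes are illustrative): each self-attention output is a softmax-kernel-weighted average of the value vectors, i.e. a Nadaraya-Watson-style smoother over the sequence.

```python
# Minimal sketch of single-head self-attention read as kernel smoothing.
# All weight matrices and shapes are illustrative, not from the talk.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # stabilize before exponentiating
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention on a sequence X of shape (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Each row of A sums to 1, so A acts as a normalized kernel K(q_i, k_j)
    # and the output A @ V is a kernel-smoothed average of the values.
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]), axis=-1)
    return A @ V

rng = np.random.default_rng(0)
n, d = 5, 8
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 8): one smoothed vector per input position
```

Setting Wq, Wk, and Wv to the identity and iterating this map gives one reading of the mean-shift analogy mentioned above: each point repeatedly moves toward a kernel-weighted average of the other points, as in the classic clustering procedure.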


Speaker Intro

Congwei Song received his master's degree in applied mathematics from the Institute of Science, Zhejiang University of Technology, and his Ph.D. in basic mathematics from the Department of Mathematics, Zhejiang University. He worked at Zhijiang College of Zhejiang University of Technology as an assistant from 2014 to 2021, and has worked at BIMSA as an assistant researcher since 2021. His research interests include machine learning, as well as wavelet analysis and harmonic analysis.
