
Ten minutes for the transformer

Time: Oct. 26, 15:00-16:30

Venue: A3-4-312; Zoom: 787 662 9899 (PW: BIMSA)

Organizer: Xiaopei Jiao

Speaker: Congwei Song (BIMSA)

Abstract

The Transformer is a powerful architecture that achieves strong performance on a variety of sequence learning tasks, including neural machine translation and language understanding. Its core, the self-attention mechanism, can be viewed as a kind of kernel smoothing method, or a "local model" in the speaker's words. The whole architecture can also be seen as a sequence model of the mean shift algorithm, a classic clustering method. The talk aims to give a brief introduction to the Transformer so that researchers can start benefiting from it as soon as possible.
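The kernel smoothing reading of self-attention can be illustrated with a minimal sketch. It is one possible interpretation under stated assumptions, not the speaker's own formulation: the function names (softmax, dot_score, gaussian_score, attention, mean_shift) are hypothetical, the learned query/key/value projections of a real Transformer are omitted, and for the mean shift part the scaled dot-product score is swapped for a Gaussian kernel score.

```python
import math

def softmax(scores):
    # Normalize raw scores into weights that are positive and sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_score(q, k):
    # Scaled dot-product score, as used in standard self-attention.
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))

def gaussian_score(q, k, bandwidth=1.0):
    # Assumption: Gaussian kernel score (log of an unnormalized Gaussian).
    return -sum((a - b) ** 2 for a, b in zip(q, k)) / (2 * bandwidth ** 2)

def attention(queries, keys, values, score=dot_score):
    # Each output is a kernel-weighted average of the values:
    # weights come from the similarity between a query and every key.
    dim = len(values[0])
    out = []
    for q in queries:
        w = softmax([score(q, k) for k in keys])
        out.append([sum(wi * v[d] for wi, v in zip(w, values)) for d in range(dim)])
    return out

def mean_shift(points, steps=20):
    # Using the points as queries, keys, and values at once and repeating
    # the smoothing step moves every point toward a nearby cluster center
    # (the "blurring" variant of mean shift, since the keys move too).
    for _ in range(steps):
        points = attention(points, points, points, score=gaussian_score)
    return points

# Two well-separated groups of 2-D points (made-up illustration data).
points = [[0.0, 0.0], [0.2, 0.1], [0.1, -0.1],
          [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
shifted = mean_shift(points)
for p in shifted:
    print([round(c, 3) for c in p])
```

In this sketch, stacking the same smoothing step plays the role that stacked attention layers play in the architecture: points in the same group collapse onto a common location while the two groups stay apart.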


Speaker Intro

Congwei Song received his master's degree in applied mathematics from the Institute of Science, Zhejiang University of Technology, and his Ph.D. in basic mathematics from the Department of Mathematics, Zhejiang University. He worked at Zhijiang College of Zhejiang University of Technology as an assistant from 2014 to 2021, and has been an assistant researcher at BIMSA since 2021. His research interests include machine learning, wavelet analysis, and harmonic analysis.

Date: October 26, 2023