Facilitating model-based clustering by dimension reduction

Time: Mon., 14:00-15:00, Sept. 22, 2025

Venue: C548, Shuangqing Complex Building A

Organizer: Yunan Wu

Speaker: Wei Luo

Statistical Seminar

Organizer:

Yunan Wu 吴宇楠 (YMSC)

Speaker:

Wei Luo 骆威

Center for Data Science, Zhejiang University

Abstract:

The Gaussian mixture model (GMM) is widely used for cluster analysis. It is commonly fitted by maximum likelihood, which is computationally challenging because the optimization is non-convex, especially as the dimensionality grows. To address this issue, we propose a two-step approach that recovers the intrinsic low-dimensional structure of the GMM under additional constraints on its heterogeneity; that is, there exists a low-dimensional linear transformation of the data, conditional on which the rest of the data are normally distributed and thus redundant for clustering. Our approach first recovers the desired low-dimensional data based on Stein's Lemma and then fits the GMM using the reduced data only. Its computational efficiency comes from both the lower dimensionality and the denoising of the data. Under a sparsity assumption on the clustering pattern, our approach can be generalized to high-dimensional settings. With the aid of a newly constructed pseudo-response, it can also be embedded into a general framework of sufficient dimension reduction, which encompasses a wider class of methods beyond Stein's Lemma for recovering the low-dimensional structure of the GMM. These findings are illustrated by numerical studies at the end.
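To make the first step concrete, here is a minimal Python sketch of recovering clustering directions before any mixture model is fitted. It is not the speaker's estimator: the abstract does not specify the Stein's Lemma construction, so the sketch swaps in a fourth-moment (FOBI-style) matrix, which vanishes along directions where the whitened data are standard normal. The function names `simulate_mixture`, `fourth_moment_directions`, and `subspace_alignment`, as well as the toy data, are hypothetical and chosen for illustration only.

```python
import numpy as np


def simulate_mixture(n=50000, p=6, seed=0):
    """Toy data (an assumption): 3 clusters that differ only in the first 2 of p coordinates."""
    rng = np.random.default_rng(seed)
    centers = np.array([[-3.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
    labels = rng.integers(0, 3, size=n)
    X = rng.normal(size=(n, p))      # Gaussian noise in every coordinate
    X[:, :2] += centers[labels]      # cluster signal lives in coordinates 0 and 1
    return X, labels


def fourth_moment_directions(X, d):
    """Estimate d non-Gaussian directions with a fourth-moment matrix.

    Stand-in technique, not the Stein's Lemma estimator from the talk.
    Returns a (p, d) matrix B; the reduced data are (X - mean) @ B.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / n
    # Whitening matrix cov^{-1/2} via the eigendecomposition of cov.
    evals, evecs = np.linalg.eigh(cov)
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ W
    # For Z ~ N(0, I_p), E[||Z||^2 Z Z^T] = (p + 2) I, so M is near zero
    # along Gaussian directions and nonzero where the data are non-Gaussian.
    sq_norm = (Z ** 2).sum(axis=1)
    M = (Z * sq_norm[:, None]).T @ Z / n - (p + 2) * np.eye(p)
    w, V = np.linalg.eigh(M)
    top = np.argsort(-np.abs(w))[:d]  # largest eigenvalues in absolute value
    return W @ V[:, top]              # map back to the original coordinates


def subspace_alignment(B, k):
    """Smallest principal-angle cosine between span(B) and the first k coordinate axes."""
    Q, _ = np.linalg.qr(B)
    return np.linalg.svd(Q[:k, :], compute_uv=False).min()


X, labels = simulate_mixture()
B = fourth_moment_directions(X, d=2)
reduced = (X - X.mean(axis=0)) @ B   # n x 2 input for any standard GMM fitter
print("smallest principal-angle cosine:", subspace_alignment(B, 2))
```

The second step would pass `reduced` to any standard GMM routine in place of the full data, so the non-convex likelihood is optimized in 2 dimensions rather than 6 in this toy setting.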

Date: September 20, 2025